Site Reliability Engineer Sr. Staff

Hewlett Packard Enterprise • San Jose, CA, United States

2 days ago

Job type

Full-time

Job description

This role has been designed as 'Hybrid' with an expectation that you will work on average 2 days per week from an HPE office.

Who We Are:

Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today's complex world. Our culture thrives on finding new and better ways to accelerate what's next. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.

Job Description

Responsibilities

As a Staff Software Engineer, you will play a key role in designing, building, and optimizing cloud infrastructure and deployment systems. Your work will directly impact scalability, security, and operational efficiency across our platforms. Key responsibilities include:

Enhance Infrastructure as Code (IaC) and enforce best practices.
Optimize cloud infrastructure for scalability, security, and cost-effectiveness.
Develop internal tools to support and streamline cloud platform operations.
Improve CI/CD pipelines and deployment workflows using FluxCD and Jenkins.
Address container image vulnerabilities and standardize remediation processes.
Build Amazon Machine Images (AMIs) aligned with CIS and STIG benchmarks.
Strengthen monitoring, alerting, and observability using Prometheus, Grafana, and logging tools.
Troubleshoot complex production issues to ensure system reliability and customer satisfaction.
Fine-tune distributed systems such as Apache Kafka and Cassandra.
Collaborate with development, security, and operations teams to align infrastructure with application needs.

Basic Qualifications

Minimum of 12 years of hands-on experience in InfraOps, DevOps, or Site Reliability Engineering (SRE).

Proficiency with Linux systems, especially Debian-based distributions.

Strong experience with cloud platforms such as AWS and GCP.

Expertise in Infrastructure as Code tools like Terraform, Packer, and Ansible.

Solid programming skills in Python and/or Golang.

Deep understanding of containerization (Docker, containerd) and orchestration tools (AWS EKS, GCP GKE).

Experience with GitOps workflows.

Proven track record in implementing and maintaining CI/CD pipelines.

Strong background in security and familiarity with security programs.

Experience with monitoring and logging tools (Prometheus, Grafana, ELK).

Knowledge of both relational (SQL) and non-relational databases.

Excellent problem-solving and debugging skills with a strong sense of ownership.

Experience managing distributed systems like Apache Kafka and Cassandra.

Effective communicator and collaborative team player.

Preferred Qualifications

Experience contributing to open-source projects.

Background in security engineering or related disciplines.

Additional Skills:

Cloud Architectures, Cross Domain Knowledge, Design Thinking, Development Fundamentals, DevOps, Distributed Computing, Microservices Fluency, Full Stack Development, Security-First Mindset, Solutions Design, Testing & Automation, User Experience (UX)

What We Can Offer You

Health & Wellbeing

We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial and emotional wellbeing.

Personal & Professional Development

We also invest in your career because the better you are, the better we all are. We have specific programs tailored to helping you reach your career goals, whether you want to become a knowledge expert in your field or apply your skills to another division.

Unconditional Inclusion

We are unconditionally inclusive in the way we work and celebrate individual uniqueness. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good.

USD Annual Salary: $148,000.00 - $340,500.00

HPE is an Equal Employment Opportunity / Veterans / Disabled / LGBT employer. We do not discriminate on the basis of race, gender, or any other protected category, and all decisions we make are made on the basis of qualifications, merit, and business need. Our goal is to be one global team that is representative of our customers, in an inclusive environment where we can continue to innovate and grow together.
