Principal Site Reliability Engineer

HPE • San Jose, CA, United States
3 days ago
Job type
  • Full-time
Job description

This role has been designed as ‘Hybrid’ with an expectation that you will work on average 2 days per week from an HPE office.

Who We Are :

Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.

Job Description :

U.S. Citizenship required due to federal client's regulations

As a Staff Software Engineer, you will play a key role in designing, building, and optimizing cloud infrastructure and deployment systems. Your work will directly impact scalability, security, and operational efficiency across our platforms. Key responsibilities include :

  • Enhance Infrastructure as Code (IaC) and enforce best practices.
  • Optimize cloud infrastructure for scalability, security, and cost-effectiveness.
  • Develop internal tools to support and streamline cloud platform operations.
  • Improve CI/CD pipelines and deployment workflows using FluxCD and Jenkins.
  • Address container image vulnerabilities and standardize remediation processes.
  • Build Amazon Machine Images (AMIs) aligned with CIS and STIG benchmarks.
  • Strengthen monitoring, alerting, and observability using Prometheus, Grafana, and logging tools.
  • Troubleshoot complex production issues to ensure system reliability and customer satisfaction.
  • Fine-tune distributed systems such as Apache Kafka and Cassandra.
  • Collaborate with development, security, and operations teams to align infrastructure with application needs.

Basic Qualifications

  • Minimum of 12 years of hands-on experience in InfraOps, DevOps, or Site Reliability Engineering (SRE).
  • Proficiency with Linux systems, especially Debian-based distributions.
  • Strong experience with cloud platforms such as AWS and GCP.
  • Expertise in Infrastructure as Code tools like Terraform, Packer, and Ansible.
  • Solid programming skills in Python and/or Golang.
  • Deep understanding of containerization (Docker, Container) and orchestration tools (AWS EKS, GCP GKE).
  • Experience with GitOps workflows.
  • Proven track record in implementing and maintaining CI/CD pipelines.
  • Strong background in security and familiarity with security programs.
  • Experience with monitoring and logging tools (Prometheus, Grafana, ELK).
  • Knowledge of both relational (SQL) and non-relational databases.
  • Excellent problem-solving and debugging skills with a strong sense of ownership.
  • Experience managing distributed systems like Apache Kafka and Cassandra.
  • Effective communicator and collaborative team player.

Preferred Qualifications

  • Experience contributing to open-source projects.
  • Background in security engineering or related disciplines.

Additional Skills :

Cloud Architectures, Cross Domain Knowledge, Design Thinking, Development Fundamentals, DevOps, Distributed Computing, Microservices Fluency, Full Stack Development, Security-First Mindset, Solutions Design, Testing & Automation, User Experience (UX)

What We Can Offer You :

Health & Wellbeing

We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial and emotional wellbeing.

Personal & Professional Development

We also invest in your career because the better you are, the better we all are. We have specific programs catered to helping you reach any career goals you have, whether you want to become a knowledge expert in your field or apply your skills to another division.

Unconditional Inclusion

We are unconditionally inclusive in the way we work and celebrate individual uniqueness. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good.

Let's Stay Connected :

Follow @HPECareers on Instagram to see the latest on people, culture and tech at HPE.

Job : Engineering

Job Level : TCP_05

States with Pay Range Requirement

The expected salary/wage range for a U.S.-based hire filling this position is provided below. Actual offer may vary from this range based upon geographic location, work experience, education/training, and/or skill level. If this is a sales role, then the listed salary range reflects combined base salary and target-level sales compensation pay. If this is a non-sales role, then the listed salary range reflects base salary only. Variable incentives may also be offered. Information about employee benefits offered can be found at

USD Annual Salary : $152,000.00 - $349,000.00

HPE is an Equal Employment Opportunity/Veterans/Disabled/LGBT employer. We do not discriminate on the basis of race, gender, or any other protected category, and all decisions we make are made on the basis of qualifications, merit, and business need. Our goal is to be one global team that is representative of our customers, in an inclusive environment where we can continue to innovate and grow together.

Hewlett Packard Enterprise is EEO Protected Veteran/Individual with Disabilities.

HPE will comply with all applicable laws related to employer use of arrest and conviction records, including laws requiring employers to consider for employment qualified applicants with criminal histories.
