Senior Site Reliability Engineer

Five9 • San Ramon, CA, US
30+ days ago
Job type
  • Full-time
Job description

Join us in bringing joy to customer experience. Five9 is a leading provider of cloud contact center software, bringing the power of cloud innovation to customers worldwide.

Living our values every day results in our team-first culture and enables us to innovate, grow, and thrive while enjoying the journey together. We celebrate diversity and foster an inclusive environment, empowering our employees to be their authentic selves.

We are seeking a Site Reliability Engineer (SRE) to join our team and help build and maintain highly reliable, scalable systems. This role combines software engineering and operations expertise to ensure our services meet ambitious reliability targets while enabling rapid development and deployment.

This position requires approximately 50% software development and 50% operational work, focusing on automation, monitoring, and system reliability rather than manual operations. The team works collaboratively with our platform, application and database teams to provide a reliable and available service.

Key Responsibilities

Observability & Monitoring

  • Dashboards & Metrics: Design and implement comprehensive dashboards covering OS/platform-level and application-level monitoring, organized into primary indicators (RED) and secondary indicators (USE).
  • Availability & Reliability: Establish and maintain SLIs (Service Level Indicators), SLOs (Service Level Objectives), and error budgets for the service.
  • Performance Monitoring: Build alerting systems and performance monitoring to proactively identify and resolve issues before they impact users.
  • Incident Response: Participate in on-call rotations and lead incident response efforts, including post-mortem analysis and remediation. Maintain the official on-call routing. Assign application-level problems to the engineering team and track them.

Infrastructure Automation & Deployment

  • CI/CD Pipeline Management: Maintain continuous integration and deployment pipelines, working with our cloud and on-premises deployment teams.
  • Infrastructure as Code: Develop and maintain infrastructure using tools like Terraform, Ansible, or similar.
  • Configuration Management: Automate system configuration and ensure consistency across environments. Recommend and implement best practices for configuration control.

Security & Compliance

  • Security Automation: Ensure security scanning systems are in place and review escalated vulnerabilities.
  • Access Control: Maintain proper authentication, authorization, and audit logging systems.
  • Compliance Reporting: Ensure systems meet regulatory requirements and industry standards.
  • Security Incident Response: Participate in security incident response and remediation efforts.

Cost Optimization

  • Resource Management: Monitor and optimize cloud resource usage and costs, watching for planned and unplanned resource changes.
  • Capacity Planning: Analyze usage patterns and plan for future capacity needs.
  • Cost Analysis: Provide recommendations for cost-effective architecture and resource allocation.
  • Right-sizing: Implement automated scaling and resource optimization strategies.

Common Services & Platform Engineering

  • Shared Infrastructure: Build and maintain common services like notification systems, caching layers, and message queues or third-party software stacks.
  • Database Operations: Manage database reliability, performance, and scaling (where not handled by dedicated DB teams).
  • Service Mesh & Networking: Implement and maintain service discovery, load balancing, and network policies.
  • Developer Tools: Create and maintain tools and platforms that improve developer productivity and system reliability.

Required Qualifications

Technical Skills

  • Programming Languages: Proficiency in at least two of Python, Shell, PHP, Java, or similar languages.
  • Cloud Platforms: Experience with one of AWS, GCP, or Azure infrastructure and services.
  • Containerization: Hands-on experience with Docker, Kubernetes, and container orchestration.
  • Monitoring & Observability: Experience with Prometheus, Grafana, ELK stack, or similar tools.
  • Infrastructure as Code: Proficiency with Terraform, CloudFormation, or similar tools.
  • Version Control: Expert-level Git usage and collaborative development practices.

SRE-Specific Knowledge

  • SLI/SLO Management: Experience defining and maintaining service level objectives.
  • Error Budget Policy: Understanding of error budget concepts and implementation.
  • Toil Reduction: Track record of identifying and eliminating repetitive manual work.
  • Capacity Planning: Experience with performance testing and capacity management.

Preferred Qualifications

  • Bachelor's degree in Computer Science, Engineering, or equivalent experience
  • Experience with microservices architecture and distributed systems
  • Knowledge of security best practices and compliance frameworks
  • Experience with chaos engineering and reliability testing
  • Previous experience in an SRE or DevOps role at a technology company
  • Contributions to open-source projects or technical communities

Success Metrics

  • Reliability: Maintain or improve service availability and reliability metrics.
  • Toil Reduction: Measurable reduction in manual operational work through automation.
  • Incident Response: Effective participation in incident response with a focus on prevention.
  • Code Quality: High-quality, well-tested code contributions to infrastructure and tooling.
  • Collaboration: Effective partnership with development teams to improve system reliability.

Team Culture & Values

  • Blameless Post-mortems: Learn from failures without blame or punishment.
  • Automation First: Prefer automated solutions over manual processes.
  • Measuring Everything: Data-driven decision making and continuous improvement.
  • Sharing Knowledge: Document and share expertise across the team.
  • Work-Life Balance: Sustainable on-call practices and reasonable operational load.

Growth Opportunities

  • Opportunity to work on cutting-edge infrastructure and reliability challenges
  • Exposure to large-scale distributed systems and modern cloud technologies
  • Professional development budget for conferences, training, and certifications
  • Career progression path toward senior SRE, staff engineer, or management roles
  • Collaboration with engineering teams across the organization

Work Location: This role is fully remote for candidates who reside outside a 50-mile radius of our San Ramon office. For candidates who reside within 50 miles of our San Ramon location, this role is hybrid and requires 3 days a week (M, W, TH) in our San Ramon office.

As part of our continued commitment to diversity, equity, and inclusion, Five9 supports pay transparency during the entire recruitment process. Actual compensation packages are based on several factors that are unique to each candidate including, but not limited to, skill set, depth of experience, certifications, and specific work location. The range displayed reflects the minimum and maximum target for new hire salaries for the job across the United States. Your recruiter can share more about the specific compensation package during your hiring process.

Additionally, the total compensation package for this position may also include an annual performance bonus, stock, and/or other applicable incentive compensation plans.

Our total reward package also includes:

  • Health, dental, and vision coverage, beginning on the first day of employment. Five9 covers 100% of the employee portion of the health, dental, and vision coverage and shares a high portion of the dependent cost. We also offer Short & Long-Term Disability, Basic Life Insurance, and a 401(k) savings plan with employer matching.
  • Access to an innovative mental health support platform that offers personalized care and resources in areas such as therapy, coaching, and self-guided mindfulness exercises for all covered employees and their covered dependents.
  • Generous employee stock purchase plan.
  • Paid time off, company-paid holidays, paid volunteer hours, and 12 weeks of paid parental leave.
  • All compensation and benefits are subject to the requirements and restrictions set forth in the applicable plan documents and any written agreements between the parties.

The US base salary range for this role is below.

$91,500–$219,700 USD

Five9 embraces diversity and is committed to building a team that represents a variety of backgrounds, perspectives, and skills. The more inclusive we are, the better we are. Five9 is an equal opportunity employer.

View our privacy policy, including our privacy notice to California residents, here: https://www.five9.com/pt-pt/legal.

Note: Five9 will never request that an applicant send money as a prerequisite for commencing employment with Five9.
