Talent.com
Staff Site Reliability Engineer (SRE)


HeartFlow, San Francisco, CA, United States
5 hours ago
Job type
  • Full-time
Job description

Heartflow is a medical technology company advancing the diagnosis and management of coronary artery disease, the #1 cause of death worldwide, using cutting-edge technology. The flagship product, the Heartflow FFR CT Analysis, is an AI-driven, non-invasive cardiac test supported by the ACC/AHA Chest Pain Guidelines. It provides a color-coded 3D model of a patient’s coronary arteries indicating the impact blockages have on blood flow to the heart. Heartflow is the first AI-driven, non-invasive integrated heart care solution across the CCTA pathway that helps clinicians identify stenoses in the coronary arteries (RoadMap™ Analysis), assess coronary blood flow (FFR CT Analysis), and characterize and quantify coronary atherosclerosis (Plaque Analysis).

Our pipeline of products is growing and so is our team; join us in helping to revolutionize precision heartcare. Heartflow is a publicly traded company (HTFL) that has received international recognition for exceptional strides in healthcare innovation, is supported by medical societies around the world, is cleared for use in the US, UK, Europe, Japan, and Canada, and has been used for more than 400,000 patients worldwide.

Heartflow is transforming cardiovascular care with cutting‑edge, non‑invasive technology. We are launching a massive Platform Modernization initiative to power the next generation of our life‑saving medical products.

We’re looking for an experienced Site Reliability Engineer (SRE) to join our cloud‑native infrastructure team. You will work closely with our Platform engineers and development teams to ensure our critical systems are highly available, scalable, observable, and performant. If you thrive on eliminating toil, automating complex operations, and defining the standards for production excellence, we want to talk to you.

Job Responsibilities

  • Design, implement, and lead large‑scale, cross‑functional projects to improve the reliability, performance, and efficiency of our core services and infrastructure (10× impact).
  • Drive the reduction of toil by developing and deploying sophisticated automation tools and frameworks, championing the "everything as code" philosophy.
  • Serve as a technical escalation point for critical incidents, perform deep‑dive root cause analyses (RCAs), and implement robust corrective measures to prevent recurrence.
  • Define and implement SLOs, SLIs, and Error Budgets for critical services. Enhance our monitoring, logging, and tracing systems to provide comprehensive visibility into system health.
  • Set the technical direction and best practices for the entire SRE and engineering organization. Mentor mid‑level and senior engineers on design patterns, operational rigor, and reliability principles.

We’re looking for a leader and a deep technical expert with a proven track record of solving the hardest scaling and reliability challenges.

Required Qualifications

  • 8+ years of progressive experience in Site Reliability Engineering, Production Engineering, or a closely related role.
  • Expert‑level proficiency with AWS, including networking, compute, and storage.
  • Deep expertise in Kubernetes and the cloud‑native ecosystem.
  • Fluency in at least one major scripting / programming language for automation and tooling (e.g., Python, Go, or Java).
  • Solid experience with monitoring and logging solutions (Datadog).
  • Proven ability to design and implement robust, highly available distributed systems.
  • Demonstrated experience with Infrastructure as Code tools like Terraform.
  • Exceptional communication skills, capable of explaining complex technical issues to both technical and non‑technical audiences.

Nice-to-Have

  • Experience implementing Service Mesh technologies (e.g., Istio, Linkerd).
  • A strong understanding of security principles and practices in a cloud environment.
  • Certifications such as CKA (Certified Kubernetes Administrator) or CKAD (Certified Kubernetes Application Developer).

A reasonable estimate of the base salary range is $200,750 to $250,922, plus cash bonus and equity.

    Heartflow is an Equal Opportunity Employer. We are committed to a work environment that supports, inspires, and respects all individuals and do not discriminate against any employee or applicant because of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, pregnancy, genetic information, gender, sexual orientation, gender identity or expression, veteran status, or any other status protected under federal, state, or local law. This policy applies to every aspect of employment at Heartflow, including recruitment, hiring, training, relocation, promotion, and termination.


