Talent.com
Staff Site Reliability Engineer (SRE)
Heartflow • San Francisco, California, United States
30+ days ago
Job type
  • Full-time
Job description

Heartflow is a medical technology company advancing the diagnosis and management of coronary artery disease, the #1 cause of death worldwide, using cutting-edge technology. The flagship product, an AI-driven, non-invasive cardiac test supported by the ACC/AHA Chest Pain Guidelines called the Heartflow FFRCT Analysis, provides a color-coded, 3D model of a patient’s coronary arteries indicating the impact blockages have on blood flow to the heart. Heartflow is the first AI-driven non-invasive integrated heart care solution across the CCTA pathway that helps clinicians identify stenoses in the coronary arteries (RoadMap™ Analysis), assess coronary blood flow (FFRCT Analysis), and characterize and quantify coronary atherosclerosis (Plaque Analysis). Our pipeline of products is growing and so is our team; join us in helping to revolutionize precision heartcare.

Heartflow is a publicly traded company (HTFL) that has received international recognition for exceptional strides in healthcare innovation, is supported by medical societies around the world, is cleared for use in the US, UK, Europe, Japan and Canada, and has been used for more than 400,000 patients worldwide.

Heartflow is transforming cardiovascular care with cutting-edge, non-invasive technology. We are launching a massive Platform Modernization initiative to power the next generation of our life-saving medical products.

We're looking for an experienced Site Reliability Engineer (SRE) to join our cloud-native infrastructure team. You will work closely with our Platform engineers and development teams to ensure our critical systems are highly available, scalable, observable, and performant. If you thrive on eliminating toil, automating complex operations, and defining the standards for production excellence, we want to talk to you.

Job Responsibilities

As our Staff SRE, you'll be the primary expert responsible for our entire compute ecosystem, operating at the highest level of technical expertise and influence. You won't just solve problems; you'll prevent them at a fundamental level across organizational boundaries. Your key responsibilities will include:

  • Design, implement, and lead large-scale, cross-functional projects to improve the reliability, performance, and efficiency of our core services and infrastructure (10× impact).
  • Drive the reduction of toil by developing and deploying sophisticated automation tools and frameworks, championing the "everything as code" philosophy.
  • Serve as a technical escalation point for critical incidents, perform deep-dive root cause analyses (RCAs), and implement robust corrective measures to prevent recurrence.
  • Define and implement SLOs, SLIs, and Error Budgets for critical services. Enhance our monitoring, logging, and tracing systems to provide comprehensive visibility into system health.
  • Set the technical direction and best practices for the entire SRE and engineering organization. Mentor mid-level and senior engineers on design patterns, operational rigor, and reliability principles.

We're looking for a leader and a deep technical expert with a proven track record of solving the hardest scaling and reliability challenges.

Required Qualifications

  • 8+ years of progressive experience in Site Reliability Engineering, Production Engineering, or a closely related role.
  • Expert-level proficiency with AWS, including networking, compute, and storage.
  • Deep expertise in Kubernetes and the cloud-native ecosystem.
  • Fluency in at least one major scripting/programming language for automation and tooling (e.g., Python, Go, or Java).
  • Solid experience with monitoring and logging solutions (Datadog).
  • Proven ability to design and implement robust, highly available distributed systems.
  • Demonstrated experience with Infrastructure as Code tools like Terraform.
  • Exceptional communication skills, capable of explaining complex technical issues to both technical and non-technical audiences.

Nice-to-Have

  • Experience implementing Service Mesh technologies (e.g., Istio, Linkerd).
  • A strong understanding of security principles and practices in a cloud environment.
  • Certifications such as CKA (Certified Kubernetes Administrator) or CKAD (Certified Kubernetes Application Developer).

A reasonable estimate of the base salary compensation range is $185,750 to $250,922, plus cash bonus and equity.

Heartflow is an Equal Opportunity Employer. We are committed to a work environment that supports, inspires, and respects all individuals and do not discriminate against any employee or applicant because of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, pregnancy, genetic information, gender, sexual orientation, gender identity or expression, veteran status, or any other status protected under federal, state, or local law. This policy applies to every aspect of employment at Heartflow, including recruitment, hiring, training, relocation, promotion, and termination.

Positions posted for Heartflow are not intended for or open to third-party recruiters/agencies. Submission of any unsolicited resumes for these positions will be considered to be free referrals.

Heartflow has become aware of a fraud where unknown entities are posing as Heartflow recruiters in an attempt to obtain personal information from individuals as part of our application or job offer process. Before providing any personal information to outside parties, please verify the following: A) all legitimate Heartflow recruiter email addresses end with “@heartflow.com” and B) the position described is found on our careers site at www.heartflow.com/about/careers/.
