Staff Site Reliability Engineer (SRE)

HeartFlow

San Francisco, CA, United States

Job type

Full-time

Job description

Heartflow is a medical technology company advancing the diagnosis and management of coronary artery disease, the #1 cause of death worldwide, using cutting-edge technology. The flagship product, the Heartflow FFR CT Analysis, is an AI-driven, non-invasive cardiac test supported by the ACC/AHA Chest Pain Guidelines. It provides a color-coded, 3D model of a patient’s coronary arteries indicating the impact blockages have on blood flow to the heart. Heartflow is the first AI-driven non‑invasive integrated heart care solution across the CCTA pathway that helps clinicians identify stenoses in the coronary arteries (RoadMap™ Analysis), assess coronary blood flow (FFR CT Analysis), and characterize and quantify coronary atherosclerosis (Plaque Analysis).

Our pipeline of products is growing and so is our team; join us in helping to revolutionize precision heartcare. Heartflow is a publicly traded company (HTFL) that has received international recognition for exceptional strides in healthcare innovation, is supported by medical societies around the world, is cleared for use in the US, UK, Europe, Japan and Canada, and has been used for more than 400,000 patients worldwide.

Heartflow is transforming cardiovascular care with cutting‑edge, non‑invasive technology. We are launching a massive Platform Modernization initiative to power the next generation of our life‑saving medical products.

We’re looking for an experienced Site Reliability Engineer (SRE) to join our cloud‑native infrastructure team. You will work closely with our Platform engineers and development teams to ensure our critical systems are highly available, scalable, observable, and performant. If you thrive on eliminating toil, automating complex operations, and defining the standards for production excellence, we want to talk to you.

Job Responsibilities

Design, implement, and lead large‑scale, cross‑functional projects to improve the reliability, performance, and efficiency of our core services and infrastructure (10× impact).
Drive the reduction of toil by developing and deploying sophisticated automation tools and frameworks, championing the "everything as code" philosophy.
Serve as a technical escalation point for critical incidents, perform deep‑dive root cause analyses (RCAs), and implement robust corrective measures to prevent recurrence.
Define and implement SLOs, SLIs, and Error Budgets for critical services. Enhance our monitoring, logging, and tracing systems to provide comprehensive visibility into system health.
Set the technical direction and best practices for the entire SRE and engineering organization. Mentor mid‑level and senior engineers on design patterns, operational rigor, and reliability principles.

We’re looking for a leader and a deep technical expert with a proven track record of solving the hardest scaling and reliability challenges.

Required Qualifications

8+ years of progressive experience in Site Reliability Engineering, Production Engineering, or a closely related role.

Expert‑level proficiency with AWS, including networking, compute, and storage.

Deep expertise in Kubernetes and the cloud‑native ecosystem.

Fluency in at least one major scripting / programming language for automation and tooling (e.g., Python, Go, or Java).

Solid experience with monitoring and logging solutions (Datadog).

Proven ability to design and implement robust, highly available distributed systems.

Demonstrated experience with Infrastructure as Code tools like Terraform.

Exceptional communication skills, capable of explaining complex technical issues to both technical and non‑technical audiences.

Nice‑to‑Have

Experience implementing Service Mesh technologies (e.g., Istio, Linkerd).

A strong understanding of security principles and practices in a cloud environment.

Certifications such as CKA (Certified Kubernetes Administrator) or CKAD (Certified Kubernetes Application Developer).

A reasonable estimate of the base salary range is $200,750 to $250,922, plus cash bonus and equity.

Heartflow is an Equal Opportunity Employer. We are committed to a work environment that supports, inspires, and respects all individuals and do not discriminate against any employee or applicant because of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, pregnancy, genetic information, gender, sexual orientation, gender identity or expression, veteran status, or any other status protected under federal, state, or local law. This policy applies to every aspect of employment at Heartflow, including recruitment, hiring, training, relocation, promotion, and termination.
