Senior Software Engineer - Reliability (Remote)
Freenome • Brisbane, CA, United States
2 days ago
Job type
  • Full-time
  • Remote
Job description

About this opportunity:

Our Site Reliability Engineering (SRE) team is a new and critical function at Freenome. As a founding member of the team, you'll help define the culture and build the systems that keep our regulated, cloud-based production environments reliable as we transition from research to commercial operations. This is an opportunity to do meaningful engineering work that will directly save lives.

We value:

  • Reliability as a product feature
  • Continual improvement and learning
  • Automate all the things!
  • Technical simplicity and clarity
  • Blameless postmortems and transparent communication

As a Site Reliability Engineer, you will help design, implement, and operate observability, reliability, and incident management systems and practices across our clinical lab systems and regulated commercial workloads. You'll partner with engineering teams to define service-level indicators (SLIs), service-level objectives (SLOs), and error budgets; build runbooks and operational playbooks; and develop the monitoring and automation needed to keep our systems reliable and compliant. The role also includes contributions to system code, infrastructure deployments, and automation.

This role is ideal for an engineer with experience running production workloads in the cloud, who is excited to build an SRE practice from the ground up in a regulated environment.

The role reports to the Director, Cloud Infrastructure.

What you'll do:

  • Define and implement observability practices (metrics, traces, dashboards, logs, alerts) for production systems
  • Partner with product, engineering, and lab teams to develop and maintain incident response playbooks and escalation procedures
  • Partner with engineering teams to define SLIs / SLOs and establish error budgets
  • Participate in the on-call rotation for production systems and champion a focus on automation and self-healing
  • Contribute to production deployment and change-management processes that meet FDA and compliance requirements
  • Automate operational tasks, reducing manual intervention
  • Contribute to production systems and designs with the goal of improving reliability
  • Use Infrastructure as Code (IaC) to manage and deploy team-owned infrastructure and subsystems
  • Help build out the SRE practice

Communication and Collaboration:

  • Work closely with engineering, product, and lab teams to understand service reliability needs
  • Partner with TPMs, RA / QA, and compliance stakeholders to align operational practices with regulatory requirements
  • Participate in cross-functional incident reviews and postmortems
  • Share knowledge and document operational standards for consistency and onboarding
  • Design and run fire drills / tabletop exercises as well as disaster recovery exercises

Culture:

  • Model Freenome's values and principles in your work and interactions
  • Promote a collaborative, reliable engineering culture across product, infra, and lab engineering teams
  • Contribute to documentation, runbooks, and operational standards
  • Foster a culture of accountability, learning, and psychological safety

Technical Leadership:

  • Independently drive reliability improvements in scoped systems or services
  • Provide mentorship to peers on observability, incident management, and operational best practices
  • Help build and evolve Freenome's reliability practices and contribute to team strategy discussions

Must haves:

  • Bachelor's degree in Computer Science, Engineering, or equivalent experience
  • 5+ years in software engineering or infrastructure / DevOps / SRE roles (we currently use Python and Go)
  • Experience deploying cloud infrastructure via automation (e.g., Terraform, Pulumi, Bicep / ARM)
  • Incident management experience in cloud / software engineering, as well as familiarity with incident management platforms (e.g., incident.io, ServiceNow, Opsgenie, PagerDuty)
  • Hands-on experience operating production workloads in cloud environments
  • Familiarity with Kubernetes (AKS, GKE, or EKS)
  • Strong troubleshooting and root-cause analysis skills in distributed systems
  • Experience with observability platforms (e.g., Datadog, Prometheus / Grafana, OpenTelemetry)
  • Ability to define and implement metrics, dashboards, and alerting
  • Demonstrated ability to work autonomously and own technical outcomes
  • Strong understanding of cloud infrastructure and networking architectures and automation

Nice to haves:

  • Experience supporting regulated environments (healthcare, biotech, financial)
  • Familiarity with compliance-driven change management and release processes (FDA, HIPAA)
  • Knowledge of CI/CD deployment strategies and change automation
  • Experience with both GCP and Azure cloud platforms
  • Interest in mentorship and system reliability practices at scale

Benefits and additional information:

    The US target range of our base salary for new hires is $131,325 - $201,000. You will also be eligible to receive pre-IPO equity, cash bonuses, and a full range of medical, financial, and other benefits depending on the position offered. Please note that individual total compensation for this position will be determined at the Company's sole discretion and may vary based on several factors, including, but not limited to, location, skill level, years and depth of relevant experience, and education. We invite you to check out our careers page at freenome.com/job-openings/ for additional company information.

    Freenome is proud to be an equal-opportunity employer, and we value diversity. Freenome does not discriminate on the basis of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, pregnancy, genetic information, gender, sexual orientation, gender identity or expression, veteran status, or any other status protected under federal, state, or local law.

    Applicants have rights under Federal Employment Laws.

  • Family & Medical Leave Act (FMLA)
  • Equal Employment Opportunity (EEO)
  • Employee Polygraph Protection Act (EPPA)