Talent.com
No longer accepting applications
Site Reliability Engineer

Primer, San Francisco, CA, United States
30+ days ago
Job type
  • Full-time
Job description

Primer helps B2B products break out of the B2C-centric marketing box. Our platform turns consumer ad channels, data streams, and emerging AI workflows into measurable growth engines for go-to-market teams. We ingest billions of rows from first- and third-party sources, map them to rich company context, and surface hyper-targeted audiences and real-time performance alerts—all without vendor lock-in.

That only works if the lights stay on, queries stay fast, and incidents stay rare. That’s where you come in.

As our first dedicated Site Reliability Engineer, you’ll be the force multiplier who designs, builds, and operates the infrastructure that powers everything: petabyte-scale data pipelines, LLM-backed services, and the APIs our customers (and engineers!) rely on every day. You’ll pair hard-won ops experience with a mentor’s mindset, leveling up the whole team while keeping us four steps ahead of failure.

YOUR MISSION

Own reliability from design to customer.

  • Define and uphold SLOs / SLIs, manage error budgets, and lead blameless post-mortems.
  • Automate toil out of existence—CI / CD, infra-as-code, capacity planning, and chaos testing.
  • Drive incident response end-to-end: detection, mitigation, root-cause analysis, and long-term fixes.
  • Scale multi-cloud data pipelines (Prefect, ClickHouse, Iceberg) and GPU / LLM workloads.
  • Teach best practices, review designs, and coach engineers so reliability becomes a team sport.

WHAT YOU’LL DO

  • Design, implement, and tune distributed systems that handle high-throughput B2B traffic.
  • Harden our AWS stack with IaC (e.g. Terraform).
  • Instrument everything—logs, traces, metrics, and AI-powered anomaly detection.
  • Champion security, cost optimization, and disaster-recovery strategies.
  • Jump into the weeds when something breaks, fix it fast, then automate it away.

WHAT YOU’LL BRING

Must-Haves

  • 5+ years owning production systems at meaningful scale (sub-second latency, “four-nines” targets).
  • Mastery of SRE fundamentals: SLO / SLI design, error budgets, incident playbooks.
  • Deep hands-on with Linux, networking, containers / K8s, and at least one major cloud (AWS / GCP / Azure).
  • Proven track record automating infra with Terraform, Helm, or similar IaC tooling.
  • Fluency in at least one systems / scripting language (Go, Python, Rust, etc.).
  • Experience operating complex data pipelines (Prefect, Airflow, Temporal) or real-time streaming systems.
  • History of mentoring engineers and embedding reliability culture across teams.
  • Pragmatic decision-maker—balances uptime, velocity, and cost for startup reality.
  • Curiosity for AI-augmented ops (LLM chat-ops, anomaly detection, self-healing).

Nice-to-Haves

  • Managed GPU clusters and ML inference workloads.
  • Operated data lakes / lakehouses at scale (Iceberg, Delta, etc.).
  • Meaningful open-source contributions in SRE, DevOps, or data-infra projects.

WHY PRIMER

  • Mission with impact – We’re unlocking new growth channels for thousands of B2B marketers.
  • High-trust, low-ego culture – Fully distributed team, meeting-light weeks, Friday focus days.
  • Work & life, balanced – Five weeks PTO, generous parental leave, and flexibility for families.
  • Career rocket-fuel – Small team, huge problems, real ownership. Shape the future with bold innovators, driving impact that redefines industries.
  • Diverse & global – Teammates span six countries—and counting.

INTERVIEW PROCESS

  • Intro Call with Engineering Manager – 30 min
  • System Design – 60 min
  • Operational Excellence Drill-down – 60 min
  • Strategic Pragmatism Chat with CTO – 45 min
  • Technical Coding / Systems Deep Dive – 30 min
  • Culture & Values with CEO – 45 min

Decision typically within 24-48 hrs of final conversation.

READY TO LEVEL UP B2B MARKETING INFRASTRUCTURE?

Email careers@sayprimer.com with your résumé, LinkedIn, GitHub, or anything that showcases your reliability superpowers. Let’s build the future, without the fire-drills.
