Site Reliability Engineer

Primer
San Francisco, California, US
1 day ago
Job type
  • Full-time
Job description

Primer helps B2B products break out of the B2C-centric marketing box. Our platform turns consumer ad channels, data streams, and emerging AI workflows into measurable growth engines for go-to-market teams. We ingest billions of rows from first- and third-party sources, map them to rich company context, and surface hyper-targeted audiences and real-time performance alerts—all without vendor lock-in.

That only works if the lights stay on, queries stay fast, and incidents stay rare. That’s where you come in.

As our first dedicated Site Reliability Engineer, you’ll be the force multiplier who designs, builds, and operates the infrastructure that powers everything: petabyte-scale data pipelines, LLM-backed services, and the APIs our customers (and engineers!) rely on every day. You’ll pair hard-won ops experience with a mentor’s mindset, levelling up the whole team while keeping us four steps ahead of failure.

YOUR MISSION

Own reliability from design to customer.

  • Define and uphold SLOs/SLIs, manage error budgets, and lead blameless post-mortems.
  • Automate toil out of existence: CI/CD, infra-as-code, capacity planning, and chaos testing.
  • Drive incident response end-to-end: detection, mitigation, root-cause analysis, and long-term fixes.
  • Scale multi-cloud data pipelines (Prefect, ClickHouse, Iceberg) and GPU/LLM workloads.
  • Teach best practices, review designs, and coach engineers so reliability becomes a team sport.

WHAT YOU’LL DO

  • Design, implement, and tune distributed systems that handle high-throughput B2B traffic.
  • Harden our AWS stack with IaC (e.g. Terraform).
  • Instrument everything—logs, traces, metrics, and AI-powered anomaly detection.
  • Champion security, cost optimization, and disaster-recovery strategies.
  • Jump into the weeds when something breaks, fix it fast, then automate it away.

WHAT YOU’LL BRING

Must-Haves

  • 5+ years owning production systems at meaningful scale (sub-second latency, “four-nines” targets).
  • Mastery of SRE fundamentals: SLO/SLI design, error budgets, incident playbooks.
  • Deep hands-on experience with Linux, networking, containers/K8s, and at least one major cloud (AWS/GCP/Azure).
  • Proven track record automating infra with Terraform, Helm, or similar IaC tooling.
  • Fluency in at least one systems/scripting language (Go, Python, Rust, etc.).
  • Experience operating complex data pipelines (Prefect, Airflow, Temporal) or real-time streaming systems.
  • History of mentoring engineers and embedding reliability culture across teams.
  • Pragmatic decision-maker—balances uptime, velocity, and cost for startup reality.
  • Curiosity for AI-augmented ops (LLM chat-ops, anomaly detection, self-healing).

Nice-to-Haves

  • Managed GPU clusters and ML inference workloads.
  • Operated data lakes/lakehouses at scale (Iceberg, Delta, etc.).
  • Meaningful open-source contributions in SRE, DevOps, or data-infra projects.

WHY PRIMER

  • Mission with impact – We’re unlocking new growth channels for thousands of B2B marketers.
  • High-trust, low-ego culture – Fully distributed team, meeting-light weeks, Friday focus days.
  • Work & life, balanced – Five weeks PTO, generous parental leave, and flexibility for families.
  • Career rocket-fuel – Small team, huge problems, real ownership. Shape the future with bold innovators, driving impact that redefines industries.
  • Diverse & global – Teammates span six countries—and counting.

INTERVIEW PROCESS

  • Intro Call with Engineering Manager – 30 min
  • System Design – 60 min
  • Operational Excellence Drill-down – 60 min
  • Strategic Pragmatism Chat with CTO – 45 min
  • Technical Coding / Systems Deep Dive – 30 min
  • Culture & Values with CEO – 45 min
  • Decision typically within 24-48 hrs of final conversation.

READY TO LEVEL UP B2B MARKETING INFRASTRUCTURE?

Email careers@sayprimer.com with your résumé, LinkedIn, GitHub, or anything that showcases your reliability superpowers. Let’s build the future, without the fire-drills.
