Site Reliability Engineer

Primer, San Francisco, California, US

1 day ago

Job type

Full-time

Job description

Primer helps B2B products break out of the B2C-centric marketing box. Our platform turns consumer ad channels, data streams, and emerging AI workflows into measurable growth engines for go-to-market teams. We ingest billions of rows from first- and third-party sources, map them to rich company context, and surface hyper-targeted audiences and real-time performance alerts—all without vendor lock-in.

That only works if the lights stay on, queries stay fast, and incidents stay rare. That’s where you come in.

As our first dedicated Site Reliability Engineer, you’ll be the force multiplier who designs, builds, and operates the infrastructure that powers everything: petabyte-scale data pipelines, LLM-backed services, and the APIs our customers (and engineers!) rely on every day. You’ll pair hard-won ops experience with a mentor’s mindset, levelling up the whole team while keeping us four steps ahead of failure.

YOUR MISSION

Own reliability from design to customer.

Define and uphold SLOs / SLIs, manage error budgets, and lead blameless post-mortems.
Automate toil out of existence—CI / CD, infra-as-code, capacity planning, and chaos testing.
Drive incident response end-to-end: detection, mitigation, root-cause analysis, and long-term fixes.
Scale multi-cloud data pipelines (Prefect, ClickHouse, Iceberg) and GPU / LLM workloads.
Teach best practices, review designs, and coach engineers so reliability becomes a team sport.

WHAT YOU’LL DO

Design, implement, and tune distributed systems that handle high-throughput B2B traffic.

Harden our AWS stack with IaC (e.g. Terraform).

Instrument everything—logs, traces, metrics, and AI-powered anomaly detection.

Champion security, cost optimization, and disaster-recovery strategies.

Jump into the weeds when something breaks, fix it fast, then automate it away.

WHAT YOU’LL BRING

Must-Haves

5+ years owning production systems at meaningful scale (sub-second latency, “four-nines” targets).

Mastery of SRE fundamentals: SLO / SLI design, error budgets, incident playbooks.

Deep hands-on experience with Linux, networking, containers / K8s, and at least one major cloud (AWS / GCP / Azure).

Proven track record automating infra with Terraform, Helm, or similar IaC tooling.

Fluency in at least one systems / scripting language (Go, Python, Rust, etc.).

Experience operating complex data pipelines (Prefect, Airflow, Temporal) or real-time streaming systems.

History of mentoring engineers and embedding reliability culture across teams.

Pragmatic decision-maker—balances uptime, velocity, and cost for startup reality.

Curiosity for AI-augmented ops (LLM chat-ops, anomaly detection, self-healing).

Nice-to-Haves

Managed GPU clusters and ML inference workloads.

Operated data lakes / lakehouses at scale (Iceberg, Delta, etc.).

Meaningful open-source contributions in SRE, DevOps, or data-infra projects.

WHY PRIMER

Mission with impact – We’re unlocking new growth channels for thousands of B2B marketers.

High-trust, low-ego culture – Fully distributed team, meeting-light weeks, Friday focus days.

Work & life, balanced – Five weeks PTO, generous parental leave, and flexibility for families.

Career rocket-fuel – Small team, huge problems, real ownership. Shape the future with bold innovators, driving impact that redefines industries.

Diverse & global – Teammates span six countries—and counting.

INTERVIEW PROCESS

Intro Call with Engineering Manager – 30 min

System Design – 60 min

Operational Excellence Drill-down – 60 min

Strategic Pragmatism Chat with CTO – 45 min

Technical Coding / Systems Deep Dive – 30 min

Culture & Values with CEO – 45 min

Decision typically within 24 to 48 hours of the final conversation.

READY TO LEVEL UP B2B MARKETING INFRASTRUCTURE?

Email careers@sayprimer.com with your résumé, LinkedIn, GitHub, or anything that showcases your reliability superpowers. Let’s build the future—without the fire-drills.
