Site Reliability Engineer

Primer • San Francisco, CA, United States

30+ days ago

Job type

Full-time

Job description

Primer helps B2B products break out of the B2C-centric marketing box. Our platform turns consumer ad channels, data streams, and emerging AI workflows into measurable growth engines for go-to-market teams. We ingest billions of rows from first- and third-party sources, map them to rich company context, and surface hyper-targeted audiences and real-time performance alerts—all without vendor lock-in.

That only works if the lights stay on, queries stay fast, and incidents stay rare. That’s where you come in.

As our first dedicated Site Reliability Engineer, you’ll be the force multiplier who designs, builds, and operates the infrastructure that powers everything: petabyte-scale data pipelines, LLM-backed services, and the APIs our customers (and engineers!) rely on every day. You’ll pair hard-won ops experience with a mentor’s mindset, levelling up the whole team while keeping us four steps ahead of failure.

YOUR MISSION

Own reliability from design to customer.

Define and uphold SLOs/SLIs, manage error budgets, and lead blameless post-mortems.
Automate toil out of existence: CI/CD, infra-as-code, capacity planning, and chaos testing.
Drive incident response end-to-end: detection, mitigation, root-cause analysis, and long-term fixes.
Scale multi-cloud data pipelines (Prefect, ClickHouse, Iceberg) and GPU/LLM workloads.
Teach best practices, review designs, and coach engineers so reliability becomes a team sport.

WHAT YOU’LL DO

Design, implement, and tune distributed systems that handle high-throughput B2B traffic.

Harden our AWS stack with IaC (e.g. Terraform).

Instrument everything—logs, traces, metrics, and AI-powered anomaly detection.

Champion security, cost optimization, and disaster-recovery strategies.

Jump into the weeds when something breaks, fix it fast, then automate it away.

WHAT YOU’LL BRING

Must-Haves

5+ years owning production systems at meaningful scale (sub-second latency, “four-nines” targets).

Mastery of SRE fundamentals: SLO/SLI design, error budgets, incident playbooks.

Deep hands-on experience with Linux, networking, containers/K8s, and at least one major cloud (AWS/GCP/Azure).

Proven track record automating infra with Terraform, Helm, or similar IaC tooling.

Fluency in at least one systems/scripting language (Go, Python, Rust, etc.).

Experience operating complex data pipelines (Prefect, Airflow, Temporal) or real-time streaming systems.

History of mentoring engineers and embedding reliability culture across teams.

Pragmatic decision-maker—balances uptime, velocity, and cost for startup reality.

Curiosity for AI-augmented ops (LLM chat-ops, anomaly detection, self-healing).

Nice-to-Haves

Managed GPU clusters and ML inference workloads.

Operated data lakes/lakehouses at scale (Iceberg, Delta, etc.).

Meaningful open-source contributions in SRE, DevOps, or data-infra projects.

WHY PRIMER

Mission with impact – We’re unlocking new growth channels for thousands of B2B marketers.

High-trust, low-ego culture – Fully distributed team, meeting-light weeks, Friday focus days.

Work & life, balanced – Five weeks PTO, generous parental leave, and flexibility for families.

Career rocket-fuel – Small team, huge problems, real ownership. Shape the future with bold innovators, driving impact that redefines industries.

Diverse & global – Teammates span six countries—and counting.

INTERVIEW PROCESS

Intro Call with Engineering Manager – 30 min

System Design – 60 min

Operational Excellence Drill-down – 60 min

Strategic Pragmatism Chat with CTO – 45 min

Technical Coding / Systems Deep Dive – 30 min

Culture & Values with CEO – 45 min

Decision typically within 24-48 hrs of final conversation.

READY TO LEVEL UP B2B MARKETING INFRASTRUCTURE?

Email careers@sayprimer.com with your résumé, GitHub, or anything that showcases your reliability superpowers. Let’s build the future, without the fire drills.
