No longer accepting applications

Site Reliability Engineer

Primer, San Francisco, California, US

5 days ago

Job type

Full-time

Job description

Primer helps B2B products break out of the B2C-centric marketing box. Our platform turns consumer ad channels, data streams, and emerging AI workflows into measurable growth engines for go-to-market teams. We ingest billions of rows from first- and third-party sources, map them to rich company context, and surface hyper-targeted audiences and real-time performance alerts—all without vendor lock-in.

That only works if the lights stay on, queries stay fast, and incidents stay rare. That’s where you come in.

As our first dedicated Site Reliability Engineer, you’ll be the force multiplier who designs, builds, and operates the infrastructure that powers everything: petabyte-scale data pipelines, LLM-backed services, and the APIs our customers (and engineers!) rely on every day. You’ll pair hard-won ops experience with a mentor’s mindset, levelling up the whole team while keeping us four steps ahead of failure.

YOUR MISSION

Own reliability from design to customer.

Define and uphold SLOs / SLIs, manage error budgets, and lead blameless post-mortems.
Automate toil out of existence—CI / CD, infra-as-code, capacity planning, and chaos testing.
Drive incident response end-to-end: detection, mitigation, root-cause analysis, and long-term fixes.
Scale multi-cloud data pipelines (Prefect, ClickHouse, Iceberg) and GPU / LLM workloads.
Teach best practices, review designs, and coach engineers so reliability becomes a team sport.

WHAT YOU’LL DO

Design, implement, and tune distributed systems that handle high-throughput B2B traffic.

Harden our AWS stack with IaC (e.g. Terraform).

Instrument everything—logs, traces, metrics, and AI-powered anomaly detection.

Champion security, cost optimization, and disaster-recovery strategies.

Jump into the weeds when something breaks, fix it fast, then automate it away.

WHAT YOU’LL BRING

Must-Haves

5+ years owning production systems at meaningful scale (sub-second latency, “four-nines” targets).

Mastery of SRE fundamentals: SLO / SLI design, error budgets, incident playbooks.

Deep hands-on experience with Linux, networking, containers / K8s, and at least one major cloud (AWS / GCP / Azure).

Proven track record automating infra with Terraform, Helm, or similar IaC tooling.

Fluency in at least one systems / scripting language (Go, Python, Rust, etc.).

Experience operating complex data pipelines (Prefect, Airflow, Temporal) or real-time streaming systems.

History of mentoring engineers and embedding reliability culture across teams.

Pragmatic decision-maker—balances uptime, velocity, and cost for startup reality.

Curiosity for AI-augmented ops (LLM chat-ops, anomaly detection, self-healing).

Nice-to-Haves

Managed GPU clusters and ML inference workloads.

Operated data lakes / lakehouses at scale (Iceberg, Delta, etc.).

Meaningful open-source contributions in SRE, DevOps, or data-infra projects.

WHY PRIMER

Mission with impact – We’re unlocking new growth channels for thousands of B2B marketers.

High-trust, low-ego culture – Fully distributed team, meeting-light weeks, Friday focus days.

Work & life, balanced – Five weeks PTO, generous parental leave, and flexibility for families.

Career rocket-fuel – Small team, huge problems, real ownership. Shape the future with bold innovators, driving impact that redefines industries.

Diverse & global – Teammates span six countries—and counting.

INTERVIEW PROCESS

Intro Call with Engineering Manager – 30 min

System Design – 60 min

Operational Excellence Drill-down – 60 min

Strategic Pragmatism Chat with CTO – 45 min

Technical Coding / Systems Deep Dive – 30 min

Culture & Values with CEO – 45 min

Decision typically within 24-48 hrs of final conversation.

READY TO LEVEL UP B2B MARKETING INFRASTRUCTURE?

Email careers@sayprimer.com with your résumé, LinkedIn, GitHub, or anything that showcases your reliability superpowers. Let’s build the future—without the fire-drills.
