Applications are no longer being accepted
Site Reliability Engineer

Primer
San Francisco, California, US
Posted 5 days ago
Contract type
  • Full-time
Job description

Primer helps B2B products break out of the B2C-centric marketing box. Our platform turns consumer ad channels, data streams, and emerging AI workflows into measurable growth engines for go-to-market teams. We ingest billions of rows from first- and third-party sources, map them to rich company context, and surface hyper-targeted audiences and real-time performance alerts—all without vendor lock-in.

That only works if the lights stay on, queries stay fast, and incidents stay rare. That’s where you come in.

As our first dedicated Site Reliability Engineer, you’ll be the force multiplier who designs, builds, and operates the infrastructure that powers everything: petabyte-scale data pipelines, LLM-backed services, and the APIs our customers (and engineers!) rely on every day. You’ll pair hard-won ops experience with a mentor’s mindset, levelling up the whole team while keeping us four steps ahead of failure.

YOUR MISSION

Own reliability from design to customer.

  • Define and uphold SLOs/SLIs, manage error budgets, and lead blameless post-mortems.
  • Automate toil out of existence through CI/CD, infra-as-code, capacity planning, and chaos testing.
  • Drive incident response end-to-end: detection, mitigation, root-cause analysis, and long-term fixes.
  • Scale multi-cloud data pipelines (Prefect, ClickHouse, Iceberg) and GPU/LLM workloads.
  • Teach best practices, review designs, and coach engineers so reliability becomes a team sport.

WHAT YOU’LL DO

  • Design, implement, and tune distributed systems that handle high-throughput B2B traffic.
  • Harden our AWS stack with IaC (e.g. Terraform).
  • Instrument everything—logs, traces, metrics, and AI-powered anomaly detection.
  • Champion security, cost optimization, and disaster-recovery strategies.
  • Jump into the weeds when something breaks, fix it fast, then automate it away.

WHAT YOU’LL BRING

Must-Haves

  • 5+ years owning production systems at meaningful scale (sub-second latency, “four-nines” targets).
  • Mastery of SRE fundamentals: SLO/SLI design, error budgets, incident playbooks.
  • Deep hands-on experience with Linux, networking, containers/K8s, and at least one major cloud (AWS/GCP/Azure).
  • Proven track record automating infra with Terraform, Helm, or similar IaC tooling.
  • Fluency in at least one systems/scripting language (Go, Python, Rust, etc.).
  • Experience operating complex data pipelines (Prefect, Airflow, Temporal) or real-time streaming systems.
  • History of mentoring engineers and embedding reliability culture across teams.
  • Pragmatic decision-maker—balances uptime, velocity, and cost for startup reality.
  • Curiosity for AI-augmented ops (LLM chat-ops, anomaly detection, self-healing).

Nice-to-Haves

  • Managed GPU clusters and ML inference workloads.
  • Operated data lakes/lakehouses at scale (Iceberg, Delta, etc.).
  • Meaningful open-source contributions in SRE, DevOps, or data-infra projects.

WHY PRIMER

  • Mission with impact – We’re unlocking new growth channels for thousands of B2B marketers.
  • High-trust, low-ego culture – Fully distributed team, meeting-light weeks, Friday focus days.
  • Work & life, balanced – Five weeks PTO, generous parental leave, and flexibility for families.
  • Career rocket-fuel – Small team, huge problems, real ownership. Shape the future with bold innovators, driving impact that redefines industries.
  • Diverse & global – Teammates span six countries—and counting.

INTERVIEW PROCESS

  • Intro Call with Engineering Manager – 30 min
  • System Design – 60 min
  • Operational Excellence Drill-down – 60 min
  • Strategic Pragmatism Chat with CTO – 45 min
  • Technical Coding / Systems Deep Dive – 30 min
  • Culture & Values with CEO – 45 min
  • Decision typically within 24-48 hrs of final conversation.

READY TO LEVEL UP B2B MARKETING INFRASTRUCTURE?

Email careers@sayprimer.com with your résumé, LinkedIn, GitHub, or anything that showcases your reliability superpowers. Let’s build the future, without the fire-drills.
