Talent.com
CloudDevs: Senior Site Reliability Engineer (SRE)
Breakout Tools • San Francisco, CA, United States
Posted 1 day ago
Contract type
  • Full-time
Job description

CloudDevs works with fast-moving, venture-backed startups across the US. We’re building a pool of world-class Site Reliability Engineers for current roles and for upcoming opportunities. You will either be placed directly into one of our partner startups or added to our vetted SRE network for future projects.

This role is ideal for engineers who care about reliability, metrics, performance, and building simple, scalable systems. If you enjoy designing for scale and improving how teams ship software, you’ll fit right in.

Key Responsibilities

  • Work as a hands‑on engineer focused on system reliability, performance, and observability.
  • Define and track SLIs, SLOs, and error budgets.
  • Optimize monitoring cost and signal quality across metrics, logs, and traces.
  • Improve deployment safety, canary rollouts, and UAT pipelines.
  • Build tools for automated and local performance testing and track benchmarks.
  • Lead resilience work like failover drills, chaos tests, and redundancy checks.
  • Partner with engineering teams to improve scaling patterns and architecture as the product grows.
  • Support incident response processes and help reduce operational noise.
  • Write clean, maintainable code in Go, Python, or Node.js.
  • Contribute to CI/CD improvements and automation efforts.
  • Collaborate with engineers across teams to raise reliability standards.

Requirements

  • 5+ years in SRE, DevOps, or Platform Engineering roles.
  • Strong experience with cloud infrastructure (AWS preferred), Terraform, and Kubernetes.
  • Deep knowledge of observability tools like DataDog, Prometheus, or OpenTelemetry.
  • Strong debugging skills across services, networking, and data layers.
  • Hands‑on experience designing and monitoring SLIs/SLOs.
  • Experience with CI/CD tools such as GitHub Actions, Jenkins, or ArgoCD.
  • Ability to write production‑grade code in Go, Python, or Node.js.
  • Comfort working independently in fast‑paced environments.

Nice to Have

  • Experience tuning observability costs and optimizing data ingestion.
  • Exposure to chaos engineering and progressive deployments.
  • Background with high‑throughput or latency‑sensitive systems.
  • Experience running AWS at scale (EKS, Lambda, DynamoDB, S3).
  • Experience in regulated industries like fintech, payments, or SOC2 environments.
  • Performance testing pipelines or load‑testing automation.
  • Experience handling systems processing tens of millions of API calls.

Open Pool for SREs

Even if you don’t meet every requirement or aren’t a fit for the current role, strong SREs with real production experience are welcome to join our talent pool. We regularly place engineers with different strengths across reliability, DevOps, platform, observability, backend, and infrastructure engineering.
