Talent.com
CloudDevs: Senior Site Reliability Engineer (SRE)
The10minutecareersolution • San Francisco, CA, United States
6 days ago
Job type
  • Full-time
Job description

CloudDevs works with fast-moving, venture-backed startups across the US. We’re building a pool of world-class Site Reliability Engineers for current roles and upcoming opportunities. You’ll either be placed directly into one of our partner startups or added to our vetted SRE network for future projects.

This role is ideal for engineers who care about reliability, metrics, performance, and building simple, scalable systems. If you enjoy designing for scale and improving how teams ship software, you’ll fit right in.

Key Responsibilities

  • Work as a hands-on engineer focused on system reliability, performance, and observability.
  • Define and track SLIs, SLOs, and error budgets.
  • Optimize monitoring cost and signal quality across metrics, logs, and traces.
  • Improve deployment safety, canary rollouts, and UAT pipelines.
  • Build tools for automated and local performance testing and track benchmarks.
  • Lead resilience work like failover drills, chaos tests, and redundancy checks.
  • Partner with engineering teams to improve scaling patterns and architecture as the product grows.
  • Support incident response processes and help reduce operational noise.
  • Write clear, maintainable code in Go, Python, or Node.js.
  • Contribute to CI/CD improvements and automation efforts.
  • Collaborate with engineers across teams to raise reliability standards.

Requirements

  • 5+ years in SRE, DevOps, or Platform Engineering roles.
  • Strong experience with cloud infrastructure (AWS preferred), Terraform, and Kubernetes.
  • Deep knowledge of observability tools like DataDog, Prometheus, or OpenTelemetry.
  • Strong debugging skills across services, networking, and data layers.
  • Hands-on experience designing and tracking SLIs/SLOs.
  • Experience with CI/CD tools such as GitHub Actions, Jenkins, or ArgoCD.
  • Ability to write production-grade code in Go, Python, or Node.js.
  • Comfort working independently in fast-paced environments.

Nice to Have

  • Experience tuning observability costs and optimizing data ingestion.
  • Exposure to chaos engineering and progressive deployments.
  • Background with high-throughput or latency-sensitive systems.
  • AWS at scale (EKS, Lambda, DynamoDB, S3).
  • Experience in regulated industries like fintech, payments, or SOC2 environments.
  • Performance testing pipelines or load-testing automation.
  • Experience handling systems processing tens of millions of API calls.

Open Pool for SREs

Even if you don’t meet every requirement or aren’t a fit for the current role, strong SREs with real production experience are welcome to join our talent pool. We regularly place engineers with different strengths across reliability, DevOps, platform, observability, backend, and infrastructure engineering.
