Talent.com
Site Reliability Engineer

P2P • San Francisco, CA, United States
30+ days ago
Job type
  • Full-time
Job description

Our mission is to bring web3 to a billion people, by providing builders with the tools they need to build exceptional onchain products. Alchemy is the only complete developer platform that offers the powerful APIs, SDKs, and tools necessary to build and scale onchain apps and rollups.

Our infrastructure powers 70% of the top web3 teams, 90%+ of web2 companies building in web3 and 100+ million end users. Our customers include top web3 brands like Polymarket, OpenSea, Circle, WorldCoin, as well as major global brands like Shopify and Adobe.

The Alchemy team draws from decades of deep expertise in massively scalable infrastructure, AI, and blockchain from leadership roles at leading companies and universities like Google, Microsoft, Facebook, Stanford, and MIT.

We're backed by the world's leading VCs and institutions, including Lightspeed, Silver Lake, a16z, Coatue, Pantera, Addition, Stanford University, Coinbase, and Charles Schwab, among others.

The Role

As an engineer in the Infrastructure department at Alchemy, you will collaborate with our engineering team to design, deploy, and continuously improve the infrastructure supporting our globally used developer platform. Your focus will be on enhancing developer productivity and ensuring product reliability as we scale.

The Infrastructure team’s mission is to provide the infrastructure, tooling, and expertise that allow Alchemy engineers to ship, scale, and operate high-quality products for our customers in a fast, safe, and cost-efficient manner.

Come and help us build, maintain and scale the underlying infrastructure that is required to build products that delight our customers when it comes to reliability, latency and cost.

What You'll Do:

  • Set high standards for Reliability at Alchemy
  • Develop and own company-wide Reliability best practices such as SLO definition, incident management, postmortem reviews, launch readiness reviews, and change management
  • Architect production infrastructure and tools that encourage and enforce high reliability
  • Inspire the broader engineering organization to ensure Reliability is a first-class citizen in the products we build
  • Collaborate with, advise, review, and mentor engineering teams on Reliability topics such as high-reliability architecture, observability, and safe change management
  • Improve critical infrastructure and systems that are used to operate infrastructure at scale (e.g., compute, networking, deployment, observability, code tooling and libraries)
  • Develop and own best practices for managing production infrastructure: provisioning, application scaling, configuration management, capacity planning, monitoring, etc.
  • Develop and own best practices for developer processes: CI/CD, dev and staging environments, etc.
  • Provide input into long-term platform requirements and operational guidelines with a focus on reliability
  • Continuously raise our standard of engineering excellence by implementing best practices for coding, testing, and deployment
  • Build and maintain documentation around process and workflows

What We're Looking For:

  • 5+ years of experience as an Infrastructure Engineer focused on Reliability (e.g., Site Reliability Engineer, Production Engineer, Platform Engineer)
  • Experience leading and driving company-wide reliability efforts and engineering initiatives
  • Experience with observability best practices and tooling like Prometheus, Grafana and Datadog
  • Experience designing and operating large-scale, multi-region production systems
  • Experience working with AWS or other cloud infrastructures
  • Experience with container schedulers and runtimes such as Kubernetes and Docker
  • Experience with Infrastructure-as-Code (e.g., Terraform, Pulumi, Chef, Puppet)
  • The cross-functional nature of this role requires strong communication and collaboration skills
  • (Preferred) Experience with running production services on bare-metal
  • (Preferred) Experience with TypeScript and Python
  • (Preferred) Excellent understanding of web applications and architecture

More on The Role

Alchemy is committed to offering competitive compensation, including base salary as well as equity. Additionally, Alchemy offers comprehensive medical, dental, and vision coverage, as well as other benefits such as 401k and unlimited flexible time off.

The base salary range for this position is estimated to be between $135,000 and $275,000 annually. Please note this range reflects base salary only and does not include bonus, equity, or benefits. Your salary will be determined by various factors, including relevant experience, skill set, qualifications, and other business needs.
