Talent.com
Software Engineer, Site Reliability (SRE)

Sierra Business Solution • San Francisco, CA, US
10 days ago
Job type
  • Full-time
Job description

About Us

We are an in-person company based in San Francisco with growing offices in Atlanta, New York, and London, building a platform that helps businesses create better, more human customer experiences with AI.

Our core values are Trust, Customer Obsession, Craftsmanship, Intensity, and Family.

Company founders: Bret Taylor, former Salesforce and Facebook executive; Clay Bavor, former Google Labs leader.

What You'll Do

Own Sierra's observability stack—monitoring, alerting, logging, and tracing—to give engineers clear visibility into system health and performance.

Partner with product and platform engineers to design reliable, scalable systems from day one.

Design and implement scalable, secure cloud infrastructure (AWS) using Terraform and modern DevOps tooling.

Improve reliability and scalability of LLM deployments, ensuring robust, cost-effective operation.

Lead improvements to deployment pipelines, CI/CD tooling, and incident-management processes.

Define the foundation of SRE practices at Sierra, influencing culture, tooling, and best practices.

What You'll Bring

5+ years of hands-on experience in Site Reliability or infrastructure engineering for complex SaaS or cloud-based systems.

Experience designing for availability, scalability, and reliability at both infrastructure and application layers.

Deep experience with Terraform, AWS services, container orchestration, and cloud networking (IAM, VPC).

Strong background in observability systems (Prometheus, Grafana, Datadog, or similar).

Experience working with enterprise customers and familiarity with compliance and networking needs.

Comfortable working in fast-moving environments and collaborating across teams.

Degree in Computer Science or equivalent professional experience.

Even Better

Experience with LLM infrastructure—optimizing inference, managing fine-tuned models, or large-scale deployment.

Early-stage startup experience defining SRE culture and tooling from scratch.

Familiarity with incident-management automation or self-healing infrastructure patterns.

Benefits

Unlimited Paid Time Off

Medical, Dental, and Vision benefits

Life Insurance and Disability Benefits

401(k) retirement plan with company match

Parental Leave and fertility benefits via Carrot

Lunch, snacks, coffee, and discretionary stipend

Equity plans per applicable policies

Equality & Diversity

We actively encourage applicants of all backgrounds to apply. We strive to evaluate all applicants consistently without regard to race, color, religion, gender, sexual orientation, age, disability, veteran status, or any other protected characteristic.
