Talent.com
Software Engineer, Site Reliability (SRE)

Sierra Business Solution • San Francisco, CA, US
11 days ago
Job type
  • Full-time
Job description

About Us

We are an in-person company based in San Francisco with growing offices in Atlanta, New York, and London, building a platform that helps businesses create better, more human customer experiences with AI.

Our core values are Trust, Customer Obsession, Craftsmanship, Intensity, and Family.

Company founders: Bret Taylor, former Salesforce and Facebook executive, and Clay Bavor, former Google Labs leader.

What You'll Do

Own Sierra's observability stack—monitoring, alerting, logging, and tracing—to give engineers clear visibility into system health and performance.

Partner with product and platform engineers to design reliable, scalable systems from day one.

Design and implement scalable, secure cloud infrastructure (AWS) using Terraform and modern DevOps tooling.

Improve reliability and scalability of LLM deployments, ensuring robust, cost-effective operation.

Lead improvements to deployment pipelines, CI/CD tooling, and incident-management processes.

Define the foundation of SRE practices at Sierra, influencing culture, tooling, and best practices.

What You'll Bring

5+ years of hands-on experience in Site Reliability or infrastructure engineering for complex SaaS or cloud-based systems.

Experience designing for availability, scalability, and reliability at both infrastructure and application layers.

Deep experience with Terraform, AWS services, container orchestration, and cloud networking (IAM, VPC).

Strong background in observability systems (Prometheus, Grafana, Datadog, or similar).

Experience working with enterprise customers and familiarity with their compliance and networking needs.

Comfortable working in fast-moving environments and collaborating across teams.

Degree in Computer Science or equivalent professional experience.

Even Better

Experience with LLM infrastructure—optimizing inference, managing fine-tuned models, or large-scale deployment.

Early-stage startup experience defining SRE culture and tooling from scratch.

Familiarity with incident-management automation or self-healing infrastructure patterns.

Benefits

Unlimited Paid Time Off

Medical, Dental, and Vision benefits

Life Insurance and Disability Benefits

401(k) retirement plan with company match

Parental Leave and fertility benefits via Carrot

Lunch, snacks, coffee, and discretionary stipend

Equity plans per applicable policies

Equality & Diversity

We actively encourage applicants of all backgrounds to apply. We strive to evaluate all applicants consistently without regard to race, color, religion, gender, sexual orientation, age, disability, veteran status, or any other protected characteristic.


