Talent.com
Sr. Site Reliability Engineer

Kesta IT • Culver City, CA, United States
2 days ago
Job type
  • Full-time
  • Permanent
Job description

Come build, innovate, disrupt, and thrive!

KēSTA I.T. is actively seeking a Sr. Site Reliability Engineer for an immediate full-time opportunity with our industry-leading client.

Are you on the lookout for a unique career opportunity that offers leadership, responsibility, and the chance to make a significant impact? If you're eager to contribute to a thriving and stable organization while maintaining your confidentiality, continue reading.

Overview

A leading technology company in the immersive content space is searching for a Senior Site Reliability Engineer to help scale a global platform that delivers their product. This role is ideal for someone who thrives in fast-paced environments, enjoys automating at scale, and is passionate about building fault-tolerant systems that serve millions of users reliably.

In this position, you'll design and refine the backbone of our cloud infrastructure, ensuring uptime, observability, and security across multiple tenants and delivery pipelines. You'll collaborate closely with software and DevOps teams to define reliability goals, establish best practices, and develop the monitoring and automation that keep our systems healthy around the clock.

Key Responsibilities

  • Architect and automate scalable cloud infrastructure leveraging Terraform and modern container technologies such as Kubernetes.
  • Optimize global CDN performance and end-to-end content delivery pipelines to improve streaming quality and latency.
  • Build and maintain observability frameworks that include defining SLIs/SLOs, implementing alerting systems, and ensuring actionable insights through Grafana and Prometheus dashboards.
  • Establish proactive capacity planning and load testing strategies to ensure reliability during rapid growth and high-traffic periods.
  • Drive continuous improvement through incident management, root cause analysis, and post-incident reviews that strengthen system resiliency.
  • Participate in on-call rotations and help define escalation workflows that uphold 24/7 service availability.
  • Collaborate with engineering teams to embed reliability and security principles into every stage of the deployment lifecycle.
  • Mentor team members on operational readiness, reliability patterns, and scalable system design.
  • Contribute to internal standards for compliance and data protection (SOC 2, GDPR, ISO 27001, open-source licensing).

Qualifications

  • 7+ years of hands-on experience in SRE or DevOps, focused on building reliable, distributed systems at scale.
  • Deep technical knowledge of AWS, CoreWeave, or other major cloud environments, with strong experience in container orchestration and Terraform-based automation.
  • Proven success managing multi-tenant architectures and applying best practices for data isolation, access control, and system hardening.
  • Skilled in monitoring, metrics, and tracing tools (e.g., Prometheus, Grafana) and experienced in using data-driven insights to enhance system performance.
  • Familiarity with security frameworks and automated auditing for compliance (SOC 2, GDPR, ISO 27001).
  • Strong leadership and mentorship capabilities; able to influence engineering culture and champion best practices around uptime, scalability, and operational excellence.

About KēSTA I.T.

Our name says it all: KēSTA I.T. (Keys-to-I.T.) and our people are the keys to our success! KēSTA I.T. is a premier Utah-based technical staffing and consulting services firm. We specialize in temporary and permanent placement for Software, Hardware, Network, Cloud, CRM/ERP, Data, End-User Support, Web, and Executive/leadership positions on a full-time and consulting basis. If you're interested in a role where top performance is rewarded, personal time is valued, and excellence is demanded at every level, we want to talk to you today!

Where do you want to go? We've got the keys! ~ KēSTA I.T.

WWW.KeSTAIT.COM
