Talent.com
Principal Site Reliability Engineer (SRE)
No longer accepting applications
InStride • Los Angeles, CA, US
15 hours ago
Job type
  • Full-time
Job description

Principal Site Reliability Engineer (SRE)

We're looking for a Principal Site Reliability Engineer (SRE) to join InStride's growing engineering team. This is a highly technical role for an individual contributor who thrives at the intersection of cloud architecture, automation, and reliability engineering. You will be the go-to AWS expert for complex initiatives, setting technical direction and raising the bar for operational excellence across our platform. At InStride, every system you design, every automation you implement, and every safeguard you put in place will directly support our mission of expanding access to life-changing education for working adults around the globe.

Skills we'd love to see you show off:

  • Cloud Architecture & Strategy: Design and optimize AWS environments that balance scalability, resilience, and cost efficiency for enterprise workloads.
  • Technical Leadership & Mentorship: Serve as a trusted technical advisor, guiding engineers on best practices in Kubernetes, DevSecOps, and AWS-native design patterns.
  • Infrastructure as Code Mastery: Build reusable, version-controlled IaC libraries with AWS CDK, Terraform, or CloudFormation to standardize deployments.
  • Security & Compliance by Design: Enforce least-privilege IAM, encryption-by-default, and policy-as-code guardrails to meet security and regulatory standards.
  • Observability & Reliability Engineering: Define SLIs/SLOs, manage error budgets, and implement monitoring strategies with Prometheus, Grafana, and AWS-native tools.
  • CI/CD Excellence: Optimize automated pipelines with Harness and GitHub, enabling faster, safer, and more reliable software delivery.
  • Networking & Resilience: Architect secure, performant VPCs, load balancing, and multi-region failover strategies with AWS networking services.
  • Automation & Self-Service Enablement: Deliver developer-friendly automation and Internal Developer Portal (IDP) capabilities that empower teams to provision infrastructure without SRE intervention.

Who you are:

  • 10+ years of experience in SRE, DevOps, or Platform Engineering roles operating production AWS workloads.
  • Hands-on expertise with AWS EKS, Kubernetes networking, Helm, autoscaling frameworks (Karpenter/Cluster Autoscaler), serverless architectures, and API Gateways.
  • Proven delivery of service mesh solutions (Istio, Linkerd, or AWS App Mesh) for secure and observable service-to-service communication.
  • Proficiency with Infrastructure as Code (IaC) using AWS CDK (TypeScript preferred/Python), Terraform, or CloudFormation.
  • Strong programming and automation skills in Go, Python, or TypeScript, with additional proficiency in Bash.
  • Demonstrated experience implementing policy-as-code with OPA/Rego or similar tooling integrated into CI/CD pipelines.
  • Solid understanding of SLI/SLO/error-budget methodologies and hands-on experience with monitoring and alerting stacks (Prometheus, Grafana, CloudWatch, Groundcover).
  • Deep knowledge of AWS security best practices, including IAM policies, encryption, OS hardening, and compliance enforcement.
  • Excellent communication skills with the ability to translate reliability metrics into business impact and guide incident/post-mortem discussions.
  • Experience mentoring engineers and influencing enterprise AWS and DevOps strategies without direct management responsibilities.
  • Familiarity with Internal Developer Portals (Backstage, Port, Cortex) and self-service automation is a strong plus.

How you will create impact:

  • Elevate platform reliability: Design and operate multi-region, fault-tolerant systems that ensure InStride's learning platform is always available for learners and partners.
  • Advance automation at scale: Deliver Infrastructure as Code libraries, CI/CD pipelines, and self-service capabilities that reduce operational toil and accelerate developer productivity.
  • Champion security and compliance: Implement defense-in-depth strategies, policy-as-code guardrails, and proactive monitoring to protect sensitive data and maintain trust.
  • Drive observability maturity: Define and enforce SLIs/SLOs, establish error-budget policies, and build monitoring frameworks that inform release readiness and operational decisions.
  • Enable seamless service connectivity: Deploy and manage service mesh solutions that secure, monitor, and optimize service-to-service communication across Kubernetes workloads.
  • Influence technical direction: Partner with engineering and security stakeholders to shape InStride's AWS strategy, ensuring scalability, resilience, and cost efficiency.
  • Mentor and uplift engineers: Share expertise, lead design reviews, and guide teams toward modern DevOps and SRE practices, raising the technical bar across the organization.

Compensation

At InStride, final offer amounts depend on multiple factors, including location, depth of experience, interview performance, and equity with other team members.

We encourage you to talk with your recruiter to learn more about the total compensation and benefits available for this role.

Compensation range:

$165,000–$185,000 USD

We are looking for someone who is not only technically skilled, but also enthusiastic about making a meaningful impact. If this description resonates with you, we're excited about the possibility of having you on our team. As a skills-driven employer, we encourage you to apply if there is a skill fit, even in the absence of years of experience.

Don't meet every single requirement? We encourage you to apply because we value diverse experience.
