This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in the United States.
We are seeking an experienced Senior Site Reliability Engineer to design, build, and maintain highly available, secure, and scalable systems that power production and machine learning workloads. In this role, you will collaborate closely with software engineers, data scientists, and platform architects to ensure reliability, performance, and operational efficiency across Kubernetes-based environments. You will lead initiatives to optimize microservices, infrastructure, and ML workflows, driving improvements in system architecture, observability, and automation. This position offers the opportunity to shape production-grade infrastructure, mentor junior engineers, and make a measurable impact on service reliability and operational excellence. The ideal candidate thrives in a fast-paced environment where software engineering and infrastructure expertise intersect.
Accountabilities
- Design, implement, and maintain cloud-native infrastructure on Kubernetes (EKS/GKE/AKS) for production systems.
- Architect and manage microservice deployments, ensuring reliable CI/CD pipelines and service performance.
- Collaborate with ML and Data teams to design, optimize, and monitor ML/AI workflows using Databricks, Spark, Flyte, Airflow, or similar tools.
- Establish and enforce SLOs/SLIs, conduct incident postmortems, and enhance system reliability and developer velocity.
- Lead improvements in architecture focusing on scalability, fault tolerance, performance, and cost optimization.
- Support secure infrastructure practices, including IAM, secret management, policy-as-code, and compliance controls.
- Mentor junior engineers and contribute to best practices across observability, infrastructure-as-code, and production readiness.
- Perform additional related duties as required.
Requirements
- Bachelor’s degree in a related field or equivalent work experience.
- 8+ years of experience in software, systems, or DevOps engineering.
- Strong expertise in Kubernetes deployment, scaling, networking, monitoring, and debugging.
- Proficiency in Golang and Python.
- Solid understanding of distributed systems, cloud architecture, and container orchestration.
- Experience building and maintaining microservice-based architectures in production.
- Familiarity with CI/CD pipelines (GitLab CI, ArgoCD, Flux, or similar).
- Deep experience with monitoring/observability tools (Datadog, Prometheus, Grafana, OpenTelemetry).
- Experience designing or operating ML workflows and data pipelines is preferred.
- Background in system design or infrastructure architecture is a plus.
- Exposure to multi-cloud environments (AWS, GCP, Azure) is a plus.
- Knowledge of security, compliance, and automation in production-grade systems is preferred.
- Contributions to open-source projects or internal platform tooling, or experience leading SRE transformations, are a plus.
- Familiarity with service meshes (Istio, Linkerd) and API gateways (Kong, Envoy) is a plus.
Benefits
- Competitive salary range: $138,000–$213,000.
- Performance-based bonuses and stock options.
- Unlimited paid time off.
- Health, dental, and vision coverage.
- Remote or hybrid work flexibility.
- Opportunities for professional growth and impact in a collaborative environment.
Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.
When you apply, your profile goes through our AI-powered screening process designed to identify top talent efficiently and fairly.
🔍 Our AI evaluates your CV and LinkedIn profile thoroughly, analyzing your skills, experience, and achievements.
📊 It compares your profile to the job’s core requirements and past success factors to determine your match score.
🎯 Based on this analysis, we automatically shortlist the three candidates with the highest match to the role.
🧠 When necessary, our human team may perform an additional manual review to ensure no strong profile is missed.
The process is transparent, skills-based, and free of bias — focusing solely on your fit for the role. Once the shortlist is completed, we share it directly with the company that owns the job opening. Their internal hiring team then makes the final decision and manages next steps such as interviews or further assessments.
Thank you for your interest!