Senior Site Reliability Engineer

Ahold Delhaize USA • Quincy, MA, US
22 hours ago
Job type
  • Full-time
Job description

Ahold Delhaize USA, a division of global food retailer Ahold Delhaize, is part of the U.S. family of brands, which includes five leading omnichannel grocery brands: Food Lion, Giant Food, The GIANT Company, Hannaford, and Stop & Shop.

The Site Reliability Engineer (SRE) III is responsible for ensuring the scalability, reliability, and performance of production systems through automation, observability, incident response, and infrastructure engineering. This role involves designing and implementing robust operational processes and tooling to support highly available, fault-tolerant systems in a cloud-native environment.

The SRE III collaborates closely with engineering squads, product teams, and stakeholders to embed reliability best practices across the software delivery lifecycle. The role includes ownership of system uptime, service level objectives (SLOs), and operational excellence, along with mentoring junior engineers and leading cross-functional initiatives that improve system resilience.

Our flexible, hybrid work schedule includes 3 in-person days at our Chicago, IL; Quincy, MA; or Salisbury, NC office, and 2 remote days.

Applicants must be currently authorized to work in the United States on a full-time basis.

Responsibilities

  • Design and implement infrastructure solutions that ensure system availability, scalability, and reliability across cloud-native environments like AKS and Kubernetes.
  • Develop automation for provisioning, deployment, configuration, monitoring, and incident remediation using tools such as Terraform, ArgoCD, and GitHub Actions.
  • Collaborate with engineering teams to define and track service level objectives (SLOs) and service level indicators (SLIs).
  • Build and manage microservices-based platforms leveraging Spring Boot, Java, Tomcat, and Redis.
  • Monitor production environments using Datadog and proactively address performance and reliability issues.
  • Perform root cause analysis and lead post-incident reviews to drive continual improvement.
  • Manage CI/CD pipelines and deployment automation using GitHub, Docker, and container orchestration technologies.
  • Create and maintain infrastructure as code (IaC) using Terraform, with deployment pipelines integrated into GitOps workflows.
  • Lead and support operational readiness reviews, game days, chaos engineering practices, and failure mode analysis.
  • Build scalable observability and alerting frameworks with Datadog.
  • Implement resilient, asynchronous architectures using Kafka for event-driven services.
  • Reduce operational toil through self-healing automation and proactive system tuning.
  • Troubleshoot Linux-based environments such as Ubuntu and optimize them for reliability.
  • Provide on-call support and ensure 24/7/365 system reliability for mission-critical applications.
  • Collaborate with the security team to enforce secure operational practices and cloud compliance.
  • Mentor junior engineers and contribute to documentation, technical design, and knowledge-sharing across the organization.

Qualifications

  • Bachelor's Degree in Computer Science, Information Systems, or a related technical field; equivalent training, certifications, or experience will be considered.
  • 5+ years of experience in a Site Reliability Engineering, DevOps, or Java programming role.
  • Experience managing production-grade systems and services on AKS/Kubernetes in distributed environments.
  • Proficiency in programming and scripting languages including Python, Java, Bash, or Go.
  • Proven experience with Spring Boot, Tomcat, Redis, and microservices architecture.
  • Hands-on experience in managing Linux environments, particularly Ubuntu.
  • Proficiency with observability stacks and performance monitoring using Datadog, Prometheus, and ELK.
  • Deep understanding of containerization and orchestration using Docker, Kubernetes, and ArgoCD.
  • Experience managing event-driven systems using Kafka.
  • Expertise in IaC and automation using Terraform and GitHub Actions.
  • Familiarity with networking concepts, DNS, load balancing, and cloud infrastructure (AWS, Azure, or GCP).
  • Strong analytical, debugging, and problem-solving skills.
  • Excellent verbal and written communication skills and the ability to collaborate effectively across teams.

Salary Range: $125,040 - $187,560

Ahold Delhaize USA is an equal opportunities employer.
