Principal Site Reliability Engineer

DMV IT Service • Washington, DC, US
12 hours ago
Job type
  • Full-time
Job description

Job Title: Principal Site Reliability Engineer

Location: Washington, D.C.

Employment Type: Contract

About Us:

DMV IT Service LLC, founded in 2020, is a trusted IT consulting firm specializing in IT infrastructure optimization, cybersecurity, networking, and staffing solutions. We partner with clients to achieve technology goals through expert guidance, workforce support, and innovative solutions. With a client-focused approach, we also provide online training and job placements, ensuring long-term IT success.

Job Purpose:

We are seeking a highly skilled Principal Site Reliability Engineer to lead and elevate the reliability, scalability, and security of critical infrastructure systems. This position requires a seasoned technical professional with deep expertise in infrastructure automation (IaC), CI/CD architecture, and cloud security, combined with hands-on experience in Site Reliability Engineering (SRE) principles such as SLOs, error budgets, and incident management. The ideal candidate will provide technical leadership, mentor cross-functional teams, and ensure systems are built for performance, resilience, and efficiency.

Requirements

Key Responsibilities:

  • Reliability & Operations: Establish and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs); oversee incident response, root cause analysis, and continuous service improvement initiatives.
  • Infrastructure Automation: Architect and manage scalable and secure cloud infrastructures using Infrastructure-as-Code (IaC) tools such as Terraform, Ansible, and CloudFormation.
  • CI/CD Optimization: Build and optimize secure CI/CD pipelines (e.g., GitHub Actions, Jenkins) with automated rollbacks, canary and blue-green deployments, and artifact validation processes.
  • Observability & Monitoring: Develop advanced observability systems by creating dashboards, configuring alerts, and implementing synthetic checks for complete system visibility.
  • Security Integration: Embed security testing and compliance tools (SAST, DAST, SBOM, secret scanning) into deployment workflows and enforce security policies as code.
  • Cost & Capacity Management: Track and optimize cloud costs, manage capacity planning, and ensure efficient infrastructure utilization and uptime.
  • Platform Enablement: Develop self-service tools and shared frameworks that enhance developer efficiency and maintain delivery consistency.
  • Leadership & Mentorship: Act as a technical leader, mentor engineering teams, and champion best practices in reliability, automation, and secure delivery.

Required Skills & Experience:

  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • At least 5 years of experience in SRE, DevOps, or Platform Engineering, with leadership in reliability and automation.
  • Minimum 3 years managing production-grade cloud systems using modern security and observability tools.
  • Strong expertise in AWS, Azure, or GCP, especially in Compute, Networking, and IAM.
  • Hands-on proficiency with Terraform, CloudFormation, Kubernetes, and Docker.
  • Solid background in Linux systems, shell scripting, and programming in Python, Go, or Bash.
  • Proficient with observability tools such as Prometheus, Grafana, ELK, Datadog, or CloudWatch.
  • Proven experience designing and managing secure CI/CD pipelines and GitOps workflows.
  • Deep understanding of SRE practices, including chaos engineering, SLO/SLA management, and capacity modeling.
  • Strong documentation, communication, and leadership skills with a record of improving operational standards.