Site Reliability Engineering

TechniPros • Atlanta, Georgia, USA

9 days ago

Contract type

Full-time

Job description

Job Title: Site Reliability Engineering (SRE) Architect

Location: Atlanta, Georgia (Hybrid)

Long-term contract

Looking for W2 candidates. No C2C.

Role Summary:

As an SRE Architect, you will be a pivotal technical leader responsible for designing, building, and evolving the foundational systems and practices that ensure the reliability, scalability, performance, and efficiency of our critical services. Moving beyond day-to-day operations, you will focus on the strategic architectural direction of the SRE function, defining standards, blueprints, and frameworks that enable development teams and the SRE operations team to build and operate highly resilient systems. You will leverage deep expertise in software engineering, distributed systems, cloud infrastructure, and SRE principles to influence technology choices, establish best practices, and foster a proactive culture of reliability across the organization, extending well beyond the observability pillar.

Key Responsibilities:

Reliability Strategy & Design:

Architect and design highly available, scalable, secure, and cost-effective infrastructure and application patterns on AWS

Define and evangelize SRE best practices, standards, and blueprints for service design, deployment, monitoring, and operational readiness across the engineering organization

Review the current observability implementation to identify gaps and define the steps needed to reach the next level of observability maturity, providing deep insight into system health and behavior

As overall maturity improves, lead the definition and implementation strategy for Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets for critical services

Platform Architecture & Automation:

Design solutions to systematically reduce operational toil through automation and improved system design

Evaluate current SRE tools and automation frameworks (e.g., CI/CD pipelines, Infrastructure as Code modules, automated incident remediation, chaos engineering platforms) and recommend enhancements that strengthen overall capability

Evaluate, prototype, and recommend new technologies, tools, and methodologies to enhance system reliability, developer productivity, and operational efficiency

Technical Leadership & Consultation:

Act as a senior technical advisor and subject matter expert on reliability, scalability, and performance for development and platform teams

Provide architectural guidance during the design phase of new services and features to ensure reliability principles are embedded early (shift-left)

Mentor and coach other SREs and engineers, fostering technical excellence and adherence to SRE principles

Lead architectural reviews and production readiness assessments for critical systems

Resilience:

Lead blameless postmortems for significant incidents, ensuring root causes are identified and systemic architectural improvements are prioritized and implemented

Architect and advocate for resilience patterns (e.g., circuit breaking, rate limiting, graceful degradation, chaos engineering) within applications and infrastructure

Required Qualifications:

Proven experience in an architectural role designing solutions for reliability, scalability, and performance

Deep understanding and practical application of SRE principles (SLIs/SLOs, error budgets, toil reduction, automation, incident management, postmortems)

Expertise in cloud computing platforms (e.g., AWS), including infrastructure, networking, and security services

Strong experience with containerization and orchestration technologies (Kubernetes, Docker, serverless computing)

Solid experience designing and implementing observability solutions (e.g., Dynatrace, Prometheus, Grafana, ELK/EFK Stack, Jaeger, OpenTelemetry)

Strong programming/scripting skills (e.g., Python, Go, Bash) for automation and tool development

Excellent analytical, problem-solving, and strategic thinking skills

Strong communication, collaboration, and leadership skills, with the ability to influence technical direction across teams

Preferred Qualifications:

Experience designing and implementing chaos engineering practices and platforms

Best regards, Jahnavi G

Phone: 1-

Email:

Key Skills

Kubernetes, FMEA, Continuous Improvement, Elasticsearch, Go, Root Cause Analysis, Maximo, CMMS, Maintenance, Mechanical Engineering, Manufacturing, Troubleshooting

Employment Type: Full-time

Experience: years

Vacancy: 1

