Sr Staff Engineer - Availability and Incident Management

GEICO • Austin, TX
30+ days ago
Job type
  • Full-time
Job description

At GEICO, we offer a rewarding career where your ambitions are met with endless possibilities.

Every day we honor our iconic brand by offering quality coverage to millions of customers and being there when they need us most. We thrive through relentless innovation to exceed our customers’ expectations while making a real impact for our company through our shared purpose.

When you join our company, we want you to feel valued, supported and proud to work here. That’s why we offer The GEICO Pledge: Great Company, Great Culture, Great Rewards and Great Careers.

Position Summary

GEICO is seeking an experienced Engineer with a passion for building high-performance, low-maintenance, zero-downtime platforms and applications. You will help drive our insurance business transformation as we transition from a traditional IT model to a tech organization with engineering excellence as its mission, while co-creating a culture of psychological safety and continuous improvement.

Position Description

The Senior Staff Engineer in Availability and Incident Management will engineer solutions and empower the engineering community with automated processes, data-driven insights, and technical tools that reduce incident recurrence, improve system reliability, and accelerate incident resolution. This role will be heavily centered around building automation platforms to streamline postmortem workflows, eliminate manual tracking, and provide fast feedback loops for incident prevention. You will lead the strategy and execution of a technical roadmap that increases the velocity of incident resolution, reduces repeat incidents, and unlocks new reliability engineering capabilities. The ideal candidate has broad and deep technical knowledge in incident forensics, root cause analysis, automation platforms, distributed systems, observability, and data analytics.

Position Responsibilities

As a Senior Staff Engineer, you will:

  • Lead the strategy and execution for incident retrospective and correction of error (COE) processes across the engineering organization
  • Help conduct deep technical root cause analysis and incident forensics across distributed systems using observability data, logs, metrics, and traces
  • Establish continuous improvement loops through automated trend analysis, pattern recognition algorithms, and predictive analytics
  • Design, code, and deploy automation platforms and self-service tools using Python, Go, Java, or C# that scale incident retrospective workflows and eliminate manual tracking
  • Build production-grade data pipelines, analytics systems, and real-time dashboards to measure incident trends, COE effectiveness, and action item completion rates
  • Write code for workflow automation, integrations with observability platforms, and APIs that connect incident management tools across the engineering ecosystem
  • Leverage SQL and NoSQL databases to store, query, and analyze incident data at scale using Azure tools and cloud-native services
  • Develop and maintain systems that ensure rigorous follow-through on action items, remediation plans, and preventive measures with automated tracking
  • Partner with service engineering teams to implement preventive measures and architectural improvements based on incident patterns
  • Present data-driven insights and incident trend analysis to leadership and engineering teams to drive preventive action
  • Influence and educate leadership on incident patterns, prevention strategies, and reliability best practices
  • Mentor engineers on coding best practices and automation techniques, and strengthen technical expertise across the engineering community
  • Stay current with industry advances in SRE, observability, incident management, and automation; educate teams on emerging practices

Qualifications

  • Experience building automation platforms and self-service tools for workflow management, analytics, or engineering productivity
  • Fluency in at least two modern languages such as Python, Go, Java, C++, or C#, including object-oriented design
  • Experience building microservices architectures, REST APIs, and distributed systems
  • Experience with data pipelines, analytics platforms, and visualization tools for operational metrics and KPIs
  • Experience with SQL and NoSQL databases (e.g., PostgreSQL, MongoDB, Cassandra, CosmosDB) for data storage and analytics
  • Experience with observability platforms (Prometheus, Grafana, Datadog, Splunk, ELK) and distributed systems monitoring, logging, and tracing
  • Experience with cloud providers (Azure, AWS, or GCP) and cloud-native architectures
  • Experience with CI/CD pipelines, infrastructure as code, and container orchestration (Kubernetes, Docker)
  • Experience writing workflow automation code (YAML pipelines, GitHub Actions, Azure DevOps pipelines)
  • Strong understanding of distributed systems architecture, design patterns, reliability, and scaling
  • Knowledge of retrospective facilitation, continuous improvement processes, and blameless culture principles
  • Strong architecture and design skills with ability to influence engineering direction and technical roadmap
  • Experience solving complex analytical problems with data-driven approaches
  • Proven ability to partner with cross-functional engineering teams and drive systemic improvements
  • Excellent communication skills with ability to present technical insights to leadership and influence decision-making
  • Experience leveraging GenAI or LLMs is a plus

Experience

  • 10+ years of professional platform development or general development experience
  • 8+ years of experience with architecture and design
  • 6+ years of experience in open-source frameworks
  • 4+ years of experience with AWS, GCP, Azure, or another cloud service

Education

  • Bachelor’s degree in Computer Science, Information Systems, or equivalent education or work experience

Annual Salary

$110,000.00 - $260,000.00

The above annual salary range is a general guideline. Multiple factors are taken into consideration to arrive at the final hourly rate/annual salary to be offered to the selected candidate. Factors include, but are not limited to, the scope and responsibilities of the role; the selected candidate’s work experience, education, and training; the work location; and market and business considerations.

GEICO will consider sponsoring a new qualified applicant for employment authorization for this position.

The GEICO Pledge:

Great Company: At GEICO, we help our customers through life’s twists and turns. Our mission is to protect people when they need it most and we’re constantly evolving to stay ahead of their needs.

We’re an iconic brand that thrives on innovation, exceeding our customers’ expectations and enabling our collective success. From day one, you’ll take on exciting challenges that help you grow and collaborate with dynamic teams who want to make a positive impact on people’s lives.

Great Careers: We offer a career where you can learn, grow, and thrive through personalized development programs, created with your career, and your potential, in mind. You’ll have access to industry-leading training, certification assistance, career mentorship and coaching with supportive leaders at all levels.

Great Culture: We foster an inclusive culture of shared success, rooted in integrity, a bias for action and a winning mindset. Grounded by our core values, we have an established culture of caring, inclusion, and belonging that values different perspectives. Our dynamic, multi-faceted teams are led by supportive leaders, driven by performance excellence and unified under a shared purpose.

As part of our culture, we also offer employee engagement and recognition programs that reward the positive impact our work makes on the lives of our customers.

Great Rewards: We offer compensation and benefits built to enhance your physical well-being, mental and emotional health and financial future.

  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being.
  • Financial benefits including market-competitive compensation; a 401(k) savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance.
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance.
  • Supports flexibility: We provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year.