Site Reliability Engineering
Needs to be onsite from day 1 in Richmond, Virginia
Must-haves: Log Data
The more Observability tools the better (Datadog, New Relic, Splunk, ELK, etc.)
Programming languages are a nice-to-have
MySQL
Key Responsibilities:
- Develop and maintain queries and dashboards to analyze customer impact across Discover’s applications.
- Work with log data from multiple platforms including Datadog, ElasticSearch, MySQL, and Splunk.
- Collaborate with Technology Operations Center (TOC) and SRE teams to support operational visibility and incident response.
- Contribute to strategic planning for Discover’s long-term environment and data migration strategy.
- Ensure flexibility in tooling by focusing on cross-platform log querying rather than specialization in a single tool.
Qualifications:
- Strong experience querying log data across various platforms (Datadog, ElasticSearch, Splunk, etc.).
- Familiarity with SRE principles and operational monitoring.
- Ability to work across diverse technical stacks and environments.
- Hands-on experience preferred over purely strategic or leadership backgrounds.
- Programming skills are a plus but not required.
Preferred Profile:
- A technically adept individual with a strong SRE or TOC background.
- Comfortable navigating hybrid environments (cloud and data center).
- Able to adapt quickly to evolving tools and infrastructure.
- Strategic mindset with a hands-on approach to problem-solving.