Principal Site Reliability Engineer - 29356

Splunk Inc
Kansas, United States
$181.2K-$249.2K a year
Full-time

Join us as we pursue our disruptive new vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers.

At Splunk, we’re committed to our work, our customers, having fun, and, most importantly, each other’s success. Learn more about Splunk careers and how you can become a part of our journey! You will set up, manage, maintain, and troubleshoot customer-facing systems in the Splunk Cloud Platform.

This position is an opportunity to join the team that is responsible for Splunk Cloud’s operational infrastructure and delivery.

The candidate should have experience managing deployments and providing strategies in a cloud environment. This is an incredible opportunity to use your existing AWS or GCP cloud experience and technical leadership to drive the growth of our cloud offerings here at Splunk!

Responsibilities:

  • Manage AWS, GCP, or Azure server/storage deployments, including release process, backup/restore, and HA/DR.
  • Enhance the monitoring environment using Splunk Core and Observability products.
  • Create and enhance automation processes to improve operational efficiencies.
  • Develop tools and automation in Python, Shell, Golang, or similar to monitor system stability and performance. Ensure system availability, reliability, and usability.
  • Perform end-to-end administration of Cloud infrastructure, focusing on Linux-based systems.
  • Troubleshoot and take ownership of complex problems and resolve high-impact operational issues.
  • Work across the Cloud organization to deliver quality products that delight Splunk's passionate users.
  • Lead groups of tight-knit, super-smart engineers who are building a state-of-the-art, cloud-based environment for massive-scale data processing.
  • Mentor and help new engineers to achieve more than they thought possible.
  • Occasionally, work directly with customers on challenging issues to speed understanding and resolution, ensuring high customer satisfaction.

Requirements:

  • Extensive experience as a Linux Systems Administrator supporting enterprise computing platforms and systems
  • Splunk administration and architecture design experience
  • Deep understanding of Splunk’s Search Processing Language (SPL) is a plus
  • Experience with virtualization and/or cloud technologies and infrastructure
  • Experience managing systems specifically in one of AWS, GCP, or Azure
  • Experience supporting customer-facing multi-tenant infrastructure (SaaS) or similar cloud-related services
  • Experience with a coding or scripting language, preferably Python, Golang, or Shell
  • Experience with open-source-based systems and tools
  • Experience as an engineer who occasionally works directly with customers

Posted 14 days ago