
Site Reliability Engineer-FedRAMP, AWS (FULLY REMOTE) - 29122

Splunk Inc
Boulder, Colorado, United States
$146.4K-$201.3K a year
Remote
Full-time

Job Description

Join us as we pursue our disruptive vision to make machine data accessible, usable and valuable to everyone.

We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers.

At Splunk, we’re committed to our work, customers, having fun and most significantly to each other’s success. Learn more about Splunk careers and how you can become a part of our journey!

Role :

Splunk's Cloud Services group is looking for a Site Reliability Engineer to help lead, design and build the next generation of our large-scale cloud offering.

You will be working on core services and applications that form the primitives for our current and future cloud service offerings.

Site Reliability Engineers in this role will engage with multiple service owners across the platform to teach and implement modern interpretations of SRE, observability, Chaos Engineering and DevOps.

This role is highly visible and impactful to the organization and will help shape Splunk's Engineering culture for years to come.

Your job, in a nutshell, is to make every team around you better... including your own!

This is a remote role available in all US states except AK, ND, and WY.

You also have the option of an office desk in some locations if that's convenient and desirable for you!

You will :

  • Own Splunk Cloud in FedRAMP environments.
  • Work across the organization to deliver quality products that delight Splunk's passionate users.
  • Lead teams of tight-knit engineers who are building a state-of-the-art, cloud-based environment for massive-scale data processing.
  • Mentor and help new engineers to achieve more than they thought possible. You enjoy making other teams successful and are fulfilled through the success of others.

Qualifications :

  • You have experience or an interest in working with regulated computing environments such as FISMA and / or FedRAMP and are enthusiastic about doing it better.
  • This is a fully remote, US-based / work-from-home position. You must be a US Citizen working on US soil to be considered.
  • You have owned and operated Kubernetes clusters and their associated ecosystems. Kubernetes certifications, or an interest in obtaining them, are a plus, such as those from the Cloud Native Computing Foundation: Certified Kubernetes Administrator (CKA), Certified Kubernetes Application Developer (CKAD), or Certified Kubernetes Security Specialist (CKS).
  • You have experience deploying and operating services on the AWS cloud platform.
  • You enjoy building and running distributed systems at scale in production. You understand the challenges and trade-offs to be made when building and deploying systems to production.
  • Deep understanding of Linux systems (network stack, file system, OS services) and networking (L2 vs. L3, network architecture, VLANs, etc.).
  • Experience with at least one programming language, preferably Go or Python. Knowledge of working with and automating Linux system tasks using this language is required, including working with configuration files and system services. Knowledge of common data structures and algorithms, as well as their performance characteristics, is required.
  • Knowledge of standard methodologies related to security, performance, and disaster recovery.
  • Highly skilled in identifying performance bottlenecks, detecting anomalous system behavior, and resolving the root cause of service issues.
  • You have assembled open source components into cohesive services.
  • You've demonstrated the skills to effectively work across teams and functions to influence design, operations and deployment of highly available software.
  • You are interested in working hard to make the users of Splunk's products happier every day.

Preferred skills :

  • Experience monitoring cloud environments with Splunk.
  • Experience with large distributed cloud service development, infrastructure, traffic management and architecture.
  • Experience with distributed architectures / systems with optimized and scalable software that operates on a large number of nodes.