Talent.com
DevOps Site Reliability Engineer
Rogue Fitness • Columbus, Ohio
8 days ago
Job type
  • Full-time
Job description

Overview

As a DevOps Site Reliability Engineer (SRE), you will be responsible for designing, implementing, and maintaining our application infrastructure to ensure that these systems are highly available, scalable, and reliable.

You will work closely with our development and operations teams to implement automation, monitor performance, and identify and resolve issues before they affect our users and customers. In this role, you will directly support cutting-edge software solutions for Rogue Fitness, from our retail website to the systems behind our manufacturing and warehousing operations.

Rogue Fitness is the leading manufacturer of strength & conditioning equipment and the official supplier to the CrossFit Games, USA Weightlifting, the World’s Strongest Man, and the Arnold Classic.

The DevOps Site Reliability Engineer is a fully onsite role in Columbus, Ohio. Remote work is not available.

Applicants must be authorized to work in the United States for any employer.

Responsibilities

Design, implement, and maintain our infrastructure and applications to ensure they are highly available, scalable, and reliable

Collaborate with our development and operations teams to implement automation, monitor performance, and identify and resolve issues before they affect our customers

Implement best practices for application deployment, configuration, management, and security

Plan and coordinate deployment processes for infrastructure upgrades with minimum downtime

Monitor and analyze system performance metrics to identify and address issues

Develop and maintain infrastructure as code using tools like Terraform

Troubleshoot issues, determine their root cause, and conduct post-mortem analysis

Implement and maintain CI/CD pipelines for our applications

Support disaster recovery and business continuity planning

Provide coverage to respond to production issues and incidents

Qualifications

Bachelor's degree in Computer Science, Information Systems, Computer Engineering, or a related area

5+ years of experience in a DevOps and/or SRE role

Expert-level knowledge of containerization and orchestration tools like Docker, Kubernetes, and Helm

Prior experience with automation tools like Azure DevOps or Jenkins

Experience with GCP and Azure is required

Experience with monitoring tools like Prometheus, Grafana, Application Insights, and GCP Cloud Monitoring

Scripting proficiency in Bash, PowerShell, and other scripting languages

Demonstrated ability to apply programming skills for automation tools and processes.

Knowledge of Git, Bitbucket, DevOps, and other source/version control platforms

Strong networking knowledge including firewalls, load balancing, and reverse proxy products

Cloudflare configuration and zero trust implementations are a plus.

Strong and proactive communication skills are required along with a team-oriented mindset

By applying to Rogue, regardless of the platform you choose to use, you are agreeing to Rogue's preferred methods of communication (i.e. text message). Submitting an application, through whatever online forum is ultimately used, constitutes a knowing and voluntary agreement to send and receive text messages during the recruitment process.
