Site Reliability Engineer

Swish Analytics
San Francisco, CA, United States
$110K-$147K a year
Full-time

Swish Analytics is a sports analytics, betting and fantasy startup building the next generation of predictive sports analytics data products.

We believe that oddsmaking is a challenge rooted in engineering, mathematics, and sports betting expertise, not intuition.

We're looking for team-oriented individuals with an authentic passion for accurate and predictive real-time data who can execute in a fast-paced, creative, and continually evolving environment without sacrificing technical excellence.

Our challenges are unique, so we hope you are comfortable in uncharted territory and passionate about building systems to support products across a variety of industries and enterprise clients.

About the team

The Swish Analytics DevSecOps and Infrastructure team is looking for an experienced Site Reliability Engineer who will support our enterprise infrastructure during non-US hours.

In addition to providing support, you will help optimize incident response and observability, and work with technical teams to improve overall workload resiliency.

Responsibilities

  • Support production systems and help triage issues during live sporting events
  • Monitor systems and respond to incidents to maintain SLOs / SLAs; review and follow up on production incidents
  • Write and review code, develop documentation, and debug live problems on complex distributed systems
  • Optimize and facilitate incident response, conduct root cause analysis and blameless retrospectives
  • Work closely with technical teams to implement, optimize, maintain, scale, and debug workloads on Kubernetes, using CI / CD, automation tools, and scripting languages to deliver tools and software that improve the reliability and scalability of services

Qualifications

  • 3+ years of experience in SRE-leaning DevOps or full SRE roles
  • 3+ years building CI / CD pipelines with GitHub Actions, GitLab CI / CD, or similar
  • Extensive experience with Kubernetes
  • Experience managing customer-facing systems in a 24 / 7 environment, including escalations
  • Experience with triage and escalation policies / protocols
  • Strong communication and documentation skills
  • Comfortable with scripting languages like Bash, Python, or similar

Preferred

  • Networking and routing experience
  • Terraform in AWS to support global-scale services
  • Improving observability in an engineering organization
  • Past experience with PagerDuty or similar tools

Salary: $110,000-$147,000

Swish Analytics is an Equal Opportunity Employer. All candidates who meet the qualifications will be considered without regard to race, color, religion, sex, national origin, age, disability, sexual orientation, pregnancy status, genetic information, military or veteran status, marital status, or any other characteristic protected by law.

The position responsibilities are not limited to the responsibilities outlined above and are subject to change. At the employer's discretion, this position may require successful completion of background and reference checks.

18 days ago