Staff Cloud DevOps/Site Reliability Engineer

Inworld AI
Mountain View, CA, United States
$180K-$280K a year
Full-time

Why Join Inworld

Inworld is the best-funded startup in AI and games with a $500 million valuation and backing from top-tier investors like Intel, Microsoft, Lightspeed, Bitkraft, Founders Fund, Kleiner Perkins, and more.

Inworld was recognized by CB Insights as one of the ten most promising AI companies in the US, ranking in the top two in both the Early Stage and Vertical AI categories among all companies worldwide in 2024.

Inworld is the leading AI engine for games, enabling developers to build groundbreaking game mechanics, dynamic NPCs and worlds that evolve with each action.

Inworld powers experiences built by Ubisoft, NVIDIA, Niantic, NetEase Games and LG, among others, and has partnerships with key industry players such as Microsoft / Xbox, Epic Games and Unity.

Our Technical Operations team manages the infrastructure, DevOps, and Site Reliability of our platform. We are looking for a Staff Cloud DevOps / Site Reliability Engineer to join our team.

Qualifications

  • Bachelor's degree in Computer Science, Engineering, or a related field
  • 7+ years of experience as a DevOps, Infrastructure, Operations, or Site Reliability Engineer (or as a software engineer with relevant experience)

At least 2 years of experience with:

  • Terraform
  • Helm
  • Kubernetes
  • AWS, Azure, or GCP
  • CI/CD using modern tools (GitOps)

Nice-to-Have:

  • MLOps (building, orchestrating, and maintaining Machine Learning Pipelines)
  • Prometheus / Grafana
  • Multi-cloud deployments (2 or more)
  • ArgoCD
  • Network management and VPNs

Responsibilities

  • Infrastructure: Maintain and contribute to Infrastructure-as-Code (Terraform)
  • DevOps and CI/CD Pipelines: Orchestrate pipelines using GitHub Actions, Helm, and ArgoCD
  • Microservices scalability: Kubernetes administration
  • Cloud administration
  • Site Reliability: Measure and monitor availability, latency, and overall service health; drive incident management and post-mortem analysis

In-office location: Mountain View, CA, United States.

Remote location: United States.

The US base salary range for this full-time position is $180,000 - $280,000. In addition to base pay, total compensation includes equity and benefits.

Within the range, individual pay is determined by work location, level, and additional factors, including competencies, experience, and business needs.

The base pay range is subject to change and may be modified in the future.
