Talent.com
Production Engineer, Storage
No longer accepting applications
Crusoe • San Francisco, CA, US
30+ days ago
Job type
  • Full-time
Job description

Crusoe is on a mission to accelerate the abundance of energy and intelligence. As the only vertically integrated AI infrastructure company built from the ground up, we own and operate each layer of the stack — from electrons to tokens — to power the world's most ambitious AI workloads. When you join Crusoe, you join a team that is building the future, faster.

We're in the midst of the greatest industrial revolution of our time. The demand for AI compute is boundless, and power is a bottleneck. We're solving that — with an energy-first approach that makes AI infrastructure better for the world and faster for the people innovating with AI.

We're looking for problem-solving, opportunity-finding teammates with a sense of urgency, who believe in the scale of our ambition and thrive on a path not fully paved — people who want to grow their careers alongside a team of experts across energy, manufacturing, data center construction, and cloud services.

If you want to do the most meaningful work of your career, help our customers and partners advance their AI strategies, and be part of a high-performing team that believes in each other, come build with us at Crusoe.

About This Role:

At Crusoe Energy Systems, our Site Reliability Engineering (SRE) team plays a mission-critical role in maintaining the performance and reliability of our AI-optimized cloud infrastructure. The Storage-focused SRE role is responsible for ensuring the availability, performance, and scalability of Crusoe’s cloud storage products and services, which power compute-intensive, latency-sensitive workloads for AI and HPC use cases. This role directly supports our vertically integrated, sustainable cloud platform by building and optimizing distributed, fault-tolerant storage systems at scale.

What You'll Be Working On:

In this role, you will build automation and self-healing tools to monitor and maintain Crusoe’s distributed cloud storage infrastructure, which includes block, file, and object storage systems. You will drive reliability initiatives focused on data replication, encryption, backup and restore strategies, and robust failover mechanisms. Collaborating closely with storage engineers, you will help implement and maintain high-performance NVMe- and SSD-backed volumes that support large-scale AI compute clusters. Your responsibilities will also include supporting user-facing storage services with a focus on availability, performance tuning, and adherence to error budgets. You’ll investigate and resolve storage-related incidents using deep telemetry, logs, and performance profiling, while also partnering with hardware and kernel teams to diagnose low-level I/O issues and optimize I/O paths, cache policies, and file systems. Additionally, you will contribute to the architecture of fault-tolerant, scalable storage backends tailored for AI-first cloud environments.


What You’ll Bring to the Team:

  • 5+ years of professional experience in SRE, systems, or storage engineering.

  • Hands-on experience with distributed storage systems (e.g., Ceph, GlusterFS, OpenEBS) and deep understanding of object, block, and file storage paradigms.

  • Proficiency in a programming language such as Python, Go, Java, or C.

  • Experience with Infrastructure as Code and deployment tooling such as Terraform, Ansible, or Puppet.

  • Deep knowledge of Linux internals with a focus on I/O subsystems, memory management, and storage scheduling.

  • Familiarity with storage protocols like NFS, SMB, iSCSI, or NVMe-oF.

  • Strong experience working with containerized workloads and orchestration platforms (e.g., Kubernetes, Docker).

  • Excellent incident response, troubleshooting, and documentation practices.

  • Experience building and operating managed services at scale, such as object, file, and block storage (AWS, GCP, Azure).

  • Excellent communication skills

  • Must be able to pass a background check

  • Embody the Company values

Bonus Points:

  • Contributions to open-source storage projects or the Linux storage stack.

  • Experience with hybrid storage models across on-prem and cloud environments.

  • Familiarity with high-throughput network topologies for storage backplanes (e.g., RoCE, RDMA, InfiniBand).

Benefits:

  • Industry competitive pay

  • Restricted Stock Units in a fast-growing, well-funded technology company

  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents

  • Employer contributions to HSA accounts

  • Paid Parental Leave

  • Paid life insurance, short-term and long-term disability

  • Teladoc

  • 401(k) with a 100% match up to 4% of salary

  • Generous paid time off and holiday schedule

  • Cell phone reimbursement

  • Tuition reimbursement

  • Subscription to the Calm app

  • MetLife Legal

  • Company-paid commuter benefit: $300 per month

Compensation:

Compensation will be paid in the range of $166,000 - $201,000 a year + Bonus. Restricted Stock Units are included in all offers. Compensation to be determined by the applicant’s education, experience, knowledge, skills, and abilities, as well as internal equity and alignment with market data.

Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.
