Staff Software Engineer, Slurm

Crusoe Energy Systems LLC
San Francisco, CA, United States
3 days ago
Job type
  • Full-time
Job description

Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI — without sacrificing scale, speed, or sustainability.

Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure.

About the Role:

We are actively seeking an exceptional Staff Software Engineer to join our cloud software team, focusing specifically on building and operating Slurm as a fully managed cloud service within Crusoe Cloud. This role is crucial for delivering next-generation orchestration capabilities to power GPU-accelerated and high-performance computing (HPC) at scale.

Your expertise will be instrumental in designing and scaling our carbon-reducing operating model, and advancing our AI training clusters to lead the industry in reliability and performance. You will shape the technical direction of systems that allow customers to run advanced workloads across CPUs, NVIDIA and AMD GPUs, and high-performance networking environments.

You will be involved in writing and reviewing code, contributing to proposals, and drafting architecture documents. You will evaluate tools and frameworks, considering their impact on reliability, scalability, operational costs, and ease of adoption.

What You'll Be Working On:

Lead the development and engineering of our managed Slurm offering, providing a seamless experience for AI/ML and HPC customers who rely on robust Slurm job scheduling.

Contribute to the development of scalable and robust software solutions, closely aligning with the strategic objectives outlined in the Crusoe Cloud roadmap.

Design, build, and maintain Kubernetes operators and controllers dedicated to managing the lifecycle, configuration, and state of large-scale Slurm clusters.

Drive the integration of GPU acceleration in the Slurm environment, including device plugin architecture, GPU operators, accelerator-aware scheduling, and resource allocation.

Ensure that high-performance networking technologies, such as InfiniBand and RoCE, are correctly leveraged for distributed GPU workloads running through Slurm.

Implement and manage features such as multi-tenancy, cluster lifecycle management, auto-scaling, and high availability for the managed Slurm control plane services.

Develop scalable systems to compete with leading managed services.

Support the development of your peers by sharing knowledge and providing guidance in technical discussions.

What You'll Bring to the Team:

You have 7+ years of experience in software engineering, with strong experience in systems engineering. Experience in distributed systems, cloud, or HPC environments is a must.

You possess 2+ years of programming experience in Go. Strong proficiency in other languages used for systems work and HPC tooling (Rust, C++, Python) is also beneficial.

You have extensive experience with Kubernetes and with Linux engineering and debugging.

You possess deep knowledge of Slurm (Simple Linux Utility for Resource Management) administration and the architecture required for managing compute jobs in high-performance environments.

You are skilled in infrastructure as code and familiar with systems-level challenges, ideally with experience utilizing Terraform.

You understand Argo, CI/CD, and automated testing pipelines. You can design and take ownership of system architecture, including CI/CD pipelines, while ensuring adherence to security standards.

You have strong knowledge of container networking (CNI plugins, service meshes) and Linux networking fundamentals.

You are familiar with GPU integration in Kubernetes, including device plugins and GPU operators.

You have excellent communication skills, both verbal and written.

Compensation Range

Compensation will be paid in the range of $185,000 - $224,000. Restricted Stock Units are included in all offers. Compensation will be determined by the applicant's knowledge, education, and abilities, as well as internal equity and alignment with market data.

Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.

