Talent.com
Supercompute Infrastructure Engineer

Periodic Labs, Menlo Park, CA, United States
21 hours ago
Job type
  • Full-time
Job description

About Periodic Labs

We are an AI + physical sciences lab building state-of-the-art models to make novel scientific discoveries. We are well funded and growing rapidly. Team members are owners who identify and solve problems without boundaries or bureaucracy. We eagerly learn new tools and new science to push forward our mission.

About the Role

You will lead, design, build, and operate large-scale compute clusters to power AI scientific research.

You will write software that orchestrates large GPU and CPU clusters, manages resource allocation, and automates cluster lifecycle operations. You will work on the bring-up, operation, and maintenance of all aspects of these clusters.

You will build tools and get directly involved in large-scale frontier research experiments to make Periodic Labs the world's best AI + science lab for physicists, computational materials scientists, AI researchers, and engineers.

We're looking for distributed systems engineers with experience in managing large-scale compute environments, high-performance clusters, or similar hyperscale infrastructure.

You might thrive in this role if you have experience with:

  • ≥5,000 GPU clusters
  • Cluster scheduling and orchestration tools like Kubernetes (k8s) and Slurm
  • Cloud environments such as GCP, AWS, or Azure
  • Observability and monitoring tools like DataDog, Prometheus, Grafana, or VictoriaMetrics
  • IaC tools like Terraform and Ansible
  • GitOps tools like GitHub CI and ArgoCD
