ML Infrastructure Engineer

Phizenix • Menlo Park, CA, United States
30+ days ago
Job type
  • Full-time
  • Permanent
Job description

ML Infrastructure Engineer

Menlo Park, CA | On-Site | Full-Time / Direct Hire

Looking for ML infrastructure experts (Bay Area preferred) with deep experience in CUDA, GPU optimization, vLLM, and LLM inference. Pure language focus; no vision or audio.

Client Opportunity | Through Phizenix

Phizenix, a certified minority and women-led recruiting firm, is hiring on behalf of an AI startup pioneering diffusion-based large language models, built for faster generation, multimodal integration, and scalable enterprise deployment.

We're looking for an ML Infrastructure Engineer to help build the infrastructure that powers large-scale model training and real-time inference. You'll collaborate with world-class researchers and engineers to design high-performance, distributed systems that bring advanced LLMs into production.

Responsibilities

  • Design and manage distributed infrastructure for ML training at scale
  • Optimize model serving systems for low-latency inference
  • Build automated pipelines for data processing, model training, and deployment
  • Implement observability tools to monitor performance in production
  • Maximize resource utilization across GPU clusters and cloud environments
  • Translate research requirements into robust, scalable system designs

Must-Haves

  • Master's or PhD in Computer Science, Engineering, or a related field (or equivalent experience)
  • Strong foundation in software engineering, systems design, and distributed systems
  • Experience with cloud platforms (AWS, GCP, or Azure)
  • Proficient in Python and at least one systems-level language (C++, Rust, or Go)
  • Hands-on experience with Docker, Kubernetes, and CI/CD workflows
  • Familiarity with ML frameworks like PyTorch or TensorFlow from a systems perspective
  • Understanding of GPU programming and high-performance infrastructure

Nice-to-Haves

  • Experience with large-scale ML training clusters and GPU orchestration
  • Knowledge of LLM-serving tools (vLLM, TensorRT, ONNX Runtime)
  • Experience with distributed training strategies (e.g., data, model, or pipeline parallelism)
  • Familiarity with orchestration tools like Kubeflow or Airflow
  • Background in performance tuning, system profiling, and MLOps best practices

At Phizenix, we're committed to supporting diverse and inclusive teams. This is your chance to shape the systems that power the next generation of AI innovation. Let's build the future, together.

California Pay Range

$180,000-$200,000 USD
