Talent.com
ML Infrastructure Engineer
Phizenix • Menlo Park, CA, United States
30+ days ago
Job type
  • Full-time
  • Permanent
Job description

ML Infrastructure Engineer

Menlo Park, CA | On-Site | Full-Time / Direct Hire

Looking for ML Infra experts (Bay Area preferred) with deep experience in CUDA, GPU optimization, vLLM, and LLM inference. Pure language focus; no vision / audio.

Client Opportunity | Through Phizenix

Phizenix, a certified minority and women-led recruiting firm, is hiring on behalf of an AI startup pioneering diffusion-based large language models, built for faster generation, multimodal integration, and scalable enterprise deployment.

We're looking for an ML Infrastructure Engineer to help build the infrastructure that powers large-scale model training and real-time inference. You'll collaborate with world-class researchers and engineers to design high-performance, distributed systems that bring advanced LLMs into production.

Responsibilities

  • Design and manage distributed infrastructure for ML training at scale
  • Optimize model serving systems for low-latency inference
  • Build automated pipelines for data processing, model training, and deployment
  • Implement observability tools to monitor performance in production
  • Maximize resource utilization across GPU clusters and cloud environments
  • Translate research requirements into robust, scalable system designs

Must-Haves

  • Master's or PhD in Computer Science, Engineering, or a related field (or equivalent experience)
  • Strong foundation in software engineering, systems design, and distributed systems
  • Experience with cloud platforms (AWS, GCP, or Azure)
  • Proficient in Python and at least one systems-level language (C++ / Rust / Go)
  • Hands-on experience with Docker, Kubernetes, and CI / CD workflows
  • Familiarity with ML frameworks like PyTorch or TensorFlow from a systems perspective
  • Understanding of GPU programming and high-performance infrastructure

Nice-to-Haves

  • Experience with large-scale ML training clusters and GPU orchestration
  • Knowledge of LLM-serving tools (vLLM, TensorRT, ONNX Runtime)
  • Experience with distributed training strategies (e.g., data / model / pipeline parallelism)
  • Familiarity with orchestration tools like Kubeflow or Airflow
  • Background in performance tuning, system profiling, and MLOps best practices

At Phizenix, we're committed to supporting diverse and inclusive teams. This is your chance to shape the systems that power the next generation of AI innovation. Let's build the future, together.

California Pay Range

$180,000-$200,000 USD
