Talent.com
Principal Staff Engineer – AI Infrastructure - AI/ML Leader

Andiamo
Boston, MA, United States
7 days ago
Job type
  • Permanent
Job description

Overview

We are seeking a Principal Staff Engineer to lead the architecture and development of our next-generation AI infrastructure. This role sits at the intersection of large-scale distributed systems and cutting-edge machine learning, powering the platforms that enable researchers and engineers to build, train, and deploy AI models at global scale. As a senior technical leader, you will define architectural strategy, influence cross-organizational initiatives, and guide the design of highly reliable, efficient, and scalable systems. You’ll balance deep technical execution with strategic vision: mentoring senior engineers, collaborating with AI researchers, and ensuring our infrastructure accelerates innovation while maintaining world-class reliability.

What You’ll Do

  • Design & Scale AI Infrastructure: Architect and build distributed training, inference, and data pipelines that support large-scale AI workloads across GPUs and heterogeneous environments.
  • Lead Cloud-Native Innovation: Drive adoption of Kubernetes, Docker, and modern orchestration frameworks to optimize model deployment, resource allocation, and cluster utilization.
  • Optimize Performance at Scale: Develop high-throughput, low-latency services and memory-efficient systems to support petabyte-scale data and massive model sizes.
  • Advance Observability & Reliability: Implement monitoring, tracing, and fault-tolerance strategies to ensure resilient AI systems in production.
  • Collaborate with Research & Product: Partner with ML scientists, product engineers, and platform teams to design infrastructure that accelerates experimentation and model iteration.
  • Mentor & Inspire: Support the technical growth of senior engineers, fostering a culture of excellence, innovation, and ownership.
  • Shape Technical Strategy: Define long-term roadmaps for AI infrastructure, balancing near-term delivery with foundational investments in scalability, efficiency, and reliability.

What We’re Looking For

  • Extensive Experience: 10+ years in distributed systems, large-scale infrastructure, or platform engineering, with experience supporting AI/ML workloads strongly preferred.
  • Programming Mastery: Deep expertise in Java, Python, or C++, with proven ability to build performant and reliable systems.
  • AI/ML Infrastructure Knowledge: Familiarity with ML frameworks (TensorFlow, PyTorch, JAX), distributed training strategies, GPU scheduling, and data pipeline optimization.
  • Modern Infrastructure Skills: Hands-on experience with Kubernetes, Docker, CI/CD pipelines, cloud platforms (AWS/GCP/Azure), and observability tools (Prometheus, Grafana, Datadog).
  • Systems Design Expertise: Strong foundation in algorithms, concurrency, and systems architecture for high-scale, fault-tolerant environments.
  • Leadership & Influence: Demonstrated success driving cross-functional initiatives, mentoring senior engineers, and setting engineering-wide standards.
  • Product Mindset: Ability to balance technical rigor with usability and speed, ensuring infrastructure empowers rapid iteration and impactful outcomes.

About Andiamo

Andiamo is a globally recognized staffing and consulting firm specializing in placing the top 2% of technology and go-to-market professionals with the world’s largest and most well-known companies. For over 20 years, we've maintained the status of tier-one vendor for firms such as Amazon, Bloomberg, Palantir, MasterCard, Visa, Two Sigma, Citadel, as well as other major financial services firms, elite hedge funds, Google-backed tech start-ups, and major software firms. Our talent solutions include Permanent Placement, Contract Staffing, Executive Search, and Dedicated Recruiting Services (RPO). Find out more at www.andiamogo.com
