MLOps Engineer

Arrayo • Boston, MA, United States
1 day ago
Job type
  • Full-time
Job description

MLOps Engineer (Training Scalability & Workflow Optimization)

We are seeking an MLOps Engineer to lead the scaling of machine learning training pipelines and ensure the robustness and efficiency of our end-to-end ML workflows.

Overview

This role focuses on leveraging Flyte, Kubernetes (GPU optimization), Docker, and distributed training frameworks such as Ray to optimize and streamline our ML infrastructure.

Responsibilities

  • Workflow Orchestration: Develop and maintain ML workflows using Flyte to manage complex ML pipelines for training, testing, and deployment.
  • Training Scalability: Architect and scale large-scale ML training systems on GPU-backed Kubernetes clusters, including auto-scaling and performance tuning for multi-node/multi-GPU workloads.
  • Distributed Computing: Implement distributed model training pipelines using frameworks like Ray for parallelization and resource efficiency.
  • Containerization: Design, build, and optimize Docker images for ML workloads with a focus on reproducibility and security.
  • Resource Optimization: Debug and optimize GPU utilization, memory, and compute bottlenecks during training and inference phases.
  • Monitoring & Maintenance: Integrate monitoring for ML jobs, track resource consumption, and enforce cost-efficient resource utilization.
  • Collaboration: Work closely with data scientists and ML engineers to productize and scale ML experiments.

Qualifications

  • Strong proficiency with Kubernetes (GPU scheduling, Helm, cluster autoscaling).
  • Hands-on experience with Flyte or similar workflow orchestration tools (Airflow, Prefect).
  • Deep knowledge of distributed ML training (e.g., PyTorch DDP, Ray, Horovod).
  • Expertise in Docker and container lifecycle management.
  • Solid understanding of GPU hardware/software stack (CUDA, NCCL).
  • Familiarity with CI/CD for ML (MLOps pipelines using tools like GitHub Actions, ArgoCD).
  • Bonus: Familiarity with observability tools for ML systems (Prometheus, Grafana).