Job Title : MLOps Lead / AI/ML-Ops Engineer / AI/ML Engineer
Location : Dallas, TX
Duration : 6 Months
Skills :
- 10+ years of overall experience, including 4+ years in MLOps, Machine Learning Engineering, or a related DevOps role with a focus on ML workflows.
- Extensive hands-on experience in designing and implementing MLOps solutions on AWS. Proficient with core services like SageMaker, S3, ECS, EKS, Lambda, SQS, SNS, and IAM.
- Strong coding proficiency in Python. Extensive experience with automation tools, including Terraform for IaC and GitHub Actions.
- A solid understanding of MLOps and DevOps principles. Hands-on experience with MLOps frameworks like SageMaker Pipelines, Model Registry, Weights & Biases, MLflow, or Kubeflow, and orchestration tools like Airflow or Argo Workflows.
- Expertise in developing and deploying containerized applications using Docker and orchestrating them with ECS and EKS.
- Experience with model testing, validation, and performance monitoring. A good understanding of ML frameworks like PyTorch or TensorFlow is required to collaborate effectively with data scientists.
- Excellent communication and documentation skills, with a proven ability to collaborate with cross-functional teams (data scientists, data engineers, and architects).
Job Description :
- Build & Automate ML Pipelines : Design, implement, and maintain CI/CD pipelines for machine learning models, ensuring automated data ingestion, model training, testing, versioning, and deployment.
- Operationalize Models : Collaborate closely with data scientists to containerize, optimize, and deploy their models to production, focusing on reproducibility, scalability, and performance.
- Infrastructure Management : Design and manage the underlying cloud infrastructure (AWS) that powers our MLOps platform, leveraging Infrastructure-as-Code (IaC) tools to ensure consistency and cost optimization.
- Monitoring & Observability : Implement comprehensive monitoring, alerting, and logging solutions to track model performance, data integrity, and pipeline health in real time. Proactively address issues like model or data drift.
- Governance & Security : Establish and enforce best practices for model and data versioning, auditability, security, and access control across the entire machine learning lifecycle.
- Tooling & Frameworks : Develop and maintain reusable tools and frameworks to accelerate the ML development process and empower data science teams.
- Cloud Expertise : Extensive hands-on experience in designing and implementing MLOps solutions on AWS. Proficient with core services like SageMaker, S3, ECS, EKS, Lambda, SQS, SNS, and IAM.
- Coding & Automation : Strong coding proficiency in Python. Extensive experience with automation tools, including Terraform for IaC and GitHub Actions.
- MLOps & DevOps : A solid understanding of MLOps and DevOps principles. Hands-on experience with MLOps frameworks like SageMaker Pipelines, Model Registry, Weights & Biases, MLflow, or Kubeflow, and orchestration tools like Airflow or Argo Workflows.
- Containerization : Expertise in developing and deploying containerized applications using Docker and orchestrating them with ECS and EKS.
- Model Lifecycle : Experience with model testing, validation, and performance monitoring. A good understanding of ML frameworks like PyTorch or TensorFlow is required to collaborate effectively with data scientists.
- Communication : Excellent communication and documentation skills, with a proven ability to collaborate with cross-functional teams (data scientists, data engineers, and architects).
Keywords : MLOps, DevOps, SageMaker, S3, ECS, EKS, Lambda, SQS, workflow, CI/CD, Docker, Kubernetes, PyTorch, TensorFlow