Job Description
- Are you passionate about using data and AI to drive environmental innovation at global scale?
- We’re looking for a Machine Learning Data Engineer to join the client's AI for Sustainability team, part of Worldwide Sustainability Science & Innovation (SSI).
- In this role, you’ll build and optimize the data pipelines that power next-generation AI models tackling some of the world’s most pressing sustainability challenges in human rights and environmental due diligence across our supplier base.
Story Behind the Need – Business Group & Key Projects
Team culture :
You’ll collaborate closely with Applied Scientists, Machine Learning Engineers, and domain experts to design end-to-end data systems that turn raw, complex datasets into AI-ready workloads. This is a highly visible, hands-on opportunity to shape how the client applies AI to advance sustainability at scale.
Typical Day in the Role :
Daily responsibilities :
- Design and maintain scalable, automated data pipelines for diverse data types (text, imagery, logs, geospatial, sensor data, and more).
- Transform, clean, and enrich data for machine learning and AI model training.
- Collaborate with cross-functional teams to understand data requirements and align them with business and scientific goals.
- Improve data quality, observability, and lineage tracking across sustainability data assets.
- Optimize performance and scalability using AWS-native technologies such as S3, Glue, Lambda, Redshift, and Athena.
- Support ML workflows by enabling reproducible data ingestion, transformation, and feature engineering.
Requirements
- 3+ years of professional experience as a Data Engineer or Machine Learning Data Engineer.
- Familiarity with, or a high-level understanding of, common ML / AI techniques.
- Proficiency in Python and SQL for data manipulation and automation.
- Experience with data pipeline frameworks (e.g., Apache Airflow, Spark, or AWS Glue), PySpark, Pandas, and data modeling for ML workflows.
- Knowledge of data quality, observability, and governance frameworks.
- Strong working knowledge of AWS services or those of other cloud service providers for data processing and analytics.
Preferred Qualifications
- Experience preparing datasets for machine learning or AI models (e.g., feature stores, labeling, or training data curation).
- Familiarity with geospatial, environmental, or sustainability datasets.
- Demonstrated passion for AI, sustainability, or environmental impact work.
Additional Information
Contract Duration : 8 months, with possible extension or conversion to blue badge depending on performance.
Location : Onsite in Seattle, WA (5 days per week) is highly preferred. A remote option is possible if the candidate is an exceptional fit.
Top 3 must-have hard skills
- High-level understanding of common ML / AI techniques.
- Proficiency in Python and SQL for data manipulation and automation.
- Experience with data pipeline frameworks (e.g., Apache Airflow, Spark, or AWS Glue), PySpark, Pandas, and data modeling for ML workflows.