Job Description
Role: AWS Data Engineer with SageMaker
Location: Reston, VA (Day 1 onsite, 5 days a week)
We are seeking a motivated Senior Data Engineer with strong AWS and data platform experience to design, build, and operationalize scalable data processing and ML-ready pipelines. The ideal candidate will be hands-on with PySpark, Redshift, and Glue, and with automation using CI/CD and scripting.
Key Responsibilities
- Design and implement scalable ETL/ELT pipelines on AWS for batch and near-real-time workloads.
- Build and optimize data processing jobs using PySpark on EMR and Glue.
- Develop and manage Redshift schemas and queries, and use Redshift Spectrum for external table access.
- Integrate machine learning workflows with SageMaker and Lambda-driven orchestration.
- Automate deployments and testing using CI/CD tools and source control (Jenkins, UCD, Bitbucket, GitHub).
- Create and maintain operational scripts and tooling (Shell, Python) for monitoring, troubleshooting, and performance tuning.
Must-have Skills
- AWS services (EMR, SageMaker, Lambda, Redshift, Glue, SNS, SQS)
- PySpark and data processing frameworks
- Shell scripting and Python development
- CI/CD tooling experience (Jenkins, UCD)
- Source control experience with Bitbucket and GitHub
- Experience building and maintaining scripts/tools for automation
Nice-to-have
- Familiarity with AWS ECS
- Experience with Aurora PostgreSQL
- Java for tooling or pipeline components