Job Title: Data Engineer (AWS, Python, PySpark)
Location: Richmond, VA or Dallas, TX (Hybrid)
Duration: Long-Term Contract
Rate: $50/hr on C2C
Interview Mode: Face-to-Face (Client Round)
Client Domain: Confidential
Job Description
We are seeking an experienced Data Engineer with strong expertise in AWS, Python, and PySpark to join our client's data engineering team. The ideal candidate will be responsible for building scalable data pipelines, optimizing ETL workflows, and managing data processing across distributed systems. This role requires hands-on experience with big data technologies, data modeling, and cloud-based data solutions in a hybrid work environment.
Key Responsibilities
Design, develop, and maintain scalable ETL pipelines using Python and PySpark on AWS.
Ingest, transform, and process large-scale structured and unstructured data from diverse sources.
Work with AWS services such as S3, Glue, EMR, Lambda, Redshift, and Athena.
Optimize data storage, retrieval, and query performance for analytics and reporting.
Implement and maintain data quality, validation, and governance processes.
Collaborate with data scientists, analysts, and application teams to deliver end-to-end data solutions.
Automate workflows, monitor data pipelines, and resolve data-related issues proactively.
Participate in Agile ceremonies, including sprint planning, reviews, and stand-ups.
Required Skills & Qualifications
6+ years of experience as a Data Engineer.
Strong programming skills in Python and PySpark.
Proven experience with AWS data services (Glue, S3, Lambda, EMR, Redshift).
Hands-on experience with ETL design, data pipelines, and data lake architectures.
Familiarity with SQL and relational databases (PostgreSQL, MySQL, or similar).
Experience with version control tools such as Git and with CI/CD pipelines.
Strong analytical and problem-solving skills.
Excellent communication and teamwork abilities.