Job Summary:
The Senior Data Engineer will be responsible for designing, building, and maintaining robust data pipelines and architectures on AWS to support scalable data processing, storage, and analytics. The ideal candidate will possess deep expertise in AWS services, PySpark, and data modeling, with proven experience in developing data-driven solutions that support business intelligence and analytics initiatives.
Key Responsibilities:
Design, develop, and optimize data ingestion and transformation pipelines using AWS Glue, PySpark, and other AWS-native services.
Build and manage data lake and data warehouse solutions using Amazon S3 and Amazon Redshift.
Develop and maintain data models, schemas, and ETL frameworks to ensure efficient data organization and accessibility.
Collaborate with data analysts, scientists, and business teams to understand data requirements and deliver reliable solutions.
Implement data quality checks, validation frameworks, and automation to ensure data accuracy and reliability.
Integrate version control and CI/CD practices using Git and related tools.
Monitor and optimize performance, scalability, and cost efficiency of data pipelines in AWS.
Ensure compliance with data governance, security, and best practices in data handling and storage.
Required Skills and Qualifications:
Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
6+ years of experience as a Data Engineer or in a similar role.
Strong hands-on experience with AWS data services (Glue, Redshift, S3, Lambda, IAM, CloudWatch).
Proficiency in PySpark for large-scale data transformation and processing.
Deep understanding of data modeling, ETL development, and data warehousing concepts.
Proficient in SQL and performance optimization for analytical workloads.
Experience with Git for version control and collaborative development.
Strong problem-solving, debugging, and analytical skills.
Good to Have:
Experience with Terraform or CloudFormation for infrastructure automation.
Familiarity with Apache Airflow or similar orchestration tools.
Knowledge of data governance, cataloging, and metadata management practices.
Soft Skills:
Excellent communication and documentation skills.
Ability to work collaboratively in agile, cross-functional teams.
Strong ownership mindset with attention to detail and quality.
Senior Data Engineer • Pleasanton, California, USA