Job Description
The Data Engineer will be responsible for designing, developing, and maintaining robust and scalable data pipelines and data solutions. This role requires strong expertise in cloud-based data platforms, particularly AWS services like Glue and Lambda, combined with proficiency in Python, PySpark, and Snowflake for data warehousing. The ideal candidate will ensure efficient data ingestion, transformation, and availability for analytics and reporting, contributing to data-driven decision-making.
Key Responsibilities:
- Design, build, and maintain scalable and efficient ETL/ELT pipelines using AWS Glue, AWS Lambda, and PySpark for data ingestion, transformation, and loading into Snowflake.
- Develop and optimize data models within Snowflake, ensuring high performance and adherence to best practices for data warehousing.
- Write, test, and deploy production-grade Python and PySpark code for data processing and manipulation.
- Implement and manage data orchestration and scheduling using AWS services or other relevant tools.
- Monitor data pipeline performance, troubleshoot issues, and implement optimizations for improved efficiency and reliability.
- Ensure data quality, integrity, and security across all data solutions, adhering to compliance standards.
Required Skills & Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 5+ years of hands-on experience in data engineering or a similar role.
- Strong proficiency in Python and PySpark for data manipulation and processing.
- Extensive experience with AWS services, specifically AWS Glue and AWS Lambda, for building and managing data pipelines.
- Solid understanding of SQL and experience with relational and NoSQL databases.
- Familiarity with version control systems (e.g., Git).
Good to Have:
- Experience with other AWS services such as S3, Redshift, Athena, Step Functions, Kinesis, and CloudWatch.
- Experience with IBM DataStage or CP4D.
- Knowledge of CI/CD pipelines and infrastructure-as-code tools (e.g., Terraform, CloudFormation).
- Experience with data governance and security best practices.
- Familiarity with real-time data processing concepts.