Job Title / Role
Data Engineer
Mandatory Skills
Key Skills & Technologies
Programming Languages: Python (primary), SQL
Cloud Platforms: AWS (S3, Glue, Lambda, Redshift, EC2, EMR)
Data Tools: Apache Spark, Pandas, PySpark, Airflow
Databases: PostgreSQL, MySQL, NoSQL (e.g., DynamoDB)
ETL & Workflow Orchestration: AWS Glue, Apache Airflow
Version Control: Git
DevOps & CI/CD: Basic understanding of CI/CD pipelines and infrastructure as code (e.g., Terraform, CloudFormation)
Job Description
Data Pipeline Development - Design, build, and maintain scalable and reliable data pipelines to ingest, process, and transform data from various sources.
Data Integration & Management - Integrate structured and unstructured data from internal and external systems.
Ensure data quality, consistency, and availability across platforms.
Cloud-Based Data Engineering - Leverage AWS services (e.g., S3, Lambda, Glue, Redshift, EMR) to build cloud-native data solutions.
Optimize cloud resources for performance and cost-efficiency.
Programming & Automation - Use Python for data manipulation, ETL workflows, and automation of data tasks.
Develop reusable scripts and modules for data processing.
Collaboration & Stakeholder Engagement - Work closely with data scientists, analysts, and business teams to understand data needs.
Translate business requirements into technical solutions.
Monitoring & Optimization - Monitor data pipelines and troubleshoot issues proactively.
Continuously improve performance, scalability, and reliability of data systems.
Data Engineer • Princeton, NJ, United States