Job Description:
MANDATORY SKILLS:
- Azure Data Factory (ADF)
- Databricks (PySpark, Unity Catalog)
- Python & SQL (advanced)
- Apache Spark (data processing)
- Delta Lake (ACID, schema evolution, partitioning)
- Medallion Architecture (Bronze, Silver, Gold)
- Azure Services (ADLS Gen2, Azure SQL Database, Blob Storage, Key Vault, Event Hubs)
- Data Pipeline Engineering (batch & real-time)
- Data Quality & SLA monitoring
- CI/CD (Azure DevOps / GitHub Actions)
- Git (version control)
JOB DESCRIPTION:
We are seeking a Data Engineer to build and maintain scalable data pipelines and lakehouse infrastructure on Azure for enterprise analytics.
Key Responsibilities:
- Design and build ETL/ELT pipelines using Azure Data Factory and Databricks (PySpark)
- Implement Medallion Architecture (Bronze, Silver, Gold)
- Optimize big data pipelines for performance, scalability, and reliability
- Ensure SLA-based data quality and pipeline consistency
- Perform root cause analysis and improve pipeline reliability
- Build and manage incremental data loads (CDC, watermarking)
- Implement monitoring and logging using Azure tools
- Collaborate with BI teams to support Power BI dashboards
- Implement CI/CD pipelines and DevOps practices
- Manage multiple pipeline projects and communicate with stakeholders
Experience:
- 4 to 8 years of Data Engineering experience
- Strong experience with Azure ecosystem and big data platforms
Domain Expertise:
- Enterprise data platforms and analytics
- Data warehousing and lakehouse architecture
Key Skills:
- Strong problem-solving and analytical ability
- Experience working with large-scale distributed data systems
- Ability to manage multiple priorities in Agile environments
PREFERRED QUALIFICATIONS (TOOLS & ADVANCED CAPABILITIES):
- Microsoft Fabric (Lakehouse, OneLake, Fabric Pipelines)
- Streaming Technologies (Azure Event Hubs, Spark Structured Streaming)
- Advanced Data Modeling (Star Schema, SCD Type 1/2/3)
- Data Observability & Monitoring tools
- Executive reporting support via Power BI semantic models