Job Description
Job Title: Lead Data Engineer – Databricks
Job Summary
We are seeking a Lead Data Engineer with deep expertise in Databricks to architect, build, and lead scalable data engineering solutions on cloud-based lakehouse platforms. The role combines hands-on technical leadership with solution design, mentoring, and close collaboration with architecture, BI, and AI teams.
Key Responsibilities
Technical Leadership & Architecture
Lead the design and implementation of Databricks Lakehouse architectures
Define the medallion architecture (Bronze, Silver, and Gold layers) using Delta Lake
Drive architectural decisions for batch and streaming data pipelines
Establish coding standards, best practices, and reusable frameworks
Data Engineering & Databricks
Design and build scalable ETL/ELT pipelines using Databricks (PySpark/SQL/Scala)
Optimize Spark jobs for performance, reliability, and cost
Implement Delta Lake features (ACID transactions, time travel, schema enforcement)
Develop and manage Databricks workflows, jobs, and clusters
Cloud & Platform Integration
Architect Databricks solutions on Azure (preferred) or AWS
Integrate Databricks with cloud storage and data services (Azure: ADLS, ADF, Synapse; AWS: S3, Glue, Redshift)
Enable BI and analytics consumption (Power BI, Tableau)
Governance, Security & DevOps
Implement data governance using Unity Catalog
Define RBAC, data access controls, and security best practices
Enable CI/CD for Databricks using GitHub / Azure DevOps
Use Infrastructure-as-Code (Terraform) for environment management
Leadership & Collaboration
Lead, mentor, and grow data engineering teams
Conduct design and code reviews
Collaborate with Data Architects, Product Owners, and stakeholders
Support production releases, monitoring, and incident resolution
Required Skills
Databricks & Big Data
Expert-level Databricks experience (Azure or AWS)
Strong Spark / PySpark / Spark SQL expertise
Delta Lake and Lakehouse architecture
Streaming experience with Spark Structured Streaming
Cloud & Data Platforms
Strong experience with Azure or AWS cloud platforms
Data orchestration tools (ADF, Airflow, or similar)
Strong SQL and data modeling skills
DevOps & Automation
Git-based version control
CI/CD pipelines for data engineering workloads
Terraform or similar IaC tools
Preferred Qualifications
Experience with MLflow and MLOps workflows
Exposure to Microsoft Fabric or Snowflake
Databricks certifications (Data Engineer Professional / Architect)
Experience working in Agile environments
Education
Quick Fit Indicators
Leads Databricks lakehouse implementations
Strong Spark optimization and governance expertise
Mentors and scales engineering teams
Owns delivery, quality, and platform reliability