Title: Databricks Architect
Onsite schedule:
2-3 days onsite (Arlington, VA)
4250 N Fairfax Drive, 11th floor
Arlington, VA 22203
(HEAVILY PRIORITIZED)
Also open to candidates based in St. Louis
Position Overview:
We are seeking a highly skilled and strategic Lead Databricks Architect to spearhead our migration from Hadoop to Databricks, establishing scalable, repeatable Lakehouse solutions. You will lead the design, implementation, and optimization of cloud-based data platforms, enabling advanced analytics, AI capabilities, and modern data governance. This role requires deep expertise in big data architectures, hands-on experience with Databricks, and a proven track record in cloud migration projects.
Key Responsibilities
- Lead the identification and categorization of existing Hadoop workloads (ETL, batch, streaming) and data sources for migration to Databricks.
- Design and implement scalable, repeatable migration use cases, focusing on MVP (Minimum Viable Product) approaches to accelerate value delivery.
- Provision and architect Databricks environments, including sandbox workspaces with Lakehouse architecture and federation capabilities.
- Enable seamless connectivity to external data sources (e.g., Hive) and oversee pilot migrations using tools such as Databricks Migration Accelerator or third-party partner solutions.
- Validate migrated workloads for performance, cost efficiency, and data integrity, leveraging features like Z-ordering, Liquid Clustering, Lakehouse AI monitoring, and Serverless warehouse capabilities.
- Monitor query performance, storage efficiency, and pipeline health using advanced Databricks features and best practices.
- Collaborate cross-functionally with data engineering, analytics, and governance teams to validate outcomes and incorporate feedback.
- Document learnings, blockers, and feature gaps to inform broader rollout and continuous improvement efforts.
- Define and track success metrics such as migration time, query latency, cost savings, and feature adoption.
- Develop a phased roadmap for full-scale migration, advanced feature adoption, and future platform optimizations.
Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or related field.
- 10+ years of experience in data architecture, with significant hands-on experience in Hadoop and Databricks environments.
- Proven expertise in cloud data platforms (Azure), data engineering, and ETL processes.
- Strong understanding of Lakehouse architecture, data federation, and modern data governance frameworks (e.g., Unity Catalog).
- Experience leading large-scale migration projects, including MVP definition and iterative delivery.
- Advanced proficiency with Databricks features such as Delta Lake, Liquid Clustering, AI monitoring, and serverless compute.
- Excellent communication, leadership, and stakeholder management skills.
- Ability to mentor and guide cross-functional teams in adopting best practices and innovative data solutions.
Preferred Skills
- Experience with Databricks Migration Accelerator or similar migration tools.
- Hands-on expertise in testing advanced features (dynamic clustering, Lakehouse Federation, Unity Catalog).
- Knowledge of data security, access controls, and compliance in cloud environments.
- Experience generating synthetic data and ensuring data governance in migration scenarios.