Position: Databricks Engineer
Location: O'Fallon, MO / Seattle, WA (Onsite)
Client: TCS
Duration: Full-Time Position
Job Description:
Experience Required: 8+ Years
Must-Have Technical / Functional Skills:
- Long tenure in Computer Science, Engineering, or Software Engineering.
- Bachelor's degree in Computer Science or a related technical field (including programming), with 15+ years of demonstrated experience in software development and delivery.
- Deep hands-on experience with Databricks and Python.
- Spark and SQL expertise: writing performant Spark SQL and PySpark transformations; optimizing joins, window functions, and aggregations; handling large-scale data with partitioning and caching strategies (see the PySpark sketch after this list).
- Experience with real-time data processing and streaming pipelines.
- Proficient in both Delta Live Tables (DLT) and Workflows; understanding of Lakeflow Declarative Pipelines is an asset (a minimal DLT sketch also follows this list).
- DevOps and CI/CD: using Databricks Asset Bundles (DABs) for deployment; integrating with Git, Bitbucket, and Jenkins for version control and automation.
- Proven track record of using dbt to build robust, testable data transformation workflows following test-driven development (TDD).
- Strong grasp of cloud-based database infrastructure (AWS, Azure, or GCP).
- Skilled in developing insightful dashboards and scalable data models using Power BI.
- Expert in SQL development and performance optimization.
- Demonstrated success in building and maintaining data observability tools and frameworks.
- Proven ability to plan and execute deployments, upgrades, and migrations with minimal disruption to operations.
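As a hedged illustration of the Spark and SQL expertise called out above, here is a minimal PySpark sketch of a window function, an aggregation, and explicit partitioning/caching. The events table and its columns (customer_id, event_ts, amount) are hypothetical placeholders, not part of this role's actual data model.

```python
from pyspark.sql import SparkSession, functions as F, Window

spark = SparkSession.builder.appName("window-example").getOrCreate()

# Hypothetical source table; schema is assumed for illustration only.
events = spark.read.table("events")

# Window function: keep each customer's most recent event.
w = Window.partitionBy("customer_id").orderBy(F.col("event_ts").desc())
latest = (
    events
    .withColumn("rn", F.row_number().over(w))
    .filter(F.col("rn") == 1)
)

# Aggregation with an explicit repartition on the grouping key to
# reduce shuffle skew, cached because it is reused downstream.
daily_totals = (
    events
    .repartition("customer_id")
    .groupBy("customer_id", F.to_date("event_ts").alias("event_date"))
    .agg(F.sum("amount").alias("daily_amount"))
    .cache()
)
```

Likewise, a minimal sketch of a Delta Live Tables pipeline in Python, assuming a hypothetical JSON landing path and order schema. It runs only inside a Databricks DLT pipeline, which supplies the dlt module and the spark session.

```python
import dlt
from pyspark.sql import functions as F

# Runs only inside a Databricks Delta Live Tables pipeline; the
# landing path and column names below are hypothetical.

@dlt.table(comment="Raw orders ingested with Auto Loader (bronze).")
def orders_bronze():
    return (
        spark.readStream.format("cloudFiles")  # Auto Loader
        .option("cloudFiles.format", "json")
        .load("/Volumes/raw/orders/")          # hypothetical path
    )

@dlt.table(comment="Validated, typed orders (silver).")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")
def orders_silver():
    return (
        dlt.read_stream("orders_bronze")
        .withColumn("order_ts", F.to_timestamp("order_ts"))
    )
```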
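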
Roles & Responsibilities:
- Focus on standing up / adopting tools that will enable both internal and external customers.
- Key leader / contributor in the data, software, platform, and data warehouse development life cycle (i.e., design, development, documentation, testing, deployment, support).
- Work efficiently within high-security, PII, and PCI-DSS environments.
- Support architects and engineers as they design and build effective, agile applications and platforms.
- Design, develop, and optimize batch and real-time data pipelines using Medallion Architecture, preferably on Databricks cloud data platforms.
- Build data transformation workflows using dbt, with a strong focus on Test-Driven Development (TDD) and modular design.
- Implement and manage CI/CD pipelines using GitLab and Jenkins, enabling automated testing, deployment, and monitoring of data workflows.
- Drive DataOps practices by integrating testing, monitoring, versioning, and collaboration into every phase of the data pipeline lifecycle.
- Build scalable and reusable data models that support business analytics and dashboarding in Power BI.
- Develop and support real-time data streaming pipelines (e.g., using Kafka, Spark Structured Streaming) for near-instant data availability (see the streaming sketch below).
- Establish and implement data observability practices, including monitoring data quality, freshness, lineage, and anomaly detection across the platform.
- Plan and own deployments, migrations, and upgrades across data platforms and pipelines to minimize service impacts, including developing and executing mitigation plans.
- Collaborate with stakeholders to understand data requirements and deliver reliable, high-impact data solutions.
- Document pipeline architecture, processes, and standards, promoting consistency and transparency across the team.
- Apply exceptional problem-solving and analytical skills to troubleshoot complex data and system issues.
- Demonstrate excellent written and verbal communication skills when collaborating across technical and non-technical teams.
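As a hedged sketch of the real-time streaming responsibility above, the following Spark Structured Streaming job reads a hypothetical Kafka topic and appends to a Delta table. The broker address, topic, schema, checkpoint path, and table name are all assumptions, and the Kafka connector must be available on the cluster (it is built into Databricks runtimes).

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

# Hypothetical message schema for the Kafka payload.
schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
])

# Read JSON messages from a hypothetical Kafka topic.
orders = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("o"))
    .select("o.*")
)

# Append to a Delta table; the checkpoint enables exactly-once recovery.
query = (
    orders.writeStream.format("delta")
    .option("checkpointLocation", "/chk/orders")
    .outputMode("append")
    .toTable("orders_stream")
)
query.awaitTermination()
```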