Tkxel is a leading software development company located in Reston, Virginia. We are committed to developing innovative software solutions for leading enterprises around the world, helping them grow their businesses using the latest technology.
Job Description
This is a remote position.
- Data Pipeline Development : Builds an understanding of the client's data needs and designs, constructs, installs, tests, and maintains highly scalable data management systems using the Microsoft Fabric suite (Azure Data Factory, Azure Synapse Analytics, etc.) and other relevant technologies that can efficiently and effectively meet those needs.
- Data Availability Assessment : Implements robust processes to ensure the timely identification, collection, and validation of relevant data sets required for the client.
- ETL Processes : Develops ETL processes to extract data from various sources, transform the data according to business rules, and load it into a centralized data repository, ensuring data accuracy and availability.
- Data Lake : Implements and manages data storage solutions using Azure OneLake and ensures optimal data storage architecture for ease of access and analysis.
- Data Integration : Integrates data from various business systems into a unified data platform, enabling a consolidated view of information across the organization.
- Data Quality and Governance : Ensures data accuracy and quality by implementing data governance and quality control measures, including data validation and cleansing.
- Performance Optimisation : Monitors, tunes, and reports on the performance of data pipelines and databases to ensure they meet the functional and performance requirements.
- Security and Compliance : Implements security measures to protect data integrity and ensure compliance with data protection regulations and company policies.
Requirements
Skills :
- Azure Data Lake or Apache Delta Lake
- Apache Spark or Databricks
- Knowledge of implementing Apache Spark using PySpark