Job Description
Job Title : Data Catalog Developer
Company : Direct Recruit Agency
Contract Details : Full-time, on-site
We are seeking a highly skilled and experienced Data Catalog Developer to join our team at Direct Recruit Agency. As a Data Catalog Developer, you will be responsible for designing, developing, and maintaining data catalogs for our clients. This is a full-time, on-site position that offers a competitive salary and benefits package.
Pay rate : 100-105 / hr, W-2
- Expertise in Collibra is a must.
- Will be building the Collibra Data Catalog.
- Experience in the new Collibra software - Edge
- Top must-have hard skills :
- Expertise in Collibra Data Management, Data Asset, Data Governance, and BAU support of Collibra Data Catalog.
- Edge Server experience is a must. Collibra Ranger certification preferred.
- Top nice-to-have hard skills : Databricks, AWS (S3, Glue, Aurora Postgres, Athena), SQL
- Top soft skills : Communication, Problem-Solving, Collaboration, Attention to Detail.
- Team size : 10
- Key aspects of the role :
- Develop the Data Catalog, build Collibra workflows, and integrate the Edge server with various data sources, authentication, and access controls.
- Day-to-day Expectations : Data Catalog build out, Metadata Synchronization, Lineage Harvester
- Interview Plan / Process : Two virtual Interviews and one final On-site interview
- Citizenship : USC only
Your role as a Senior Data Engineer
- Work on migrating applications from on-premises environments to cloud service providers.
- Develop products and services on the latest technologies through contributions in development, enhancements, testing, and implementation.
- Develop, modify, and extend code for building cloud infrastructure, and automate using CI / CD pipelines.
- Partner with business stakeholders and peers in the pursuit of solutions that achieve business goals through an agile software development methodology.
- Perform problem analysis, data analysis, reporting, and communication.
- Work with peers across the system to define and implement best practices and standards.
- Assess applications and help determine the appropriate application infrastructure patterns.
- Use best practices and knowledge of internal or external drivers to improve products or services.
Qualifications :
What we are looking for :
- Bachelor's degree in Computer Science, Information Systems, or a related field, or equivalent.
- Minimum of 3 years of experience as a Data Catalog Developer or in a similar role.
- Hands-on experience in building ETL using Databricks SaaS infrastructure.
- Experience in developing data pipeline solutions to ingest and exploit new and existing data sources.
- Expertise in leveraging SQL, programming languages like Python, and ETL tools like Databricks.
- Perform code reviews to ensure requirements, optimal execution patterns, and adherence to established standards.
- Expertise in AWS Compute (EC2, EMR), AWS Storage (S3, EBS), AWS Databases (RDS, DynamoDB), AWS Data Integration (Glue).
- Advanced understanding of container orchestration services, including Docker and Kubernetes, and a variety of AWS tools and services.
- Good understanding of AWS Identity and Access Management, AWS Networking, and AWS Monitoring tools.
- Proficiency in CI / CD and deployment automation using GitLab pipelines.
- Proficiency in cloud infrastructure provisioning tools, e.g., Terraform.
- Proficiency in one or more programming languages, e.g., Python, Scala.
- Experience in Starburst, Trino, and building SQL queries in a federated architecture.
- Good knowledge of Lakehouse architecture.
- Design, develop, and optimize scalable ETL / ELT pipelines using Databricks and Apache Spark (PySpark and Scala).
- Build data ingestion workflows from various sources (structured, semi-structured, and unstructured).
- Develop reusable components and frameworks for efficient data processing.
- Implement best practices for data quality, validation, and governance.
- Collaborate with data architects, analysts, and business stakeholders to understand data requirements.
- Tune Spark jobs for performance and scalability in a cloud-based environment.
- Maintain robust data lake or Lakehouse architecture.
- Ensure high availability, security, and integrity of data pipelines and platforms.
- Support troubleshooting, debugging, and performance optimization in production workloads.
If you are a highly motivated and skilled Data Catalog Developer looking for a challenging and rewarding opportunity, we encourage you to apply for this position. Join our dynamic team at Direct Recruit Agency and be a part of our mission to provide top-notch data solutions to our clients.