Data Engineer III

The Custom Group of Companies • New York, NY, United States
Job type
  • Full-time
Job description

Your role as a Senior Data Engineer

  • Migrate applications from on-premises environments to cloud service providers.
  • Develop products and services on the latest technologies by contributing to development, enhancement, testing, and implementation.
  • Develop, modify, and extend code for building cloud infrastructure, and automate delivery using CI/CD pipelines.
  • Partner with the business and peers to pursue solutions that achieve business goals through an agile software development methodology.
  • Perform problem analysis, data analysis, reporting, and communication.
  • Work with peers across the system to define and implement best practices and standards.
  • Assess applications and help determine the appropriate application infrastructure patterns.
  • Apply best practices and knowledge of internal and external drivers to improve products and services.

Qualifications

What we are looking for:

  • Hands-on experience in building ETL using Databricks SaaS infrastructure.
  • Experience in developing data pipeline solutions to ingest and exploit new and existing data sources.
  • Expertise in leveraging SQL, programming languages such as Python, and ETL tools such as Databricks.
  • Perform code reviews to ensure requirements are met, execution patterns are optimal, and established standards are followed.
  • Computer Science degree or equivalent.

  • Expertise in AWS Compute (EC2, EMR), AWS Storage (S3, EBS), AWS Databases (RDS, DynamoDB), AWS Data Integration (Glue).
  • Advanced understanding of Container Orchestration services including Docker and Kubernetes, and a variety of AWS tools and services.
  • Good understanding of AWS Identity and Access Management (IAM), AWS networking, and AWS monitoring tools.
  • Proficiency in CI/CD and deployment automation using GitLab pipelines.
  • Proficiency in cloud infrastructure provisioning tools, e.g., Terraform.
  • Proficiency in one or more programming languages, e.g., Python, Scala.
  • Experience with Starburst/Trino and building SQL queries in a federated architecture.
  • Good knowledge of lakehouse architecture.
  • Design, develop, and optimize scalable ETL/ELT pipelines using Databricks and Apache Spark (PySpark and Scala).
  • Build data ingestion workflows from various sources (structured, semi-structured, and unstructured).
  • Develop reusable components and frameworks for efficient data processing.
  • Implement best practices for data quality, validation, and governance.
  • Collaborate with data architects, analysts, and business stakeholders to understand data requirements.
  • Tune Spark jobs for performance and scalability in a cloud-based environment.
  • Maintain a robust data lake or lakehouse architecture.
  • Ensure high availability, security, and integrity of data pipelines and platforms.
  • Support troubleshooting, debugging, and performance optimization in production workloads.