Cloudera Data Engineer

Akaasa Technologies, United States

5 days ago

Job type

Full-time

Job description

Cloudera Data Engineer

The candidate should have a solid background in Cloudera clusters, data engineering, and AWS.

Job Summary

We are seeking a Cloudera Data Engineer to support the migration of a Medicaid Data Warehouse Implementation in an AWS environment. The resource will support the migration and continued operations of an existing Cloudera / Hive / Scala-based data pipeline environment from one AWS account to another.
This position is responsible for ensuring a seamless transition, validating data integrity and job performance, and maintaining reliable daily operations post-migration.
The role will work closely with the existing project team on the underlying AWS infrastructure (VPC, IAM, S3, EC2, networking). The resource will focus on Cloudera cluster migration, data pipeline reconfiguration, and operational stability.

Key Responsibilities

Replicate and configure the existing Cloudera cluster (HDFS, YARN, Hive, Spark) in the new AWS account.

Coordinate with the project team to ensure proper infrastructure provisioning (EC2, security groups, IAM roles, and networking).

Reconfigure cluster connectivity and job dependencies for the new environment.

Migrate and validate metadata stores (Hive Metastore, job configs, dependencies).

Validate job execution and data outputs for parity with the existing environment.

Deploy, test, and operate existing Hive and Spark (Scala) jobs post-migration.

Maintain job schedules, dependencies, and runtime configurations.

Monitor job performance, identify bottlenecks, and apply tuning or code-level optimizations.

Troubleshoot failures and implement automated recovery or alerting where applicable.

Monitor Cloudera Manager dashboards, cluster health, and resource utilization.

Manage user roles and access within the Cloudera environment.

Implement periodic data cleanup, archiving, and housekeeping processes.

Document configurations, migration steps, and operational runbooks.

Required Skills and Experience

Bachelor's degree in Computer Science, Information Systems, or a related field.

7+ years of experience in data engineering or big data development.

4+ years of experience with the Cloudera platform (HDFS, YARN, Hive, Spark, Oozie).

Experience deploying and operating Cloudera workloads on AWS (EC2, S3, IAM, CloudWatch).

Strong proficiency in Scala, Java, and HiveQL; Python or Bash scripting experience preferred.

Strong proficiency in Apache Spark and Scala programming for data processing and transformation.

Hands-on experience with the Cloudera distribution of Hadoop.

Hands-on experience implementing business rules processing using Drools.

Able to work with infrastructure, DevOps, and data governance teams in a multi-disciplinary environment.

Preferred Qualifications

Cloudera certification (e.g., CDP Data Engineer or Cloudera Administrator).

Experience with Cloudera version upgrades or AWS-to-AWS environment migrations.

Experience in public-sector or large enterprise data environments.
