Position title: Big Data Developer
Location: Charlotte, NC
Onsite: 3 days a week
Contract: 6-24 months, contract-to-perm
Must have: Hadoop, PySpark, and Kafka
Key Responsibilities:
- Design and implement scalable data ingestion and transformation pipelines using PySpark or Scala, Hadoop, Hive, and Dremio.
- Build and manage Kafka batch pipelines for reliable data streaming and integration.
- Work with on-prem Hadoop ecosystems (Cloudera, Hortonworks, MapR) or cloud-native big data platforms.
- Develop and maintain RESTful APIs using Python (FastAPI, Flask, or Django) to expose data and services.
- Collaborate with data scientists, ML engineers, and platform teams to ensure seamless data flow and system performance.
- Monitor, troubleshoot, and optimize production data pipelines and services.
- Ensure security, scalability, and reliability across all data engineering components.
- (Optional but valuable) Contribute to the design and deployment of AI-driven RAG (retrieval-augmented generation) systems for enterprise use cases.
Required Skills & Qualifications :
- Experience in Big Data Engineering.
- Strong hands-on experience with PySpark or Scala.
- Deep expertise in on-prem Hadoop distributions (Cloudera, Hortonworks, MapR) or cloud-based big data platforms.
- Proficiency in Kafka batch processing, Hive, and Dremio.
- Solid understanding of REST API development using Python frameworks.
- Familiarity with cloud platforms (GCP, AWS, or Azure).
- Experience or exposure to AI and RAG architectures is a plus.
- Excellent problem-solving, communication, and collaboration skills.