Talent.com
Senior Data Engineer - Spark, Airflow

Sigmaways Inc • Santa Rosa, CA, United States
14 hours ago
Job type
  • Full-time
Job description

We are seeking an experienced Data Engineer to design and optimize scalable data pipelines that drive our global data and analytics initiatives.

In this role, you will leverage technologies such as Apache Spark, Airflow, and Python to build high-performance data processing systems and ensure data quality, reliability, and lineage across Mastercard’s data ecosystem.

The ideal candidate combines strong technical expertise with hands-on experience in distributed data systems, workflow automation, and performance tuning to deliver impactful, data-driven solutions at enterprise scale.

Responsibilities:

  • Design and optimize Spark-based ETL pipelines for large-scale data processing.
  • Build and manage Airflow DAGs for scheduling, orchestration, and checkpointing.
  • Implement partitioning and shuffling strategies to improve Spark performance.
  • Ensure data lineage, quality, and traceability across systems.
  • Develop Python scripts for data transformation, aggregation, and validation.
  • Execute and tune Spark jobs using spark-submit.
  • Perform DataFrame joins and aggregations for analytical insights.
  • Automate multi-step processes through shell scripting and variable management.
  • Collaborate with data, DevOps, and analytics teams to deliver scalable data solutions.

Qualifications:

  • Bachelor’s degree in Computer Science, Data Engineering, or related field (or equivalent experience).
  • At least 7 years of experience in data engineering or big data development.
  • Strong expertise in Apache Spark architecture, optimization, and job configuration.
  • Proven experience with Airflow DAGs, including authoring, scheduling, checkpointing, and monitoring.
  • Skilled in data shuffling, partitioning strategies, and performance tuning in distributed systems.
  • Expertise in Python programming including data structures and algorithmic problem-solving.
  • Hands-on experience with Spark DataFrames and PySpark transformations, including joins, aggregations, and filters.
  • Proficient in shell scripting, including managing and passing variables between scripts.
  • Experienced with spark-submit for deployment and tuning.
  • Solid understanding of ETL design, workflow automation, and distributed data systems.
  • Excellent debugging and problem-solving skills in large-scale environments.
  • Experience with AWS Glue, EMR, Databricks, or similar Spark platforms.
  • Knowledge of data lineage and data quality frameworks like Apache Atlas.
  • Familiarity with CI/CD pipelines, Docker/Kubernetes, and data governance tools.