Talent.com
Data Engineer

Soft source inc, Houston, TX, United States
30+ days ago
Job type
  • Full-time
Job description

Overview:

Leads the exit from Palantir Foundry onto a modern Snowflake stack by building reliable, performant, and testable ELT pipelines; recreating Foundry transformations and rule-based event logic; and ensuring historical data extraction, reconciliation, and cutover readiness.

Years of Experience:

7+ years overall; 3+ years hands-on with Snowflake.

Key Responsibilities:

  • Extract historical datasets from Palantir (dataset export, parquet) to S3 / ADLS and load into Snowflake; implement checksum and reconciliation controls.
  • Rebuild Foundry transformations as dbt models and / or Snowflake SQL; implement curated schemas and incremental patterns using Streams and Tasks.
  • Implement the batch event / rules engine that evaluates time-series plus reference data on a schedule (e.g., every 30-60 minutes) and produces auditable event tables.
  • Configure orchestration in Airflow running on AKS and, where appropriate, Snowflake Tasks; monitor, alert, and document operational runbooks.
  • Optimize warehouses, queries, clustering, and caching; manage cost with Resource Monitors and usage telemetry.
  • Author automated tests (dbt tests, Great Expectations or equivalent), validate parity versus legacy outputs, and support UAT and cutover.
  • Collaborate with BI / analytics teams (Sigma, Power BI) on dataset contracts, performance, and security requirements.
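The checksum and reconciliation controls called out above can be sketched as below. The row-hash approach, table shapes, and helper names are illustrative assumptions for this posting, not a prescribed implementation; in practice the source rows would come from the Parquet exports and the target rows from Snowflake.

```python
import hashlib

def table_checksum(rows):
    """Order-independent checksum: hash each row, XOR the digests together.

    `rows` is an iterable of tuples (e.g. rows read from a Parquet export
    or fetched back from Snowflake). Hypothetical helper for illustration.
    """
    acc = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
        acc ^= int.from_bytes(digest, "big")
    return acc

def reconcile(source_rows, target_rows):
    """Return (row_count_match, checksum_match) for a source/target pair."""
    src, tgt = list(source_rows), list(target_rows)
    return (len(src) == len(tgt),
            table_checksum(src) == table_checksum(tgt))

# Example: the loaded copy is a reordering of the export, so both checks pass.
export = [(1, "a"), (2, "b"), (3, "c")]
loaded = [(3, "c"), (1, "a"), (2, "b")]
print(reconcile(export, loaded))  # (True, True)
```

Because XOR is commutative, the checksum is insensitive to load order, which matters when bulk loads do not preserve row order; aggregate parity checks (sums, min/max per column) would complement this.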

Required Qualifications:

  • Strong Snowflake SQL and Python for ELT, utilities, and data validation.
  • Production experience with dbt (models, tests, macros, documentation, lineage).
  • Orchestration with Airflow (preferably on AKS / Kubernetes) and use of Snowflake Tasks / Streams for incrementals.
  • Proficiency with cloud object storage (S3 / ADLS), file formats (Parquet / CSV), and bulk / incremental load patterns (Snowpipe, External Tables).
  • Version control and CI / CD with GitHub / GitLab; environment promotion and release hygiene.
  • Data quality and reconciliation fundamentals, including checksums, row / aggregate parity, and schema integrity tests.
  • Performance and cost tuning using query profiles, micro-partitioning behavior, and warehouse sizing policies.

Preferred Qualifications:

  • Experience migrating from legacy platforms (Palantir Foundry, Cloudera / Hive / Spark) and familiarity with Trino / Starburst federation patterns.
  • Time-series data handling and rules / pattern detection; exposure to Snowpark or UDFs for complex transforms.
  • Familiarity with consumption patterns in Sigma and Power BI (Import, DirectQuery, composite models, RLS / OLS considerations).
  • Security and governance in Snowflake (RBAC, masking, row / column policies), tagging, and cost allocation.
  • Exposure to containerized workloads on AKS, lightweight apps for surfacing data (e.g., Streamlit), and basic observability practices.
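The scheduled event / rules engine described under Key Responsibilities might, in its simplest form, evaluate a batch of time-series readings against reference thresholds and emit auditable event rows. The sensor names, thresholds, and column names below are assumptions for illustration only.

```python
from datetime import datetime, timezone

# Hypothetical reference data: per-sensor alert thresholds.
THRESHOLDS = {"pump_a": 80.0, "pump_b": 95.0}

def evaluate(readings, thresholds=THRESHOLDS):
    """Evaluate one batch of (sensor_id, ts, value) readings against
    reference thresholds; return rows destined for an event table."""
    events = []
    run_at = datetime.now(timezone.utc).isoformat()
    for sensor_id, ts, value in readings:
        limit = thresholds.get(sensor_id)
        if limit is not None and value > limit:
            events.append({
                "sensor_id": sensor_id,
                "event_ts": ts,
                "rule": f"value > {limit}",
                "value": value,
                "evaluated_at": run_at,  # audit column: when this batch ran
            })
    return events

# One scheduled run over a small batch; only pump_a exceeds its threshold.
batch = [("pump_a", "2024-01-01T00:00:00Z", 82.5),
         ("pump_b", "2024-01-01T00:00:00Z", 90.0)]
for e in evaluate(batch):
    print(e["sensor_id"], e["rule"])
```

In the stack this posting describes, the same pattern would more likely live in dbt models or Snowpark procedures triggered by Snowflake Tasks or Airflow; the `evaluated_at` audit column is what makes the resulting event tables traceable run by run.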