Talent.com
Data Engineer

Soft source inc, Houston, TX, United States
30+ days ago
Job type
  • Full-time
Job description

Overview:

Leads the exit from Palantir Foundry onto a modern Snowflake stack by building reliable, performant, and testable ELT pipelines; recreating Foundry transformations and rule-based event logic; and ensuring historical data extraction, reconciliation, and cutover readiness.

Years of Experience:

7+ years overall; 3+ years hands-on with Snowflake.

Key Responsibilities:

  • Extract historical datasets from Palantir (dataset export, parquet) to S3 / ADLS and load into Snowflake; implement checksum and reconciliation controls.
  • Rebuild Foundry transformations as dbt models and / or Snowflake SQL; implement curated schemas and incremental patterns using Streams and Tasks.
  • Implement the batch event / rules engine that evaluates time-series plus reference data on a schedule (e.g., every 30-60 minutes) and produces auditable event tables.
  • Configure orchestration in Airflow running on AKS and, where appropriate, Snowflake Tasks; monitor, alert, and document operational runbooks.
  • Optimize warehouses, queries, clustering, and caching; manage cost with Resource Monitors and usage telemetry.
  • Author automated tests (dbt tests, Great Expectations or equivalent), validate parity versus legacy outputs, and support UAT and cutover.
  • Collaborate with BI / analytics teams (Sigma, Power BI) on dataset contracts, performance, and security requirements.
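The checksum and reconciliation controls called out above can be sketched as below. The row-hash approach, table shapes, and helper names are illustrative assumptions for this posting, not a prescribed implementation; in practice the source rows would come from the Parquet exports and the target rows from Snowflake.

```python
import hashlib

def table_checksum(rows):
    """Order-independent checksum: hash each row, XOR the digests together.

    `rows` is an iterable of tuples (e.g. rows read from a Parquet export
    or fetched back from Snowflake). Hypothetical helper for illustration.
    """
    acc = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
        acc ^= int.from_bytes(digest, "big")
    return acc

def reconcile(source_rows, target_rows):
    """Return (row_count_match, checksum_match) for a source/target pair."""
    src, tgt = list(source_rows), list(target_rows)
    return (len(src) == len(tgt),
            table_checksum(src) == table_checksum(tgt))

# Example: the loaded copy is a reordering of the export, so both checks pass.
export = [(1, "a"), (2, "b"), (3, "c")]
loaded = [(3, "c"), (1, "a"), (2, "b")]
print(reconcile(export, loaded))  # (True, True)
```

Because XOR is commutative, the checksum is insensitive to load order, which matters when bulk loads do not preserve row order; aggregate parity checks (sums, min/max per column) would complement this.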

Required Qualifications:

  • Strong Snowflake SQL and Python for ELT, utilities, and data validation.
  • Production experience with dbt (models, tests, macros, documentation, lineage).
  • Orchestration with Airflow (preferably on AKS / Kubernetes) and use of Snowflake Tasks / Streams for incrementals.
  • Proficiency with cloud object storage (S3 / ADLS), file formats (Parquet / CSV), and bulk / incremental load patterns (Snowpipe, External Tables).
  • Version control and CI / CD with GitHub / GitLab; environment promotion and release hygiene.
  • Data quality and reconciliation fundamentals, including checksums, row / aggregate parity, and schema integrity tests.
  • Performance and cost tuning using query profiles, micro-partitioning behavior, and warehouse sizing policies.

Preferred Qualifications:

  • Experience migrating from legacy platforms (Palantir Foundry, Cloudera / Hive / Spark) and familiarity with Trino / Starburst federation patterns.
  • Time-series data handling and rules / pattern detection; exposure to Snowpark or UDFs for complex transforms.
  • Familiarity with consumption patterns in Sigma and Power BI (Import, DirectQuery, composite models, RLS / OLS considerations).
  • Security and governance in Snowflake (RBAC, masking, row / column policies), tagging, and cost allocation.
  • Exposure to containerized workloads on AKS, lightweight apps for surfacing data (e.g., Streamlit), and basic observability practices.
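The scheduled event / rules engine described under Key Responsibilities might, in its simplest form, evaluate a batch of time-series readings against reference thresholds and emit auditable event rows. The sensor names, thresholds, and column names below are assumptions for illustration only.

```python
from datetime import datetime, timezone

# Hypothetical reference data: per-sensor alert thresholds.
THRESHOLDS = {"pump_a": 80.0, "pump_b": 95.0}

def evaluate(readings, thresholds=THRESHOLDS):
    """Evaluate one batch of (sensor_id, ts, value) readings against
    reference thresholds; return rows destined for an event table."""
    events = []
    run_at = datetime.now(timezone.utc).isoformat()
    for sensor_id, ts, value in readings:
        limit = thresholds.get(sensor_id)
        if limit is not None and value > limit:
            events.append({
                "sensor_id": sensor_id,
                "event_ts": ts,
                "rule": f"value > {limit}",
                "value": value,
                "evaluated_at": run_at,  # audit column: when this batch ran
            })
    return events

# One scheduled run over a small batch; only pump_a exceeds its threshold.
batch = [("pump_a", "2024-01-01T00:00:00Z", 82.5),
         ("pump_b", "2024-01-01T00:00:00Z", 90.0)]
for e in evaluate(batch):
    print(e["sensor_id"], e["rule"])
```

In the stack this posting describes, the same pattern would more likely live in dbt models or Snowpark procedures triggered by Snowflake Tasks or Airflow; the `evaluated_at` audit column is what makes the resulting event tables traceable run by run.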