Job Title: Senior Data Engineer
Location: Houston, Texas (On-Site / Hybrid)
Job Type: Long-Term Contract
Must be local to Houston, TX.
Only GC, USC, and H4-EAD candidates on W2 or 1099 will be considered.
Overview:
Delivers the Palantir Foundry exit on a modern Snowflake stack by building reliable, performant, and testable ELT pipelines; recreating Foundry transformations and rule-based event logic; and ensuring historical data extraction, reconciliation, and cutover readiness.
Years of Experience:
7+ years overall; 3+ years hands-on with Snowflake.
Key Responsibilities:
- Extract historical datasets from Palantir (dataset export, Parquet) to S3/ADLS and load into Snowflake; implement checksum and reconciliation controls.
- Rebuild Foundry transformations as dbt models and/or Snowflake SQL; implement curated schemas and incremental patterns using Streams and Tasks.
- Implement the batch event/rules engine that evaluates time-series plus reference data on a schedule (e.g., every 30–60 minutes) and produces auditable event tables; a representative sketch follows this list.
- Configure orchestration in Airflow running on AKS and, where appropriate, Snowflake Tasks; set up monitoring and alerting, and document operational runbooks.
- Optimize warehouses, queries, clustering, and caching; manage cost with Resource Monitors and usage telemetry.
- Author automated tests (dbt tests, Great Expectations, or equivalent), validate parity versus legacy outputs, and support UAT and cutover.
- Collaborate with BI/analytics teams (Sigma, Power BI) on dataset contracts, performance, and security requirements.
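
For illustration, a minimal Snowflake SQL sketch of the Streams-and-Tasks pattern behind the event/rules responsibility. All schema, table, column, and warehouse names here are hypothetical; the actual rules and cadence would come from the Foundry logic being recreated.

    -- Capture new time-series rows as they land in the curated table.
    CREATE OR REPLACE STREAM curated.telemetry_stream ON TABLE curated.telemetry;

    -- Every 30 minutes, evaluate a threshold rule against reference data
    -- and write breaches to an auditable event table.
    CREATE OR REPLACE TASK curated.evaluate_rules
      WAREHOUSE = transform_wh
      SCHEDULE  = '30 MINUTE'
    WHEN SYSTEM$STREAM_HAS_DATA('curated.telemetry_stream')
    AS
    INSERT INTO curated.rule_events (event_ts, asset_id, rule_id, observed_value, threshold)
    SELECT s.reading_ts, s.asset_id, r.rule_id, s.reading_value, r.threshold
    FROM curated.telemetry_stream s
    JOIN reference.rule_thresholds r ON r.asset_id = s.asset_id
    WHERE s.reading_value > r.threshold;

    -- Tasks are created suspended; resume to start the schedule.
    ALTER TASK curated.evaluate_rules RESUME;

Because the INSERT consumes the stream, each run processes only rows that arrived since the previous evaluation, keeping the pattern incremental.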
Required Qualifications:
- Strong Snowflake SQL and Python for ELT, utilities, and data validation.
- Production experience with dbt (models, tests, macros, documentation, lineage).
- Orchestration with Airflow (preferably on AKS/Kubernetes) and use of Snowflake Tasks/Streams for incrementals.
- Proficiency with cloud object storage (S3/ADLS), file formats (Parquet/CSV), and bulk/incremental load patterns (Snowpipe, external tables).
- Version control and CI/CD with GitHub/GitLab; environment promotion and release hygiene.
- Data quality and reconciliation fundamentals, including checksums, row/aggregate parity, and schema integrity tests; see the parity-check sketch after this list.
- Performance and cost tuning using query profiles, micro-partitioning behavior, and warehouse sizing policies.
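
As a concrete example of the reconciliation fundamentals above, a minimal parity check in Snowflake SQL, assuming the legacy export has already been loaded to a staging table (all names hypothetical):

    -- Row-count parity between the Foundry export and the rebuilt table.
    SELECT
      (SELECT COUNT(*) FROM staging.foundry_export) AS source_rows,
      (SELECT COUNT(*) FROM curated.orders)         AS target_rows;

    -- Order-insensitive content checksum; matching hashes are strong
    -- (though not absolute, since HASH_AGG is a 64-bit hash) evidence of parity.
    SELECT
      (SELECT HASH_AGG(*) FROM staging.foundry_export) AS source_hash,
      (SELECT HASH_AGG(*) FROM curated.orders)         AS target_hash;

In practice, checks like these would be wrapped as dbt tests or Great Expectations suites so they run on every load.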
Preferred Qualifications:
- Experience migrating from legacy platforms (Palantir Foundry, Cloudera/Hive/Spark) and familiarity with Trino/Starburst federation patterns.
- Time-series data handling and rules/pattern detection; exposure to Snowpark or UDFs for complex transforms.
- Familiarity with consumption patterns in Sigma and Power BI (Import, DirectQuery, composite models, RLS/OLS considerations).
- Security and governance in Snowflake (RBAC, masking, row/column policies), tagging, and cost allocation; a masking-policy sketch follows this list.
- Exposure to containerized workloads on AKS, lightweight apps for surfacing data (e.g., Streamlit), and basic observability practices.
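
For the governance item above, a sketch of a Snowflake column masking policy (role, schema, and column names are hypothetical):

    -- Show raw email addresses only to an approved PII role; mask for everyone else.
    CREATE MASKING POLICY governance.mask_email AS (val STRING) RETURNS STRING ->
      CASE WHEN CURRENT_ROLE() IN ('PII_READER') THEN val ELSE '*** MASKED ***' END;

    ALTER TABLE curated.customers MODIFY COLUMN email
      SET MASKING POLICY governance.mask_email;

Row access policies and object tagging attach to tables through the same policy-object pattern.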