Data Engineer (Founding Team)

Fabrion • Bodega Bay, CA, US
30+ days ago
Job type
  • Full-time
Job description

Data / ETL Engineer (Founding Team)

Location: San Francisco Bay Area

Type: Full-Time

Compensation: Competitive salary + early-stage equity

Backed by 8VC, we're building a world-class team to tackle one of the industry’s most critical infrastructure problems.

About the Role

We’re building a multi-tenant, AI-native platform where enterprise data becomes actionable through semantic enrichment, intelligent agents, and governed interoperability. At the heart of this architecture lies our Data Fabric — an intelligent, governed layer that turns fragmented and siloed data into a connected ontology ready for model training, vector search, and insight-to-action workflows.

We're looking for engineers who enjoy hard data problems at scale: messy unstructured data, schema drift, multi-source joins, security models, and AI-ready semantic enrichment. You’ll build the backend systems, data pipelines, connector frameworks, and graph-based knowledge models that fuel agentic applications.

If you've worked on streaming pipelines for unstructured data, built connectors into ugly legacy systems, or mapped knowledge graphs that scale, this role will feel like home.

Responsibilities

Build highly reliable, scalable data ingestion and transformation pipelines across structured, semi-structured, and unstructured data sources

Develop and maintain a connector framework for ingesting from enterprise systems (ERPs, PLMs, CRMs, legacy data stores, email, Excel, docs, etc.)

Design and maintain the data fabric layer, including a knowledge graph (Neo4j or PuppyGraph) enriched with ontologies, metadata, and relationships

Normalize and vectorize data for downstream AI / LLM workflows — enabling retrieval-augmented generation (RAG), summarization, and alerting

Create and manage data contracts, access layers, lineage, and governance mechanisms

Build and expose secure APIs for downstream services, agents, and users to query enriched semantic data

Collaborate with ML / LLM teams to feed high-quality enterprise data into model training and tuning pipelines

What We’re Looking For

Core Experience:

5+ years building large-scale data infrastructure in production environments

Deep experience with ingestion frameworks (Kafka, Airbyte, Meltano, Fivetran) and data pipeline orchestration (Airflow, Dagster, Prefect)

Comfortable processing unstructured data formats: PDFs, Excel, emails, logs, CSVs, web APIs

Experience working with columnar stores, object storage, and lakehouse formats (Iceberg, Delta, Parquet)

Strong background in knowledge graphs or semantic modeling (e.g. Neo4j, RDF, Gremlin, PuppyGraph)

Familiarity with GraphQL, RESTful APIs, and designing developer-friendly data access layers

Experience implementing data governance: RBAC, ABAC, data contracts, lineage, data quality checks

Mindset & Culture Fit:

You’re a systems thinker: you want to model the real world, not just process it

Comfortable navigating ambiguous data models and building from scratch

Passionate about enabling AI systems with real-world, messy enterprise data

Pragmatic about scalability, observability, and schema evolution

Value autonomy, high trust, and meaningful ownership over infrastructure

Bonus Skills

Prior work with vector DBs (e.g. Weaviate, Qdrant, Pinecone) and embedding pipelines

Experience building or contributing to enterprise connector ecosystems

Knowledge of ontology versioning, graph diffing, or semantic schema alignment

Familiarity with data fabric patterns (e.g. Palantir Ontology, Linked Data, W3C standards)

Familiar with fine-tuning LLMs or enabling RAG pipelines using enterprise knowledge

Experience enforcing data access policy with tools like OPA, Keycloak, or Snowflake row-level security

Why This Role Matters

Agents are only as smart as the data they operate on. This role builds the foundation — the semantic, governed, connected substrate — that makes autonomous decision-making and agent action possible. From factory ERP records to geopolitical news alerts, the data fabric unifies it all.

If you're excited to tame complexity, unify chaos, and power intelligent systems with trusted data — we’d love to hear from you.
