The company is redefining how enterprises prepare and optimize data at the most fundamental layer of the AI stack—where raw information becomes usable intelligence. Our technology operates deep in the data infrastructure layer, making data efficient, secure, and ready for scale.
We eliminate the hidden inefficiencies in modern data platforms—slashing storage and compute costs, accelerating pipelines, and boosting platform efficiency. The result: 60%+ lower storage costs, up to 60% lower compute spend, 3× faster data processing, and 20% overall efficiency gains.
Why It Matters
Massive data should fuel innovation, not drain budgets. We remove the bottlenecks holding AI and analytics back—making data lighter, faster, and smarter so teams can ship breakthroughs, not babysit storage and compute bills.
Who We Are
- World-renowned researchers in compression, information theory, and data systems
- Elite engineers from Google, Pure Storage, Cohesity, and top cloud teams
- Enterprise sellers who turn ROI into seven-figure wins
Powered by World-Class Investors & Customers
$65M+ raised from NEA, Bain Capital, A Capital, and operators behind Okta, Eventbrite, Tesla, and Databricks. Our platform already processes hundreds of petabytes for industry leaders.
Smarter Infrastructure for the AI Era
We make data efficient, secure, and ready for scale—think smarter, more foundational infrastructure for the AI era. Our technology integrates directly with modern data stacks like Snowflake, Databricks, and S3-based data lakes, enabling:
- 60%+ reduction in storage costs and up to 60% lower compute spend
- 3× faster data processing
- 20% platform efficiency gains
Trusted by Industry Leaders
Enterprise leaders globally already rely on the company to cut costs, boost performance, and unlock more value from their existing data platforms.
A Deep Tech Approach to AI
We’re unlocking the layers beneath platforms like Snowflake and Databricks, making them faster, cheaper, and more AI-native. We combine advanced research with practical productization, powered by a dual-track strategy:
- Research: Led by Chief Scientist Andrea Montanari (Stanford Professor), we publish 1–2 top-tier papers per quarter.
- Product: Actively processing 100+ PB today and targeting exabyte scale by Q4 2025.
Backed by the Best
We’ve raised $65M+ from NEA, Bain Capital, A Capital, and operators behind Okta, Eventbrite, Tesla, and Databricks.
Our Mission
To convert entropy into intelligence, so every builder—human or AI—can make the impossible real.
We’re building the default data substrate for AI, and a generational company built to endure beyond any single product cycle.
WHAT YOU’LL DO
This is a deep systems role for someone who lives and breathes distributed infrastructure, understands how data moves at scale, and wants to build the next‑generation AI data platform from the ground up.
- Own the ACID backbone. Design and harden transactional layers and metadata services so that petabyte-scale tables can time-travel in microseconds and schema evolution becomes a non-event.
- Turn metadata into rocket fuel. Build compaction, caching, and pruning services that keep millions of file pointers within 50 ms from lookup to plan.
- Squeeze more signal per byte. Optimize data layouts—from column ordering to dictionary and bit-packing, bloom filters, and zone-map indexes—to cut scan I/O by 10× on real-world workloads.
- Ship adaptive indexing with research. Co-invent machine-driven indexes that learn access patterns and automatically re-partition nightly—no more manual “analyze table” ever again.
- Scale the engine, not the babysitting. Write Spark, Flink, or batch pipelines that autoscale across S3, GCS, and ADLS; expose observability hooks; and survive chaos drills without triggering a pager storm.
- Code for longevity. Write clean, test-soaked Java, Scala, Go, or C++. Document key invariants so future teams extend the system—instead of rewriting it.
- Measure success in human latency. If analysts see their dashboards refresh in blink-level time, you’ve won. Publish your breakthroughs and mentor the next engineer to raise the bar again.
WHAT WE’RE LOOKING FOR
You’ve built systems where performance, resilience, and clarity of design all matter. You thrive at the intersection of infrastructure engineering and applied research, and care deeply about both how something works and how well it works at scale.
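As a flavor of the kind of problem this role centers on, consider the zone-map pruning mentioned above: a query planner keeps per-file min/max statistics and skips every file whose value range cannot match the predicate, so most of a petabyte-scale table is never read. The sketch below is a minimal, hypothetical illustration in Go (the `ZoneMap` type and `prune` function are invented for this example, not our actual implementation):

```go
package main

import "fmt"

// ZoneMap holds per-file min/max statistics for one column.
// Real systems track these at finer granularity too, e.g. per
// row group in Parquet footers.
type ZoneMap struct {
	File     string
	Min, Max int64
}

// prune keeps only the files whose [Min, Max] range overlaps the
// predicate range [lo, hi]; every other file can be skipped
// without any I/O.
func prune(zones []ZoneMap, lo, hi int64) []ZoneMap {
	var keep []ZoneMap
	for _, z := range zones {
		if z.Max >= lo && z.Min <= hi {
			keep = append(keep, z)
		}
	}
	return keep
}

func main() {
	zones := []ZoneMap{
		{"part-000.parquet", 0, 999},
		{"part-001.parquet", 1000, 1999},
		{"part-002.parquet", 2000, 2999},
	}
	// Predicate: WHERE id BETWEEN 1500 AND 1600
	for _, z := range prune(zones, 1500, 1600) {
		fmt.Println(z.File) // prints only part-001.parquet
	}
}
```

The interesting engineering is everything around this trivial loop: keeping the statistics fresh under compaction, serving millions of them within the latency budget, and choosing layouts (sort orders, partitioning) that make the ranges tight enough to prune aggressively.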
Core Skills
- Distributed Systems and Storage Fundamentals — consistency, replication, sharding, durability, transactions.
- Columnar Storage Optimization — deep knowledge of Parquet or similar formats (column ordering, compression, zone maps).
- Metadata and Indexing Systems — experience building metadata-driven services, compaction, caching, and adaptive indexing.
- Distributed Compute at Scale — production-grade Spark/Flink or equivalent pipeline development across S3, GCS, or ADLS.
- Programming for Scale and Longevity — strong coding in Java, Scala, Go, or C++, with clean testing and documentation practices.
- Resilient Systems and Observability — you’ve built systems that survive chaos drills and expose the right metrics.
Desired Skills
- Exposure to open table formats such as Apache Iceberg, Delta Lake, or Hudi.
- Experience with catalog services, query planning, or compaction frameworks.
- OSS contributions or published work in data infrastructure or distributed systems.
WHY JOIN US
If you’ve helped build the modern data stack at a large company—Databricks, Snowflake, Confluent, or similar—you already know how critical lakehouse infrastructure is to AI and analytics at scale. At the company, you’ll take that knowledge and apply it where it matters most: at the most fundamental layer of the data ecosystem.
- Own the product, not just the feature. At the company, you won’t be optimizing edge cases or maintaining legacy systems. You’ll architect and build foundational components that define how enterprises manage and optimize data for AI.
- Move faster, go deeper. No multi-month review cycles or layers of abstraction—just high-agency engineering work where great ideas ship weekly. You’ll work directly with the founding team, engage closely with design partners, and see your impact hit production fast.
- Work on hard, meaningful problems. From transaction-layer design in Delta and Iceberg, to petabyte-scale compaction and schema evolution, to adaptive indexing and cost-aware query planning—this is deep systems engineering at scale.
- Join a team of expert builders. Our engineers have designed the core internals of cloud-scale data systems, and we maintain a culture of peer-driven learning, hands-on prototyping, and technical storytelling.
- Core differentiation. We’re focused on unlocking a deeper layer of AI infrastructure. By optimizing the way data is stored, processed, and retrieved, we make platforms like Snowflake and Databricks faster, more cost-efficient, and more AI-native. Our work sits at the most fundamental layer of the AI stack: where raw data becomes usable intelligence.
- Be part of something early—without the chaos. The company has already secured $65M+ from NEA, Bain Capital Ventures, A Capital, and legendary operators from Okta, Tesla, and Databricks.
- Grow with the company. You’ll have the chance to grow into a technical leadership role, mentor future hires, and shape both the engineering culture and product direction as we scale.
COMPENSATION & BENEFITS
- Competitive salary and meaningful equity
- Unlimited PTO + quarterly recharge days
- Premium health, vision, and dental
- Team offsites, deep tech talks, and learning stipends
- Help build the foundational infrastructure for the AI era
The company is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.