Talent.com
Software Engineer, AI Data Platform [32729]

Stealth Startup • San Francisco, CA, United States
4 days ago
Job type
  • Full-time
Job description

The company is redefining how enterprises prepare and optimize data at the most fundamental layer of the AI stack—where raw information becomes usable intelligence. Our technology operates deep in the data infrastructure layer, making data efficient, secure, and ready for scale.

We eliminate the hidden inefficiencies in modern data platforms: slashing storage and compute costs, accelerating pipelines, and boosting platform efficiency. The result: 60%+ lower storage costs, up to 60% lower compute spend, 3× faster data processing, and 20% overall efficiency gains.

Why It Matters

Massive data should fuel innovation, not drain budgets. We remove the bottlenecks holding AI and analytics back—making data lighter, faster, and smarter so teams can ship breakthroughs, not babysit storage and compute bills.

Who We Are

  • World-renowned researchers in compression, information theory, and data systems
  • Elite engineers from Google, Pure Storage, Cohesity, and top cloud teams
  • Enterprise sellers who turn ROI into seven‑figure wins

Powered by World-Class Investors & Customers

$65M+ raised from NEA, Bain Capital, A Capital, and operators behind Okta, Eventbrite, Tesla, and Databricks. Our platform already processes hundreds of petabytes for industry leaders.

Our Mission

We’re building the default data substrate for AI, and a generational company built to endure.

Smarter Infrastructure for the AI Era

We make data efficient, safe, and ready for scale: think smarter, more foundational infrastructure for the AI era. Our technology integrates directly with modern data stacks like Snowflake, Databricks, and S3-based data lakes, enabling:

  • 60%+ reduction in storage costs and up to 60% lower compute spend
  • 3x faster data processing
  • 20% platform efficiency gains

Trusted by Industry Leaders

Enterprise leaders globally already rely on the company to cut costs, boost performance, and unlock more value from their existing data platforms.

A Deep Tech Approach to AI

We’re unlocking the layers beneath platforms like Snowflake and Databricks, making them faster, cheaper, and more AI-native. We combine advanced research with practical productization, powered by a dual-track strategy:

  • Research: Led by Chief Scientist Andrea Montanari (Stanford Professor), we publish 1–2 top-tier papers per quarter.
  • Product: Actively processing 100+ PB today and targeting exabyte scale by Q4 2025.

Backed by the Best

We’ve raised $60M+ from NEA, Bain Capital, A Capital, and operators behind Okta, Eventbrite, Tesla, and Databricks.

Our Mission

To convert entropy into intelligence, so every builder, human or AI, can make the impossible real.

We’re building the default data substrate for AI, and a generational company built to endure beyond any single product cycle.

WHAT YOU’LL DO

This is a deep systems role for someone who lives and breathes distributed infrastructure, understands how data moves at scale, and wants to build the next‑generation AI data platform from the ground up.

  • Own the ACID backbone. Design and harden transactional layers and metadata services so that petabyte‑scale tables can time‑travel in microseconds and schema evolution becomes a non-event.
  • Turn metadata into rocket fuel. Build compaction, caching, and pruning services that keep millions of file pointers within 50 ms from lookup to plan.
  • Squeeze more signal per byte. Optimize data layouts (column ordering, dictionary encoding and bit‑packing, bloom filters, and zone‑map indexes) to cut scan I/O by 10× on real‑world workloads.
  • Ship adaptive indexing with research. Co‑invent machine‑driven indexes that learn access patterns and automatically re‑partition nightly—no more manual “analyze table” ever again.
  • Scale the engine, not the babysitting. Write Spark, Flink, or batch pipelines that autoscale across S3, GCS, and ADLS; expose observability hooks; and survive chaos drills without triggering a pager storm.
  • Code for longevity. Write clean, test‑soaked Java, Scala, Go, or C++. Document key invariants so future teams extend the system—instead of rewriting it.
  • Measure success in human latency. If analysts see their dashboards refresh in blink‑level time, you’ve won. Publish your breakthrough and mentor the next engineer to raise the bar again.

WHAT WE’RE LOOKING FOR

You’ve built systems where performance, resilience, and clarity of design all matter. You thrive at the intersection of infrastructure engineering and applied research, and care deeply about both how something works and how well it works at scale.

Core Skills

  • Distributed Systems and Storage Fundamentals — consistency, replication, sharding, durability, transactions.
  • Columnar Storage Optimization — deep knowledge of Parquet or similar formats (column ordering, compression, zone maps).
  • Metadata and Indexing Systems — experience building metadata‑driven services, compaction, caching, and adaptive indexing.
  • Distributed Compute at Scale: production‑grade Spark/Flink or equivalent pipeline development across S3, GCS, or ADLS.
  • Programming for Scale and Longevity — strong coding in Java, Scala, Go, or C++, with clean testing and documentation practices.
  • Resilient Systems and Observability — you’ve built systems that survive chaos drills and expose the right metrics.

Desired Skills

  • Exposure to open table formats such as Apache Iceberg, Delta Lake, or Hudi.
  • Experience with catalog services, query planning, or compaction frameworks.
  • OSS contributions or published work in data infrastructure or distributed systems.

WHY JOIN US

If you’ve helped build the modern data stack at a large company (Databricks, Snowflake, Confluent, or similar), you already know how critical lakehouse infrastructure is to AI and analytics at scale. At the company, you’ll take that knowledge and apply it where it matters most: at the most fundamental layer in the data ecosystem.

  • Own the product, not just the feature. At the company, you won’t be optimizing edge cases or maintaining legacy systems. You’ll architect and build foundational components that define how enterprises manage and optimize data for AI.
  • Move faster, go deeper. No multi‑month review cycles or layers of abstraction—just high‑agency engineering work where great ideas ship weekly. You’ll work directly with the founding team, engage closely with design partners, and see your impact hit production fast.
  • Work on hard, meaningful problems. From transaction layer design in Delta and Iceberg, to petabyte‑scale compaction and schema evolution, to adaptive indexing and cost‑aware query planning—this is deep systems engineering at scale.
  • Join a team of expert builders. Our engineers have designed the core internals of cloud‑scale data systems, and we maintain a culture of peer‑driven learning, hands‑on prototyping, and technical storytelling.
  • Core Differentiation: We’re focused on unlocking a deeper layer of AI infrastructure. By optimizing the way data is stored, processed, and retrieved, we make platforms like Snowflake and Databricks faster, more cost‑efficient, and more AI‑native. Our work sits at the most fundamental layer of the AI stack, where raw data becomes usable intelligence.
  • Be part of something early, without the chaos. The company has already secured $65M+ from NEA, Bain Capital Ventures, A Capital, and legendary operators from Okta, Tesla, and Databricks.
  • Grow with the company. You’ll have the chance to grow into a technical leadership role, mentor future hires, and shape both the engineering culture and product direction as we scale.

COMPENSATION & BENEFITS

  • Competitive salary and meaningful equity
  • Unlimited PTO + quarterly recharge days
  • Premium health, vision, and dental
  • Team offsites, deep tech talks, and learning stipends
  • Help build the foundational infrastructure for the AI era

The company is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
