Talent.com
Research Scientist - Data
Storm3 • Sonoma, CA, United States
10 hours ago
Job type
  • Full-time
Job description

⚡ Research Scientist - Data focus

💊 Foundation Models, AI Research Institute

🌎 San Francisco Bay Area, USA

💸 $200,000 - $350,000 salary + bonus

Come join a revolutionary AI research lab in the SF Bay Area that is poised to develop and publish high-impact breakthroughs in GenAI, across LLMs and multimodal AI.

As part of the team, you’ll work at the intersection of data, large-scale training, and foundation model innovation. You will collaborate with world-class researchers, data scientists, and engineers to solve critical challenges in creating robust, scalable, and reasoning-capable LLMs. Your research will shape the way data is curated, processed, and leveraged to train the next generation of intelligent systems.

Responsibilities:

  • Lead research on data-centric approaches for LLMs, including pretraining corpus design, data valuation, and speculative decoding strategies.
  • Develop pipelines to process challenging data sources into structured and reproducible training datasets.
  • Build and optimize agentic data pipelines, integrating retrieval, self-curation, and multi-agent feedback for high-quality training and evaluation data.
  • Collaborate with researchers on alignment and reasoning-focused training that leverages data-driven approaches to improve LLM capabilities.
  • Prototype and deploy evaluation frameworks to measure data quality, coverage, and downstream impact on LLM reasoning.
  • Publish findings at top-tier venues (e.g., NeurIPS, ICLR, ACL, EMNLP) and represent the institute at international conferences.
  • Contribute to open-source tools, datasets, and benchmarks that advance the global foundation model research community.

Requirements:

  • Master’s degree in Computer Science, Data Science, or a related technical field (PhD strongly preferred)
  • Experience collecting and curating high-quality text data, including multilingual data.
  • Hands-on experience with large-scale dataset curation and preprocessing for ML / LLM training.
  • Prior work synthesizing complex datasets; code, math, and agentic data are higher priority.
  • Experience with ML infrastructure for scalable training, evaluation, and debugging.
  • Experience at the intersection of data and post-training (RL / SFT)
  • Proven ability to independently drive research questions related to data quality, scaling, or reasoning.

Preferred Experience:

  • Experience with retrieval-augmented generation (RAG), agentic data pipelines, or reasoning benchmarks.
  • Contributions to speculative decoding, self-curation, or reinforcement learning from synthetic data.
  • Background in knowledge graphs, semantic search, or indexing systems.
  • Strong publication record in leading AI conferences.
  • Prior contributions to open-source ML data tools or benchmarks.
  • Prior work on speculative decoding / contributions to LLM serving engines
  • Prior work on training LLM-as-a-judge
  • Deep expertise with tokenization / training tokenizers

Why apply:

  • Opportunity to build out a new division at the forefront of AI innovation
  • FAANG-competitive salary and package
  • Work alongside superstars from FAANG labs & leading AI companies
  • Medical, Dental and Vision Insurance
  • Relocation package available

📧 Interested in applying? Please click the ‘Easy Apply’ button, or alternatively email me your resume at stefani.lukic@storm3.com
