Research Scientist - Vision Data Infrastructure


Storm3
San Francisco, CA, United States
6 days ago
Job type
  • Full-time
Job description

⚡ Research Scientists / Engineers (all levels)

🔍 Focus on Vision Data Infrastructure

🤖 Fundamental AI Research Institute

🌎 San Francisco Bay Area, USA

💸 $250,000 - $600,000 salary + annual bonus

Come join one of the few research institutions globally with the resources to compete with top AI companies: tens of thousands of GPUs to explore state-of-the-art research in LLMs, multimodal, and agentic AI.

We are seeking AI talent with expertise in building scalable pipelines for vision data to support both image/video generative training and multimodal alignment. You’ll design high-performance pipelines for large-scale image and video datasets, enabling efficient pretraining, alignment, and simulation-based data generation.

Responsibilities:

Vision Data Sourcing & Curation

  • Collect and organize image and video data from open datasets and the web.
  • Handle data cleaning, filtering, deduplication, and metadata generation.
  • Ensure ethical and compliant data collection at scale.

Processing & Augmentation

  • Build high-throughput pipelines for vision data preprocessing (frame extraction, resolution normalization, format conversion, latent caching).
  • Implement GPU-accelerated augmentation and distributed data loading (WebDataset, TFRecords, Parquet).

Synthetic & Simulation-Based Data Generation

  • Use simulation tools (e.g., Unreal Engine 5, Isaac Sim, Unity) to generate high-quality synthetic vision data.
  • Create specialized datasets for VLM training, visual reasoning, and agent interaction.

Requirements:

  • Strong experience with data engineering, computer vision, or machine learning infrastructure.
  • Expertise in building and scaling ETL/data pipelines for large unstructured datasets.
  • Proficiency with Python, PyTorch, and distributed data frameworks (e.g., Ray, Spark, Dask).
  • Experience with WebDataset, TFRecords, Parquet, or similar high-throughput data formats.
  • Familiarity with GPU-accelerated preprocessing, NVIDIA DALI, or equivalent systems.
  • Understanding of image/video codecs, data compression, and cloud storage optimization.

Preferred Experience:

  • Prior work with simulation-based or synthetic data generation using Unreal Engine, Isaac Sim, or Unity.
  • Experience curating datasets for multimodal or vision-language model training.
  • Knowledge of data ethics, privacy, and compliance frameworks for large-scale AI datasets.
  • Experience contributing to open datasets or data-centric AI research.

Why apply:

  • Opportunity to join a fast-growing core team that is already driving AI breakthroughs
  • Highly competitive salary package
  • Work alongside ambitious and bright superstars from tech and academia
  • Medical, Dental and Vision Insurance
  • Relocation package available

🌎 San Francisco Bay Area, USA

📧 Interested in applying? Please click the ‘Easy Apply’ button or email your resume to stefani.lukic@storm3.com
