No longer accepting applications

Research Scientist - Vision Data Infrastructure

Storm3, San Francisco, CA, US

2 days ago

Job type

Full-time

Job description

⚡ Research Scientists / Engineers (all levels)

🔍 Focus on Vision Data Infrastructure

🤖 Fundamental AI Research Institute

🌎 San Francisco Bay Area, USA

💸 $250,000 - $600,000 salary + annual bonus

Join one of the few research institutions worldwide with the resources to compete with top AI companies: tens of thousands of GPUs for state-of-the-art research in LLMs, multimodal, and agentic AI.

We are seeking AI talent with expertise in building scalable pipelines for vision data to support both image / video generative training and multimodal alignment. You’ll design high-performance pipelines for large-scale image and video datasets, enabling efficient pretraining, alignment, and simulation-based data generation.

Responsibilities:

Vision Data Sourcing & Curation

Collect and organize image and video data from open datasets and the web.
Handle data cleaning, filtering, deduplication, and metadata generation.
Ensure ethical and compliant data collection at scale.

Processing & Augmentation

Build high-throughput pipelines for vision data preprocessing (frame extraction, resolution normalization, format conversion, latent caching).

Implement GPU-accelerated augmentation and distributed data loading (WebDataset, TFRecords, Parquet).

Synthetic & Simulation-Based Data Generation

Use simulation tools (e.g., Unreal Engine 5, Isaac Sim, Unity) to generate high-quality synthetic vision data.

Create specialized datasets for VLM training, visual reasoning, and agent interaction.

Requirements:

Strong experience with data engineering, computer vision, or machine learning infrastructure.

Expertise in building and scaling ETL / data pipelines for large unstructured datasets.

Proficiency with Python, PyTorch, and distributed data frameworks (e.g., Ray, Spark, Dask).

Experience with WebDataset, TFRecords, Parquet, or similar high-throughput data formats.

Familiarity with GPU-accelerated preprocessing, NVIDIA DALI, or equivalent systems.

Understanding of image / video codecs, data compression, and cloud storage optimization.

Preferred Experience:

Prior work with simulation-based or synthetic data generation using Unreal Engine, Isaac Sim, or Unity.

Experience curating datasets for multimodal or vision-language model training.

Knowledge of data ethics, privacy, and compliance frameworks for large-scale AI datasets.

Experience contributing to open datasets or data-centric AI research.

Why apply:

Opportunity to join a fast-growing core team that is already driving AI breakthroughs

Highly competitive salary package

Work alongside ambitious and bright superstars from tech and academia

Medical, Dental and Vision Insurance

Relocation package available

📧 Interested in applying? Click the ‘Easy Apply’ button or email your resume to stefani.lukic@storm3.com
