Talent.com
Software Engineer, Data Infrastructure

DatologyAI
Redwood City, CA, United States
30+ days ago
Job type
  • Full-time
Job description

About the Company

Companies want to train their own large models on their own data. The current industry standard is to train on a random sample of your data, which is inefficient at best and actively harmful to model quality at worst. There is compelling research showing that smarter data selection can train better models faster. We know because we did much of this research. Given the high costs of training, this presents a huge market opportunity. We founded DatologyAI to translate this research into tools that enable enterprise customers to identify the right data on which to train, resulting in better models at lower cost. Our team has pioneered deep learning data research, built startups, and created tools for enterprise ML. For more details, check out our recent blog posts sharing our high-level results for text models and image-text models.

We've raised over $57M in funding from top investors like Radical Ventures, Amplify Partners, Felicis, Microsoft, Amazon, and notable angels like Jeff Dean, Geoff Hinton, Yann LeCun and Elad Gil. We're rapidly scaling our team and computing resources to revolutionize data curation across modalities.

This role is based in Redwood City, CA. We are in office 4 days a week.

About the Role

We're looking for an experienced Data Platform Engineer to join our core DatologyAI team. As one of our early senior hires, you will partner closely with our founders on the direction of our product and drive business-critical technical decisions. You will lead the development of our core product and data platform. These are key components of our stack that allow us to process customer data and apply state-of-the-art research to identify the most informative data points in large-scale datasets. You will have a broad impact on our technology, product, and company culture. We provide visa sponsorship for candidates selected for this role.

What You'll Work On

  • Design, build, and maintain highly scalable data processing solutions while ensuring reliability and security
  • Architect, build, and deploy the back-end systems and services that power our data curation platform
  • Partner with researchers and engineers to bring new features and research capabilities to our customers
  • Ensure that our systems are reliable, secure, and worthy of our customers' trust

About You

  • You have meaningful experience leading and building production data systems to deliver on major product initiatives.
  • You have built and managed highly scalable data processing solutions (e.g., Spark, Flink), data lakes or warehouses (e.g., Snowflake, Hive), and distributed storage systems (e.g., HDFS, S3); authored SQL queries; used workflow management tools (e.g., Airflow, Dagster); and maintained the infrastructure that supports these.
  • Proficiency in at least one programming language commonly used within Data Engineering, such as Python, Scala, or Java.
  • Expertise with an ETL scheduler such as Airflow, Dagster, or a similar framework.
  • Experience maintaining a high quality bar for design, correctness, and testing.
  • Take pride in building and operating scalable, reliable, secure systems
  • Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed
  • Own problems end-to-end, and are willing to pick up whatever knowledge you're missing to get the job done
  • You have experience being the technical lead of a Data Engineering / Platform / Infrastructure Team.
  • Experience building ML / DL systems and / or data infrastructure that feeds into training large ML models

Don't meet every single requirement? We still encourage you to apply. If you're excited about our mission and eager to learn, we want to hear from you!

Compensation

At DatologyAI, we are dedicated to rewarding talent with a highly competitive salary and significant equity. The base salary for this position ranges from $180,000 to $250,000. The candidate's starting pay will be determined based on job-related skills, experience, qualifications, and interview performance.

We offer a comprehensive benefits package to support our employees' well-being and professional growth:

  • 100% covered health benefits (medical, vision, and dental).
  • 401(k) plan with a generous 4% company match.
  • Unlimited PTO policy
  • Annual $2,000 wellness stipend.
  • Annual $1,000 learning and development stipend.
  • Daily lunches and snacks are provided in our office!
  • Relocation assistance for employees moving to the Bay Area.