Principal Staff Software Engineer, AI Training Platform

Collide Capital LLC • Mountain View, CA, United States
5 days ago
Job type
  • Full-time
Job description

Company Description

is the world's largest professional network, built to create economic opportunity for every member of the global workforce. Our products help people make powerful connections, discover exciting opportunities, build necessary skills, and gain valuable insights every day. We're also committed to providing transformational opportunities for our own employees by investing in their growth. We aspire to create a culture that's built on trust, care, inclusion, and fun, where everyone can succeed.

Job Description

This role will be based in Mountain View, CA.

At , we trust each other to do our best work where it works best for us and our teams. This role offers hybrid work options, meaning you can work from home and commute to an office, depending on what's best for you and when your team needs to be together.

As part of 's AI Platform group, the AI Training team is responsible for developing and maintaining highly available and scalable deep learning training solutions to power our rapidly growing AI use cases. The team is responsible for scaling 's AI model training with hundreds of billions of parameters for all AI use cases, from recommendation models and large language models (Generative AI) to computer vision models. We optimize training performance across algorithms, AI frameworks, infrastructure software, and hardware to harness the power of our GPU fleet with thousands of the latest GPU cards. The team also works closely with the open source community and has many open source committers (TensorFlow, Horovod, Ray, Hadoop, etc.) on the team. Additionally, this team focuses on technologies like LLMs, GNNs, incremental learning, online learning, and advanced LLM agents for training infrastructure.

As a Principal Staff Software Engineer on the AI Training Infra team, you will play a crucial role in leading and building the next-gen training infrastructure to power AI use cases. You will design and implement high-performance AI training pipelines and data I/O, work with open source teams to identify and resolve issues in popular libraries like Hugging Face, Horovod, and PyTorch, debug and optimize deep learning training, and provide advanced support for internal AI teams in areas like model parallelism, data parallelism, ZeRO, automatic mixed precision, and kernel fusion. Finally, you will assist in and guide the development of containerized pipeline orchestration infrastructure, including developing and distributing stable base container images, providing advanced profiling and observability, and updating internally maintained versions of deep learning frameworks and their companion libraries like TensorFlow, PyTorch, DeepSpeed, GNNs, Flash Attention, and more.

Responsibilities

  • Owning the technical strategy for broad or complex requirements with insightful and forward-looking approaches that go beyond the direct team and solve large open-ended problems.
  • Designing, implementing, and optimizing the performance of large-scale distributed training for personalized recommendation as well as large language models.
  • Improving the observability and understandability of various systems with a focus on improving developer productivity and system sustenance.
  • Mentoring other engineers, defining our challenging technical culture, and helping to build a fast-growing team.
  • Working closely with the open-source community to participate in and influence cutting-edge open-source projects (e.g., PyTorch, GNNs, DeepSpeed, Hugging Face, etc.).
  • Functioning as the tech-lead for several concurrent key initiatives for the Training Infrastructure and defining the future of AI training platforms.

Qualifications

Basic Qualifications

  • BS / BA in Computer Science or related technical field or equivalent technical experience
  • 7+ years of industry experience in software design, development, and algorithm related solutions
  • 7+ years of experience programming in object-oriented languages such as Python, C++, Java, Go, Rust, Scala
  • 5+ years of experience as an architect, or technical leadership position
  • 5+ years of experience in the industry with leading / building deep learning systems
  • Hands-on experience developing distributed systems or other large-scale systems

Preferred Qualifications

  • MS or PhD in Computer Science or related technical discipline.
  • 12+ years of experience in software design, development, and algorithm related solutions with at least 5 years of experience in a technical leadership position
  • 12+ years of experience in an object-oriented programming language such as Python, C++, Java, Go, Rust, Scala
  • 5+ years of experience with large-scale distributed systems and client-server architectures
  • Co-author or maintainer of any open-source projects
  • Expertise in machine learning infrastructure, including technologies like MLFlow, Kubeflow and large scale distributed systems
  • Familiarity with containers and container orchestration systems
  • Expertise in deep learning frameworks and tensor libraries like PyTorch, TensorFlow, JAX/Flax

Suggested Skills

  • ML Algorithm Development
  • Machine Learning / Deep Learning
  • Big Data
  • Stakeholder Management

    is committed to fair and equitable compensation practices. The pay range for this role is $207,000 to $340,000. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to skill set, depth of experience, certifications, and specific work location. This may be different in other locations due to differences in the cost of labor. The total compensation package for this position may also include annual performance bonus, stock, benefits, and/or other applicable incentive compensation plans. For more information, visit

    Additional Information

    Equal Opportunity Statement

    We seek candidates with a wide range of perspectives and backgrounds and we are proud to be an equal opportunity employer. considers qualified applicants without regard to race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other legally protected class.

    is committed to offering an inclusive and accessible experience for all job seekers, including individuals with disabilities. Our goal is to foster an inclusive and accessible workplace where everyone has the opportunity to be successful.

    If you need a reasonable accommodation to search for a job opening, apply for a position, or participate in the interview process, connect with us at accommodations@.com and describe the specific accommodation requested for a disability-related limitation.

    Reasonable accommodations are modifications or adjustments to the application or hiring process that would enable you to fully participate in that process. Examples of reasonable accommodations include but are not limited to:

  • Documents in alternate formats or read aloud to you
  • Having interviews in an accessible location
  • Being accompanied by a service dog
  • Having a sign language interpreter present for the interview

    A request for an accommodation will be responded to within three business days. However, non-disability related requests, such as following up on an application, will not receive a response.

    will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. However, employees who have access to the compensation information of other employees or applicants as a part of their essential job functions cannot disclose the pay of other employees or applicants to individuals who do not otherwise have access to compensation information, unless the disclosure is (a) in response to a formal complaint or charge, (b) in furtherance of an investigation, proceeding, hearing, or action, including an investigation conducted by , or (c) consistent with 's legal duty to furnish information.

    San Francisco Fair Chance Ordinance

    Pursuant to the San Francisco Fair Chance Ordinance, will consider for employment qualified applicants with arrest and conviction records.

    Pay Transparency Policy Statement

    As a federal contractor, follows the Pay Transparency and non-discrimination provisions described at this link:

    Global Data Privacy Notice for Job Candidates

    Please follow this link to access the document that provides transparency around the way in which handles personal data of employees and job applicants:
