Linux System Engineer

Phantom AI
Mountain View, CA, United States
Full-time

About Us

At Phantom AI, we've built a team of incredibly talented and ambitious people challenging the norm in the automotive industry.

We are building cost-effective L2 / L3 solutions to reduce the burden of everyday driving and make the roads safe for everyone.

For instance, we believe democratizing technologies such as Automatic Emergency Braking and Emergency Lane Support is the first priority before tackling a fully self-driving vehicle.

Our main customers are Tier 1 automotive manufacturers who are focused on delivering L2 / L3 solutions and in the future will deliver full autonomy.

We differentiate ourselves from other autonomous driving startups through a combination of state-of-the-art technological know-how and real automotive experience shipping ADAS systems at volume-production scale.

If you feel that you have the passion, commitment, and drive to challenge the status quo within the automotive industry, we would love to hear from you.

Key Responsibilities

  • Support the GPU-based AI / ML cluster infrastructure, focusing on systems automation, configuration management and deployment at scale
  • Improve our cluster health monitoring and auto-recovery pipeline
  • Work with users on debugging application performance issues
  • Work with hardware and storage vendors to tune and optimize our servers, TrueNAS storage and network
  • Automate and deploy GPU clusters with Ansible
  • Performance tuning and OS provisioning on Linux systems
  • Manage HPC clusters, workloads and applications
  • Be available for 24x7 on-call support

Qualifications

  • Bachelor's degree in computer science, electrical engineering or related field
  • Strong understanding of Linux fundamentals and performance optimizations (Ubuntu)
  • Advanced experience with SLURM configuration and management, including setting up systems from scratch
  • Demonstrable knowledge of TCP / IP, Linux operating system internals, filesystems, disk / storage technologies and storage protocols
  • Experience in collaborating with network and data center teams for large scale cluster builds
  • Experience with configuration management software, systems monitoring and alerting (Prometheus, Grafana, Telegraf, Splunk, etc.), and / or administering HPC workload managers (SLURM)

  • Experience with high-throughput low-latency networks, GPU-based computing systems, and / or high performance storage systems
  • Experience with SLURM and storage management of distributed parallel file systems is a plus
  • 3+ years of additional equivalent experience or evidence of exceptional ability related to the position

Benefits

  • This is a contract position
  • Office snacks & reimbursable meals* when in-office

Work Type

Remote or In-Office

Equal Opportunity for Diversity & Inclusion

Phantom AI provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
