Sr. System Engineer / Rack Solution (27692)

Supermicro • San Jose, CA, United States
3 days ago
Job type
  • Full-time
Job description

Job Req ID: 27692

About Supermicro:

Supermicro is a Top Tier provider of advanced server, storage, and networking solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop / Big Data, Hyperscale, HPC and IoT / Embedded customers worldwide. We are the #5 fastest growing company among the Silicon Valley Top 50 technology firms. Our unprecedented global expansion has provided us with the opportunity to offer a large number of new positions to the technology community. We seek talented, passionate, and committed engineers, technologists, and business leaders to join us.

Job Summary:

As a Sr. System Engineer, you'll be the go-to person to roll out and maintain business-critical applications and services for Supermicro. You are also responsible for resolving escalated service issues, coaching other engineers toward resolutions, and engineering and implementing complex projects. The ideal candidate works independently, has the leadership to drive technical development, and has excellent communication skills.

Essential Duties and Responsibilities:

Includes the following essential duties and responsibilities (other duties may also be assigned):

  • Execute comprehensive system-level rack tests on the latest NVIDIA and AMD GPUs and on ARM-based, Intel Xeon, and AMD EPYC processors, encompassing functionality, compatibility, performance, stress, and reliability testing, leveraging proprietary in-house tools.
  • Establish expertise in HPC / AI applications and benchmarks, delivering impactful training sessions to customers and partners, while addressing complex customer support issues, demonstrating innovative problem-solving skills and building robust processes and procedures for HPC / AI solutions.
  • Conduct proof of concept design and testing, providing optimized benchmarks for HPC / AI applications in a timely manner. Fine-tune BIOS settings, optimize OS / network configurations, and develop diverse simulation configurations to enhance efficiency across various workloads.
  • Deliver on-site deployment services, ensuring customer acceptance verification and providing post-deployment Level 1 and 2 support. Create and maintain technical documentation, including technical notes, blogs, and diagrams, to facilitate knowledge dissemination.
  • Identify and document hardware and software quality issues and collaborate with Product Management and other Engineering teams to integrate customer feedback into future product enhancements.
  • Proactively engage in HPC roadmap development, planning software and hardware upgrades to sustain exceptional HPC infrastructure performance.
  • Document and analyze test plans, reports, and logs, and actively contribute to the development of test utilities and automation scripts to streamline testing processes.

Qualifications:

  • BS / MS in Electrical Engineering, Computer Engineering or Computer Science
  • 8+ years of work-related experience in Deep Learning and Machine Learning
  • 8+ years of Linux / networking debugging / testing or relevant experience preferred
  • Experience with leading AI / ML frameworks such as PyTorch, TensorFlow, ONNX, etc.
  • Experience with DevOps or in cloud environments, including but not limited to Docker / Containers and Kubernetes
  • Hands-on experience with workload managers / schedulers (e.g., Slurm) for racks / clusters
  • Familiarity with MLPerf Training / Inference benchmarks, LLMs, HPL-AI, or RCCL / NCCL
  • Programming experience with Windows and Linux shell scripting
  • Strong sense of teamwork and strong communication skills
  • Familiarity with Intel / AMD / NVIDIA development toolkits such as CUDA, oneAPI, and ROCm is a plus
  • Experience with server / network hardware debugging and troubleshooting is a plus
  • CCNA, OpenStack, OpenShift, Azure, or AWS experience is a plus
  • Please note that this position requires regular in-office attendance. The successful candidate is expected to be present in the office during standard working hours as determined by the company. In-office collaboration and participation in team meetings, training sessions, and other on-site activities are essential aspects of this role. Candidates should consider the commuting distance and be prepared to fulfill their responsibilities in the designated office location.

    Salary Range

    $137,000 - $156,000

    The salary offered will depend on several factors, including your location, level, education, training, specific skills, years of experience, and comparison to other employees already in this role. In addition to a comprehensive benefits package, candidates may be eligible for other forms of compensation, such as participation in bonus and equity award programs.

    EEO Statement

    Supermicro is an Equal Opportunity Employer and embraces diversity in our employee population. It is the policy of Supermicro to provide equal opportunity to all qualified applicants and employees without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, protected veteran status or special disabled veteran, marital status, pregnancy, genetic information, or any other legally protected status.
