Talent.com
Datacenter Engineer- Site Lead
No longer accepting applications
Sustainable Talent • Santa Clara, CA, United States
Posted 12 days ago
Contract type
  • Full-time
Job description

Sustainable Talent is partnering with Nvidia, a global leader that has been transforming computer graphics, PC gaming, and accelerated computing for over 25 years. We are looking for a Datacenter Engineer to support our client's on-premises private cloud infrastructure team. This is a full-time W-2 contract based in Santa Clara, CA. We offer competitive pay of $65-$85/hr, based on factors such as experience, education, and location, and provide full benefits, PTO, and an amazing company culture!

In this role, you will provide and maintain a compute farm of systems, including Builders, Packagers, and Testers, that acts as a test bed for our developers worldwide to test various Nvidia hardware and software prior to release. The environment is huge, the scale massive, and the ask enormous! We need YOU to help US maintain and drive our world-class DCs/Labs to produce timely, deterministic results for our Engineers and expectant Users worldwide!

What You'll Do:

  • Collaborate closely with engineering teams (system architects, hardware/software engineers, QA, and more) to design, develop, debug, and release next-generation products.
  • Manage and maintain a high-performing Compute Farm of builders, packagers, testers, and core infrastructure.
  • Ensure availability targets are consistently met and lead system recovery efforts.
  • Deploy and qualify systems while supporting exciting new technology bring-ups.
  • Oversee inventory and lifecycle management for NVIDIA's assets across data centers and labs.
  • Gather critical metrics and create Standard Operating Procedure (SOP) documentation.
  • Maintain a world-class, safe, and well-organized environment in our data centers and labs.
  • Troubleshoot Linux/Windows, hardware, and infrastructure issues alongside engineers and platform operations teams.
  • Plan, deploy, and maintain on-premises private cloud infrastructure, collaborating with datacenter and network engineering teams.
  • Implement efficiency improvements to maximize availability, throughput, and test accuracy while meeting SLAs and KPIs.
  • Represent the team in meetings with internal stakeholders and contribute to global operations.
What We Need to See:
  • Associate's or Bachelor's Degree in Engineering/Technical Major (or equivalent experience).
  • 5+ years of experience in data centers or large engineering labs.
  • Familiarity with SCMs like Git/Perforce.
  • Proficiency in DCIM (Nautobot, etc.) and scripting (shell, Python, Ansible).
  • Working knowledge of protocols/services like TCP/IP, DNS, NFS, SSL, etc.
  • Experience with Windows, Linux, and Mac operating systems.
  • Hands-on experience with PCBs, GPUs, and system deployments.
  • Exceptional communication skills, both written and verbal.
  • Ability to explain technical concepts to non-technical audiences.
  • Strong problem-solving skills and a collaborative spirit.
What Makes You Stand Out:
  • Experience managing HPC clusters using tools like BCM and Slurm.
  • Hands-on knowledge of OpenStack.
  • Relevant certifications such as CCNA or equivalent.
  • Strong background in Windows and Linux administration, with an understanding of dense datacenter design, including compute, storage, and networking.
  • Experience with hypervisors and VM applications.
  • Knowledge of DC infrastructure with an emphasis on liquid cooling.
  • A track record of technical curiosity and innovation.
  • Mechanically inclined and comfortable with tools and physical tasks.
  • Energetic and enthusiastic, with an understanding of what it takes to get the team to the finish line.
  • Willing to go the extra mile to get the job done!
  • This is an onsite contract position and requires local travel to DCs within Santa Clara.


Sustainable Talent is an M/F+, disabled, and veteran equal employment opportunity and affirmative action employer.