Talent.com
AI & HPC Infrastructure Engineer

Accenture, Walnut Creek, CA, United States
22 hours ago
Job type
  • Full-time
Job description

We Are:

The Global Infrastructure Engineering AI & HPC team is at the center of enabling infrastructure reinvention for the next era of digital solutions powered by AI and High-Performance Computing (HPC). We bring together deep technical expertise across cloud, on-prem, and hybrid environments to design, build, and operate accelerated infrastructure that powers high-performance workloads at scale. Our solutions enable some of our most strategic and mission-critical clients to unlock new levels of performance, efficiency, and innovation. Our remit spans the full lifecycle, from strategy and architecture through implementation and operations, driving modernization across the entire infrastructure stack. We collaborate across the ecosystem to harness emerging technologies, fuel growth, and transform industries. In this rapidly growing market, our team is leading the way in shaping how enterprises leverage AI and HPC to drive breakthrough innovation and reimagine what's possible in infrastructure.

Key Responsibilities:

Design and implement HPC and AI infrastructure solutions, aligning system architecture and deployment roadmaps to industry-specific performance and scalability needs

Deploy, configure, and manage XPU-based clusters (CPU / GPU / accelerators) using schedulers, VM / K8s orchestration platforms, Slurm, and containerized platforms in scalable designs to provide Metal as a Service (MaaS), GPUaaS, AIaaS, and other offerings

Optimize cluster performance, scalability, energy, and cost efficiency across on-premises, cloud, and hybrid environments

Integrate AI and HPC platforms with existing IT systems, data pipelines, and security frameworks

Monitor, troubleshoot, and tune infrastructure to ensure high availability, low-latency networking, and workload resiliency

Develop and maintain documentation including architecture diagrams, configuration baselines, and operational runbooks

Provide technical guidance and support to users, enabling efficient execution of HPC / AI workloads, large-scale models, and simulations.

Travel may be required for this role. The amount of travel will vary from 25% to 100% depending on business need and client requirements.

Required Skills and Qualifications:

Minimum 4+ years of hands-on experience designing, deploying, and managing HPC and AI infrastructure across on-premises, cloud, and hybrid environments in 2 or more segments: hyperscaler, neocloud, large Enterprise, Telco / Mobile, supporting key industries such as Financial Services, Life Sciences, Manufacturing, and Retail

Minimum 4+ years of experience with accelerated computing architectures (GPUs, XPUs, DPUs), high-performance fabrics (InfiniBand, Ethernet), SONiC, networking, and modern storage / data platforms (e.g. NVMe-oF, Lustre, GPFS, BeeGFS, VAST, DDN, Weka) to build robust solutions

Minimum 4+ years of experience with cluster management and orchestration (e.g. Slurm, Run:ai, Kubernetes, Docker), real-time performance monitoring, and observability frameworks

Minimum 4+ years of experience with cloud and virtualization platforms (e.g. AWS, Azure, GCP, VMware, Nutanix) and expertise in automation and optimization using scripting (Python, AI tools) with foundational Infrastructure-as-Code tools such as Terraform and Ansible

Minimum 4+ years of experience implementing MLOps and DevSecOps frameworks to enable secure, automated, and reproducible workflows

Bachelor's degree or equivalent work experience (minimum 12 years). (If Associate's Degree, must have minimum 6 years of work experience.)

Preferred Skills and Qualifications:

Experience managing the deployment of 1,000+ GPU clusters for HPC and AI workloads with various infrastructure services enabled

Experience with GPU computing libraries and accelerators (e.g., NVIDIA CUDA, Dynamo, AMD ROCm).

Experience with AI and HPC Networking (e.g., RoCE, InfiniBand, multi-planar / multi-rail designs, platform buffer architectures)

Knowledge of Machine Learning and AI frameworks (e.g., TensorFlow, PyTorch, JAX), Jupyter notebooks / Google Colab environments

Experience with HPC & AI workload management and optimization techniques

Familiarity with DevOps practices and tools (e.g., Ansible, Terraform) for infrastructure automation

Industry certifications in NVIDIA infrastructure, public cloud providers, Data Science, etc. are a plus

Compensation at Accenture varies depending on a wide array of factors, which may include but are not limited to the specific office location, role, skill set, and level of experience. As required by local law, Accenture provides a reasonable range of compensation for roles that may be hired as set forth below. We accept applications on an ongoing basis and there is no fixed deadline to apply.

Information on benefits is here.

Role Location              Annual Salary Range
California                 $73,800 to $218,800
Cleveland                  $68,300 to $175,000
Colorado                   $73,800 to $189,000
District of Columbia       $78,500 to $201,300
Illinois                   $68,300 to $189,000
Maryland                   $73,800 to $189,000
Massachusetts              $73,800 to $201,300
Minnesota                  $73,800 to $189,000
New York / New Jersey      $68,300 to $218,800
Washington                 $78,500 to $201,300

Requesting an Accommodation

Accenture is committed to providing equal employment opportunities for persons with disabilities or religious observances, including reasonable accommodation when needed. If you are hired by Accenture and require accommodation to perform the essential functions of your role, you will be asked to participate in our reasonable accommodation process. Accommodations made to facilitate the recruiting process are not a guarantee of future or continued accommodations once hired.

If you would like to be considered for employment opportunities with Accenture and have accommodation needs such as for a disability or religious observance, please call us toll free at 1 (877) 889-9009, send us an email, or speak with your recruiter.

Equal Employment Opportunity Statement

We believe that no one should be discriminated against because of their differences. All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by federal, state, or local law. Our rich diversity makes us more innovative, more competitive, and more creative, which helps us better serve our clients and our communities.

For details, view a copy of the Accenture Equal Opportunity Statement.

Accenture is an EEO and Affirmative Action Employer of Veterans / Individuals with Disabilities.

Accenture is committed to providing veteran employment opportunities to our service men and women.

Other Employment Statements

Applicants for employment in the US must have work authorization that does not now or in the future require sponsorship of a visa for employment authorization in the United States.

Candidates who are currently employed by a client of Accenture or an affiliated Accenture business may not be eligible for consideration.

Job candidates will not be obligated to disclose sealed or expunged records of conviction or arrest as part of the hiring process. Further, at Accenture a criminal conviction history is not an absolute bar to employment.

The Company will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. Additionally, employees who have access to the compensation information of other employees or applicants as a part of their essential job functions cannot disclose the pay of other employees or applicants to individuals who do not otherwise have access to compensation information, unless the disclosure is (a) in response to a formal complaint or charge, (b) in furtherance of an investigation, proceeding, hearing, or action, including an investigation conducted by the employer, or (c) consistent with the Company's legal duty to furnish information.

California requires additional notifications for applicants and employees. If you are a California resident, live in or plan to work from Los Angeles County upon being hired for this position, please click here for additional important information.

Please read Accenture's Recruiting and Hiring Statement for more information on how we process your data during the Recruiting and Hiring process.
