Talent.com
No longer accepting applications
Site Reliability Engineer

Akkodis • San Jose, CA, United States
1 day ago
Job type
  • Full-time
Job description

Akkodis is seeking a Site Reliability Engineer for a contract position with a client in San Jose, CA (Remote). You will manage and optimize high-performance AI infrastructure using NVIDIA DGX systems and Cisco UCS platforms.

Rate Range: $60/hour to $64/hour. The rate may be negotiable based on experience, education, geographic location, and other factors.

Site Reliability Engineer job responsibilities include:

  • Manage and optimize NVIDIA DGX systems and Cisco UCS infrastructure to ensure high availability, scalability, and fault tolerance.
  • Automate operational tasks using tools like Python, Terraform, Ansible, and Go to improve system reliability and efficiency.
  • Implement and maintain CI/CD pipelines using GitLab, Jenkins, or GitHub Actions for seamless deployment and integration.
  • Deploy and manage enterprise-grade Kubernetes clusters, preferably Red Hat OpenShift or Google Anthos, for AI workloads.
  • Conduct capacity planning, performance analysis, and instrumentation to meet service quality and reliability targets.
  • Collaborate with cross-functional teams to deliver robust AI infrastructure solutions and support continuous improvement initiatives.

Required Qualifications:

  • Bachelor's or master's degree in Computer Science, Engineering, or a related technical field.
  • Minimum 5 years of experience in Site Reliability Engineering or DevOps roles, with a focus on AI infrastructure.
  • Hands-on experience with NVIDIA DGX systems (A100/H100/H200) and Cisco UCS-C885A platforms.
  • Proficiency in DevOps automation tools including Terraform, Ansible, Python, and CI/CD systems like GitLab or Jenkins.

    If you are interested in this role, please click APPLY NOW. For other opportunities available at Akkodis, or for any questions, feel free to contact me at Shashank.Tewari@akkodisgroup.com.

    Pay Details: $60.00 to $64.00 per hour

    Benefit offerings available for our associates include medical, dental, vision, life insurance, short-term disability, additional voluntary benefits, an Employee Assistance Program (EAP), commuter benefits, and a 401(k) plan. Our benefit offerings provide employees the flexibility to choose the type of coverage that meets their individual needs. In addition, our associates may be eligible for paid leave, including Paid Sick Leave or any other paid leave required by federal, state, or local law, as well as holiday pay where applicable.

    Equal Opportunity Employer / Veterans / Disabled

    Military connected talent encouraged to apply

    To read our Candidate Privacy Information Statement, which explains how we will use your information, please navigate to

    The Company will consider qualified applicants with arrest and conviction records in accordance with federal, state, and local laws and/or security clearance requirements, including, as applicable:

  • The California Fair Chance Act
  • Los Angeles City Fair Chance Ordinance
  • Los Angeles County Fair Chance Ordinance for Employers
  • San Francisco Fair Chance Ordinance

    Massachusetts Candidates Only: It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.
