Senior HPC Cluster Systems Administrator

Lawrence Berkeley National Laboratory, Berkeley, CA, United States
8 days ago
Job type
  • Full-time
Job description

Berkeley Lab's (LBNL) Information Technology Division (IT) has an opening for a Senior HPC Cluster Systems Administrator to join their ScienceIT Team!

In this exciting role, you will support the Berkeley Lab research community by building, integrating, and maintaining Linux-based resources, high-performance computing cluster systems, and Kubernetes clusters. This role provides extensive expertise in High Performance Computing infrastructure and delivers advanced Linux solutions to further scientific endeavors at Berkeley Lab. The mission of Scientific Computing under ScienceIT is to facilitate groundbreaking fundamental research globally by providing essential computing tools, networks, and expertise to enable pioneering science.

This position has an anticipated start date of January 5, 2026.

We're here for the same mission, to bring science solutions to the world. Join our team and YOU will play a supporting role in our goal to address global challenges! Have a high level of impact and work for an organization associated with 17 Nobel Prizes!

Why join Berkeley Lab?

We invest in our employees by offering a total rewards package you can count on:

  • Exceptional health and retirement benefits, including pension or 401K-style plans
  • Opportunities to grow in your career - check out our Tuition Assistance Program
  • A culture where you'll belong - we are invested in our teams!
  • In addition to accruing vacation and sick time, we also have an annual Winter Holiday Shutdown
  • Parental bonding leave (for both mothers and fathers)
  • Pet insurance

What You Will Do:

  • Perform Linux system and HPC cluster maintenance and installations, operating system upgrades, system security hardening and intrusion detection, storage and file system management, system hardware, customization of user group working environment, troubleshooting, network monitoring, and crash recovery.
  • Design, deploy, and manage scalable applications using Kubernetes, ensuring the availability, performance, and readiness of the Kubernetes infrastructure.
  • Automate deployment, scaling, and management of containerized applications, and collaborate with DevOps and development teams to streamline CI/CD pipelines.
  • Design, deploy, and manage the global storage platform to ensure high performance, massive scalability, reliability, and future-proof solutions.
  • Support storage technologies such as Lustre and VAST, as well as networks.
  • Resolve I/O issues related to business applications, including diagnosing and resolving complex storage, Linux, and networking challenges in a fast-paced environment.
  • Research new storage management technologies and techniques, and provide recommendations.
  • Participate in developing system administration, security, and network policies, documentation, and tools oriented towards efficient systems management.
  • Participate in cluster support to staff and researchers, including initial installation, integration, and ongoing maintenance of Linux High-Performance Computing cluster systems. This includes travel to remote sites as needed.
  • Co-lead technical efforts with other senior system administrators in areas of HPC technologies such as job schedulers, high-performance interconnects, parallel file systems, cybersecurity, cluster management, container orchestration, VM infrastructure, networking, performance tuning, or data center planning.
  • Co-lead group projects of small to medium size and complexity to implement and deploy new computing technologies and associated services to the research community.

What We Are Looking For:

  • A Bachelor's Degree (or equivalent knowledge/training) in Computer Science, Engineering, or a related discipline, and a minimum of 12 years of relevant experience in Linux system administration within a large distributed computing environment, including experience providing systems and end-user support for multiple scientific or computational research groups, or an equivalent combination of education and experience.
  • Demonstrated ability to manage large-scale, performance-critical environments, including capacity planning, scaling, and optimization.
  • Significant experience deploying, scaling, and managing Kubernetes clusters, with a strong understanding of its architecture (pods, deployments, services, ingress) and container orchestration. Proven proficiency with CI/CD tools like Jenkins or GitLab CI.
  • Proven experience with Red Hat derivatives (CentOS, Scientific Linux, Rocky Linux), Debian, Ubuntu, and large-scale system and configuration management tools (Kickstart, Ansible, Puppet, Chef, Warewulf). Expertise in supporting standard services (NFS, LDAP, SMB, MySQL, Apache/Nginx HTTPD).
  • Strong HPC expertise, including Linux, job schedulers, high-performance interconnects, parallel file systems, cybersecurity, container orchestration, cluster management, VM infrastructure, networking, performance tuning, scientific application support, and data center planning.
  • Proficiency in Python and Bash for building, optimizing, and debugging scientific codes (C, C++, Fortran, Java), including experience with compilers (GCC, Intel), debuggers, Makefiles, and version control (Git, Subversion).
  • Expertise in storage system design and optimization (Lustre, S3, VAST, Weka, Ceph, DDN), including a deep understanding of the storage stack (kernel to user space, including file systems, block storage, I/O schedulers, VFS), storage benchmarking, and performance tuning (throughput, latency, IOPS, workload-specific optimizations).
  • Excellent oral and written communication skills, including experience organizing and presenting customer-focused technical data, reports, and projects to audiences with varying degrees of technical expertise.
  • Strong interpersonal skills including experience with research facilitation and project management in a multidisciplinary team environment.

Desired Qualifications:

  • An Advanced Degree (or equivalent knowledge/training) in Computer Science, Engineering, or a related discipline.
  • Experience with software engineering and/or software development.
  • Familiarity with Kubernetes-related tools like Helm, Istio, and Prometheus.
  • Demonstrated experience supporting research at a National Lab and/or in an academic or research environment.

Additional Information:

  • Application Deadline: For full consideration, please apply with a resume and a cover letter describing your interest by November 30, 2025.
  • Appointment type: This is a full-time, career appointment, exempt (monthly paid) from overtime pay.
  • Salary Information: This position is expected to pay $178,644 - $218,364 annually, which fits within the full salary range of $158,808 - $267,996 annually for job code C70.4. It is not typical for an individual to be offered a salary at or near the top of the range for a position. Salary for this position will be commensurate with the final candidate's qualifications and experience, including skills, knowledge, relevant education, and certifications, and aligned with the internal peer group.
  • Background Check: This position may be subject to a background check. Any convictions will be evaluated to determine if they directly relate to the responsibilities and requirements of the position. Having a conviction history will not automatically disqualify an applicant from being considered for employment.
  • Work Modality: This position is eligible for a hybrid work schedule - a combination of teleworking and performing work on site at Lawrence Berkeley National Lab, 1 Cyclotron Road, Berkeley, CA 94720. Work schedules are dependent on business needs. Individuals working a hybrid schedule must reside within 150 miles of Berkeley Lab. Starting May 7, a REAL ID or other acceptable form of identification is required to access Berkeley Lab sites.
  • Relocation: This position is eligible for relocation assistance.
  • Work Authorization: Applicants must be legally authorized to work in the United States. Berkeley Lab does not provide visa sponsorship for this position.
  • Want to learn more about working at Berkeley Lab? Please visit: careers.lbl.gov

    Equal Employment Opportunity Employer: The foundation of Berkeley Lab is our Stewardship Values: Team Science, Service, Trust, Innovation, and Respect; and we strive to build community with these shared values and commitments. Berkeley Lab is an Equal Opportunity Employer. We heartily welcome applications from all who could contribute to the Lab's mission of leading scientific discovery, excellence, and professionalism. In support of our rich global community, all qualified applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age, protected veteran status, or other protected categories under State and Federal law.

    Berkeley Lab is a University of California employer. It is the policy of the University of California to undertake affirmative action and anti-discrimination efforts, consistent with its obligations as a Federal and State contractor.

    Misconduct Disclosure Requirement: As a condition of employment, the finalist will be required to disclose if they are subject to any final administrative or judicial decisions within the last seven years determining that they committed any misconduct, are currently being investigated for misconduct, left a position during an investigation for alleged misconduct, or have filed an appeal with a previous employer.
