Site Reliability Engineer

Bits to Atoms • San Francisco, CA, United States
24 days ago
Job type
  • Full-time
Job description

The Site Reliability Engineer (SRE) will ensure the reliability, scalability, and performance of a hybrid (cloud + on-prem) platform. You’ll work at the intersection of infrastructure, AI/ML systems, and mission-critical physical operations. You’ll collaborate directly with engineering, AI, and operations teams to design resilient systems and bring cutting-edge AI models into production. This is a high-impact role: your work will directly shape how the world’s most advanced data centers operate.

Company Overview

Bits to Atoms has partnered with Fluix AI to fill its Site Reliability Engineer role. Fluix is building the AI operating system that plans, designs, and optimizes AI infrastructure. Based in Silicon Valley, Fluix specializes in AI-driven solutions for data centers and power providers, leveraging machine learning and AI technologies. Our mission is bold: to double America’s compute capacity without building new data centers.

Position Overview

The Site Reliability Engineer will ensure reliability, scalability, and performance across cloud and on-prem environments. You’ll work at the intersection of infrastructure, AI/ML systems, and mission-critical physical operations. You’ll collaborate with engineering, AI, and operations teams to design resilient systems and deploy AI models into production.

Who You’ll Work With

  • Chase Overcash – CTO

Responsibilities

  • Design, implement, and maintain scalable, fault-tolerant infrastructure across cloud and on-prem environments.
  • Build automation to streamline operations, reduce toil, and increase reliability.
  • Integrate ML/AI models into production systems and optimize their performance at scale.
  • Improve system resilience through monitoring, observability, and incident management.
  • Lead post-incident reviews, drive root-cause analysis, and implement preventative fixes.
  • Manage multi-environment cloud setups (dev, staging, prod) and optimize data center operations.
  • Ensure compliance and security across all infrastructure and applications.
  • Partner with engineering and data science teams to continuously improve deployment practices.

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
  • Proven experience as an SRE, DevOps engineer, or similar role in a SaaS or infrastructure-heavy environment.
  • Strong expertise with cloud platforms (AWS preferred; GCP/Azure also valuable).
  • Proficiency in Python or similar scripting/programming languages.
  • Hands-on experience with containerization and orchestration (Kubernetes).
  • Solid understanding of networking, security, and performance optimization.
  • Familiarity with ML/AI infrastructure and data center operations is a strong plus.
  • Experience with CI/CD pipelines and infrastructure-as-code (Terraform, Ansible, etc.).
  • Excellent problem-solving skills and the ability to thrive in a fast-paced startup environment.

Culture Fit

  • Obsessed with solving hard problems and willing to dig deep.
  • Hands-on, comfortable with both physical and software systems.
  • Value being on-site and with clients, and understand the impact of mission-critical work.
  • Embrace flexibility — supporting teammates during weekends, holidays, or emergencies when needed.
  • Over-communicate, collaborate openly, and take ownership.

Why Fluix?

  • Competitive salary and equity package.
  • Comprehensive health, dental, and vision insurance.
  • Opportunities to shape the future of AI infrastructure and data center technology.
  • A collaborative, fast-paced environment in the San Francisco Bay Area.