Talent.com
Site Reliability Engineer, San Francisco

Together AI - San Francisco, CA, United States
4 days ago
Contract type
  • Full-time
Job description

Site Reliability Engineer

As a Site Reliability Engineer (SRE) at Together, you are responsible for keeping all user-facing services and production systems running smoothly. You are a blend of pragmatic operator and software engineer who applies sound engineering principles, operational discipline, and mature automation to our operating environments and codebase.

You specialize in systems (operating systems, storage subsystems, networking) and implement best practices for availability, reliability, and scalability, with varied interests in algorithms and distributed systems.

Requirements

  • 5+ years of professional SRE or related experience
  • Bachelor's degree in Computer Science or a related field or equivalent work experience
  • Expert knowledge of Ansible (roles, playbooks), Terraform, and Kubernetes
  • Proficiency in programming / scripting languages
  • Direct experience in monitoring and observability practices
  • Advanced knowledge of cloud services
  • Ability to thrive in a collaborative environment involving different stakeholders and subject matter experts

Responsibilities

  • Be on an on-call (PagerDuty) rotation to respond to incidents that impact availability
  • Build and run our infrastructure with Ansible, Terraform, and Kubernetes to enable scaling to a massive number of concurrent users
  • Build monitoring systems to ensure the highest quality service for our customers
  • Design and implement operational processes (such as deployments and upgrades)
  • Debug production issues across all services and levels of the stack
  • Identify improvements for the product architecture from the reliability, performance and availability perspectives
  • Plan the growth of Together AI's infrastructure

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000, plus equity and benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
