Talent.com
No longer accepting applications
Site Reliability Engineer

Together AI, San Francisco, CA, United States
Posted 5 days ago
Contract type
  • Full-time
Job description

Overview

As a Site Reliability Engineer (SRE) at Together, you are responsible for keeping all user-facing services and production systems running smoothly. You are a blend of pragmatic operator and software engineer who applies sound engineering principles, operational discipline, and mature automation to our operating environments and codebase.

Qualifications

  • 5+ years of professional SRE or related experience
  • Bachelor's degree in Computer Science or a related field or equivalent work experience
  • Expert knowledge of Ansible (roles, playbooks), Terraform, and Kubernetes
  • Proficiency in programming / scripting languages
  • Direct experience in monitoring and observability practices
  • Advanced knowledge of cloud services
  • Ability to thrive in a collaborative environment involving different stakeholders and subject matter experts

Responsibilities

  • Be on an on-call (PagerDuty) rotation to respond to incidents that impact availability
  • Build and run our infrastructure with Ansible, Terraform, and Kubernetes to enable scaling to a massive number of concurrent users
  • Build monitoring systems to ensure the highest quality service for our customers
  • Design and implement operational processes (such as deployments and upgrades)
  • Debug production issues across all services and levels of the stack
  • Identify improvements for the product architecture from the reliability, performance and availability perspectives
  • Plan the growth of Together AI’s infrastructure

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy
