Talent.com
Site Reliability Engineer (Space Communications)
Northwood • Torrance, CA, United States
2 days ago
Job type
  • Permanent
Job description

Overview

Join Northwood as a Site Reliability Engineer (Space Communications) to help build and maintain observability infrastructure and ensure our global space communications network operates reliably as we scale ground stations around the world.

Responsibilities

  • Build and maintain the observability stack (Grafana, Prometheus, Loki, Vector, CloudWatch, VictoriaMetrics, and similar tools) for metrics and log ingestion across environments
  • Support and improve CI/CD pipelines using GitLab and ArgoCD, collaborating with development teams on deployment best practices
  • Help build and maintain cloud infrastructure using Terraform on AWS, contributing to the scalability and reliability of space communication systems
  • Work with senior engineers to establish monitoring strategies, alerting, and incident response procedures
  • Deploy and manage Kubernetes applications using Helm charts, focusing on reliability and developer experience
  • Collaborate with engineering teams to implement performance monitoring and troubleshooting across microservices
  • Support identity and access management integration with Okta and HashiCorp Vault
  • Assist in managing NixOS-based infrastructure for reproducible system configurations
  • Participate in incident response efforts and contribute to post-incident reviews and improvements

Basic Qualifications

  • 2-4 years of hands-on experience with infrastructure tools and monitoring systems in production environments
  • Experience with containerization (Docker, Kubernetes) and basic container orchestration
  • Familiarity with CI/CD tools (GitLab, Jenkins, or similar) and infrastructure as code concepts
  • Experience with cloud platforms (AWS preferred) and basic infrastructure automation
  • Programming skills in Python or a similar language and experience with configuration management
  • Startup mentality with the ability to work in fast-paced, high-growth environments and take on diverse responsibilities
  • Experience with logging and metrics collection for production systems
  • Understanding of system reliability principles and interest in learning SRE practices

Preferred Qualifications

  • Some exposure to observability tools like Vector, Loki, Grafana, Prometheus, or similar monitoring systems
  • Experience with Terraform or other infrastructure as code tools
  • Familiarity with NixOS or other declarative system configuration approaches
  • Basic knowledge of HashiCorp Vault, Okta, or similar identity/secrets management tools
  • Interest in distributed systems and troubleshooting complex technical issues
  • Previous startup experience or demonstrated ability to learn quickly and adapt
  • Linux system administration experience
  • AWS certification or demonstrated cloud platform knowledge

Additional Information

To conform to U.S. Government space technology export regulations, including the International Traffic in Arms Regulations (ITAR), you must be a U.S. citizen, lawful permanent resident of the U.S., protected individual as defined by 8 U.S.C. 1324b(a)(3), or eligible to obtain the required authorizations from the U.S. Department of State.

Northwood is an Equal Opportunity Employer; employment with Northwood is governed on the basis of merit, competence, and qualifications and will not be influenced in any manner by race, color, religion, gender, national origin/ethnicity, veteran status, disability status, age, sexual orientation, gender identity, marital status, mental or physical disability, or any other legally protected status.
