Talent.com
Site Reliability Engineer, Kubernetes Platform (Starshield)

SpaceX
Washington, DC, United States
30+ days ago
Job type
  • Full-time
  • Permanent
Job description

SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars.

SITE RELIABILITY ENGINEER, KUBERNETES PLATFORM (STARSHIELD)

At SpaceX we’re leveraging our experience in building rockets and spacecraft to deploy the Starshield constellation. Starshield is the world’s largest US government satellite constellation and is tasked with providing immediate access to critical intelligence and national security data for the US government anywhere on the globe. We design, build, test, and operate all parts of the system – receivers that allow users to connect within minutes, and the software that brings it all together. We’ve only begun to scratch the surface of Starshield's global impact and are looking for best-in-class engineers to help us further our ambitious goals.

As an engineer focused on Starshield's software and network infrastructure, you will design, operate, and scale the infrastructure we use to run the world’s largest government satellite constellation. These positions cover a variety of areas, including Site Reliability Engineering, Developer Operations, and our internal Kubernetes platforms. You will develop automation to deploy and manage on-premise compute resources, create highly scalable and maintainable software products, and collaborate directly with engineering teams across the company.

RESPONSIBILITIES :

  • Develop automation to deploy and manage on-premise Kubernetes clusters
  • Deploy and manage core infrastructure such as databases, monitoring and distributed storage
  • Closely collaborate with software engineers to create highly scalable, operable, and maintainable products
  • Engage in and improve the whole lifecycle of services from inception and design, through deployment, operation and refinement
  • Build monitoring and alerting for supporting systems to maintain high availability
  • Hands-on integration and troubleshooting across the entire Starshield stack
  • Identify areas for improvement and create innovative solutions that enable high system availability

BASIC QUALIFICATIONS :

  • Bachelor’s degree in computer science, information systems / IT, or an engineering discipline and 1+ years of professional experience in site reliability engineering or DevOps; OR 3+ years of professional experience in site reliability engineering or DevOps in lieu of a degree
  • 1+ years of professional experience with Linux operating systems
  • Experience with Terraform, Ansible, or other infrastructure tools
  • Experience with containerization technologies (e.g., OCI containers, Kubernetes)
  • Experience scripting in Bash, Python, or other similar languages
  • Development experience in Python, C++, or Go

PREFERRED SKILLS AND EXPERIENCE :

  • 1+ years of experience with Python and Python-based development frameworks
  • Experience managing Kubernetes clusters, not just using them
  • Knowledge of Linux boot process and systems configuration
  • Deep understanding of testing, continuous integration, build, deployment & continuous monitoring
  • Understanding of relevant build technologies, such as Bazel and Makefiles
  • Focus on performance bottlenecks and performance improvement techniques
  • Understanding of distributed databases and data modeling
  • Experience with automatically managing dozens, hundreds, or thousands of servers (e.g., with Terraform or Ansible)
  • Strong networking knowledge of TCP / IP
  • Excellent communication skills, with the ability to communicate with customers, peers, management, etc. in both formal and informal situations

ADDITIONAL REQUIREMENTS :

  • Note that an active clearance may provide the opportunity for you to work on sensitive SpaceX missions; if so, you will be subject to pre-employment drug and random drug and alcohol testing
  • Must be willing to work extended hours and weekends as needed

ITAR REQUIREMENTS :

  • To conform to U.S. Government export regulations, applicant must be a (i) U.S. citizen or national, (ii) U.S. lawful, permanent resident (aka green card holder), (iii) Refugee under 8 U.S.C. § 1157, or (iv) Asylee under 8 U.S.C. § 1158, or be eligible to obtain the required authorizations from the U.S. Department of State. Learn more about the ITAR here.

SpaceX is an Equal Opportunity Employer; employment with SpaceX is governed on the basis of merit, competence and qualifications and will not be influenced in any manner by race, color, religion, gender, national origin / ethnicity, veteran status, disability status, age, sexual orientation, gender identity, marital status, mental or physical disability or any other legally protected status.

Applicants wishing to view a copy of SpaceX’s Affirmative Action Plan for veterans and individuals with disabilities, or applicants requiring reasonable accommodation to the application / interview process should reach out to EEOCompliance@spacex.com.
