Talent.com
Site Reliability Engineer (req-174)

Cathexis
Tysons, Virginia, United States
30+ days ago
Job type
  • Full-time
Job description

Team CATHEXIS elevates the government contracting experience through rapid response, deep skill, and thoughtful problem-solving and communication. Our core capabilities are our top-tier program and project management, data analytics, and audit services, the backbone of which is our integrated approach to operational excellence.

You worked hard to get to where you are. You strive to make every day better than the day before. So do we. Team CATHEXIS operates with an all-in mindset. We are working together to create a company that supports our shared values and individual goals. Our values are centered around Respect, Engagement, Customer Service, Integrity, Teamwork, and Excellence in everything we do for our employees, clients, partners, and communities. We believe success is best when we listen and lead with empathy; model high standards of ethics to provide a rewarding candidate experience; work hard, have fun, and appreciate the strengths we all bring to the team; and empower our employees to create innovative and trusted results.

We are looking for a dynamic Site Reliability Engineer (SRE) to join our team. The Site Reliability Engineer (SRE) will manage, monitor, and optimize our clusters on Kubernetes. Together, we’re accelerating our clients’ digital transformation through the building and deployment of data-driven, scalable AI solutions. The ideal candidate will have a deep understanding of Kubernetes, Cloud Infrastructure, and Infrastructure as Code (IaC) practices. You will be responsible for ensuring the reliability and scalability of our Kubernetes clusters and Cloud Infrastructure.

Responsibilities:

  • Monitor and Manage Kubernetes Clusters: Ensure the stability, health, and scalability of Kubernetes clusters, deploying applications and services on Kubernetes
  • Kubernetes Management: Deploy, monitor, and scale applications on Kubernetes clusters. Maintain Helm charts, manage services, and ensure resource allocation for optimal cluster performance
  • Cloud Infrastructure Management: Work with leading Cloud Platforms (AWS, GCP, Azure) to set up, configure, and manage infrastructure resources using Infrastructure as Code (Terraform, CloudFormation, etc.)
  • Monitoring & Incident Response: Set up monitoring solutions, define alerts, and manage the incident response process for any issues related to Jenkins or Kubernetes clusters
  • Automate Infrastructure Processes: Build automation tools for scaling, monitoring, and maintaining infrastructure using modern tools like Terraform, Ansible, or equivalent
  • Collaborate Across Teams: Work closely with development, services, and operations teams to ensure seamless integration between application development and infrastructure
  • Security & Compliance: Ensure all systems follow security best practices and comply with relevant regulations. This includes role-based access, encryption, and automated vulnerability scanning

Requirements:

  • Active Secret Clearance is required
  • Bachelor’s degree (or equivalent) in computer science or related discipline
  • A minimum of two (2) years of experience working with on-premise and off-premise cloud environments
  • Experience with AWS, Azure, and/or GCP
  • Ability to program (structured and OOP) using one or more high-level languages, such as Python, Java, C/C++, Ruby, and JavaScript
  • Experience with distributed storage technologies such as NFS, HDFS, Ceph, and Amazon S3, as well as dynamic resource management frameworks (Apache Mesos, Kubernetes, Yarn)
  • Proactive approach to identifying problems, performance bottlenecks, and areas for improvement
  • Agile/Scrum experience

CATHEXIS offers competitive compensation packages to all eligible employees. Our goal is to provide a compensation package that reflects the value you bring to our team, is competitive with market rates, and promotes your financial security and personal well-being. The annual salary range for this role is $136,000 - $170,000. Please note that the salary information provided is a general guideline. CATHEXIS considers various factors in its final offer, including location, qualifications, experience, and skills.

Benefits:

  • Performance Bonuses
  • Medical Insurance
  • Dental Insurance
  • Vision Insurance
  • 401(k) Plan (Traditional and ROTH)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off
  • 11 Federal Holidays
  • Parental Leave
  • Commuter Benefits
  • Short Term & Long Term Disability
  • Training & Development
  • Wellness Program
  • Community Outreach Initiatives

CATHEXIS is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by law. If you are an individual with a disability and would like to request a reasonable accommodation as part of the employment selection process, please contact Recruiting@cathexiscorp.com.
