Staff Site Reliability Engineer
We're looking for an experienced Staff Site Reliability Engineer to join our Government Cloud team, reporting to the Director, Site Reliability Engineering. This is a hybrid role based in the San Jose, CA or Bellevue, WA office 3 days a week.
You will:
- Perform operational duties for FedRAMP cloud products, including deployments, on-call support, and incident management.
- Join deployment sync calls and conduct Operations hand-offs.
- Manage cloud infrastructure elements, including AWS GovCloud, private cloud, containers, and VMs.
- Operate and enhance monitoring systems, and drive automation, scripting, and Infrastructure as Code (IaC) efforts.
- Write and maintain documentation, resolve escalations, prevent incident recurrence, and promote DevOps best practices.
What We're Looking For (Minimum Qualifications):
- 5+ years of experience as a Site Reliability Engineer with expertise in Operations and Engineering.
- Experience with FedRAMP compliance (High/Moderate levels), vulnerability management, and continuous monitoring, including scanning, patching, and reporting.
- Proficiency in Linux administration, network troubleshooting, and infrastructure as code (Ansible, Terraform) in cloud environments.
- Experience in large-scale distributed systems, containerized architectures (AWS ECS, Kubernetes), and cloud services, with a strong foundation in web security, networking, and coding (Python).
- U.S. citizenship due to the nature of the customers assigned to this role.
What Will Make You Stand Out (Preferred Qualifications):
- Experience with containerized architectures such as AWS ECS and Kubernetes.
- Knowledge of web protocols and technologies, including HTTP, SSL/TLS, DNS, and SQL, as well as networking fundamentals.
- Coding experience in Python for automation and software integration.