Cloud Infrastructure / Site Reliability Engineer
As a Cloud Infrastructure / Site Reliability Engineer, you will operate at the intersection of development and operations. You will engage with and enhance all aspects of the cloud services lifecycle, from design through deployment, operation, and refinement. You will maintain these services by measuring and monitoring their availability, latency, and overall system health, and by building automation for efficient cloud operations management. You will play a crucial role in sustainably scaling systems through automation and in driving changes that improve reliability and velocity.
As part of your responsibilities, you will administer cloud-based environments that support our SaaS / IaaS offerings, which are implemented on a microservices, container-based architecture (Kubernetes). In addition, you will oversee a portfolio of customer-centric cloud services (SaaS / IaaS), ensuring their overall availability, performance, and security. You will work closely with NetApp and cloud service provider teams (including Azure) from NetApp sites in Research Triangle Park (RTP), NC; Vienna, VA; Waltham, MA; or Pittsburgh, PA.
Due to the critical nature of the services we support, this position involves participation in a rotation-based on-call schedule as part of our global team. This role offers the opportunity to work in a dynamic, global environment, ensuring the smooth operation of vital cloud services. To be successful in this role, you should be a motivated self-starter and self-learner, possess strong problem-solving skills, and embrace challenges.
Key Responsibilities
Job Requirements
8+ years of experience in scripting and infrastructure automation using tools such as PowerShell, Python, or Go.
Deep working knowledge of containers, Kubernetes, serverless computing implementation, and distributed systems design patterns.
Knowledge of DevOps / SRE development methodologies.
Proficiency in Linux / Unix and CoreOS.
Experience with cloud platforms such as AWS, Azure, or Google Cloud.
Ability to lead a scrum team, influence stakeholders to effectively maintain a product backlog, and manage sprints.
Must be a US Citizen or Green Card holder.
This position includes on-call rotations, and you may be asked to work odd hours.
Preference is given to candidates who possess an interim Secret clearance (or above) or have recently undergone a Criminal Justice Information Services (CJIS) background check to verify criminal history, employment history, and financial / credit history.
Education
A Bachelor of Science degree in Computer Science, a master's degree, or equivalent experience is required.
Senior Product Engineer • Waltham, MA, United States