Site Reliability Engineer
CLS Group
Iselin, NJ, United States
$140K-$170K a year
Full-time
Job Purpose
The role is primarily responsible for ensuring that SRE methodologies are applied to the cloud hosted environment. In addition, the role will act as a central point of expertise for SRE automation across the Platform Operations team.
Essential Job Functions
- Implement SRE methodologies within the cloud hosted environment, including automation of toil and the definition and implementation of SLOs and SLAs.
- Establish SRE as a practice within the Cloud team and, working closely with Infrastructure Engineering, enhance observability and telemetry so that cloud hosted services have appropriate service metrics and monitoring.
- Build out GitOps practices for the cloud hosted environment using tooling such as Terraform and Ansible. Act as a liaison between Engineering and Cloud Operations to fully embed Infrastructure as Code in all new cloud hosted deployments.
- Act as an escalation point for cloud and automation related issues, ensuring that production stability is always the primary requirement.
- Ensure risks and stability issues in the cloud hosted environment are understood and, where possible, addressed through SRE best practice as part of incident postmortems.
Minimum Education Required
- Bachelor’s degree or equivalent
- Industry standard IT certification desired, e.g. AWS / Microsoft / VMware / Red Hat Linux
Minimum job-related Experience Required
- Must have strong technical operational support experience within an infrastructure services team, preferably supporting either on-premises compute or cloud hosted environments.
- Strong understanding of automation technologies, ideally including Terraform and Ansible, and the ability to use these tools to drive Infrastructure as Code through GitOps methodologies.
- Knowledge of at least one scripting language, preferably either Python or PowerShell.
- Minimum of 2 years' experience applying SRE methodologies within a support team, and an understanding of the associated Service Level metrics.
- Experience with APM tools (e.g. Grafana / Datadog / Dynatrace).
- Experience of working in a regulated financial services / banking organization.
Special Skills / Knowledge
- Able to understand and use at least one cloud hosted service, either AWS or Azure.
- Possesses a strong service-orientated mindset and can consistently deliver a high level of service to the business.
- Able to make and influence decisions with peers, stakeholders and management.
- Able to communicate effectively with both business and technical staff at all levels. This includes communicating complex technical issues to different levels of management.
- Able to work proactively, own complex deliveries and provide regular updates to management and stakeholders.
Expected full-time salary range: $140,000 to $170,000 + variable compensation + 401(k) match + benefits.
- Note: Disclosure of the expected salary compensation range for this role, as required by the NY Pay Transparency Law.