Site Reliability Engineer

CLS Group
Iselin, NJ, United States
$140K-$170K a year
Full-time

Job Purpose

The role is primarily responsible for ensuring that SRE methodologies are applied to the cloud hosted environment. In addition, the role will act as a central point of expertise for SRE automation across the Platform Operations team.

Essential Job Functions

  • Responsible for implementing SRE methodologies within the cloud hosted environment, including automation of toil and definition / implementation of SLOs and SLAs.
  • Establish SRE as a practice within the Cloud team and, working closely with Infrastructure Engineering, enhance observability and telemetry so that the cloud hosted services have appropriate service metrics and monitoring.
  • Build out GitOps practices for use in the cloud hosted environment using tooling such as Terraform and Ansible. Act as a liaison between Engineering and Cloud Operations to fully embed Infrastructure as Code for all new cloud hosted deployments.
  • Provide support escalation for Cloud and Automation related issues, ensuring that Production stability is always the primary requirement.
  • Ensure risks and stability issues in the cloud hosted environment are understood and addressed where possible through SRE best practice as part of any incident postmortems.

Minimum Education Required

  • Bachelor’s degree or equivalent
  • Industry standard IT certification desired, e.g. AWS / Microsoft / VMware / Red Hat Linux

Minimum job-related Experience Required

  • Must have strong technical operational support experience within an infrastructure services team, preferably supporting either on-premises compute or cloud hosted environments.
  • Strong understanding of automation technologies, ideally including Terraform and Ansible, and the ability to use these tools to drive Infrastructure as Code through GitOps methodologies.
  • Knowledge of at least one scripting language, preferably either Python or PowerShell.
  • Minimum of two years' experience applying SRE methodologies within a support team, and an understanding of the associated Service Level metrics.
  • Experience with APM tools (e.g. Grafana / Datadog / Dynatrace).
  • Experience working in a regulated financial services / banking organization.

Special Skills / Knowledge

  • Able to understand and use at least one cloud hosted service, either AWS or Azure.
  • Possesses a strong service-oriented mindset and can consistently deliver a high level of service to the business.
  • Able to make and influence decisions with peers, stakeholders and management.
  • Able to communicate effectively with both business and technical staff at all levels. This includes communicating complex technical issues to different levels of management.
  • Able to work proactively, own complex deliveries and provide regular updates to management and stakeholders.

Expected full-time salary range of $140,000 to $170,000 + variable compensation + 401(k) match + benefits.

  • Note: Disclosure of the expected salary range for this role, as required by the NY Pay Transparency Law.