Lead Site Reliability Engineer - Cloud Operations - US Citizen Required

Arizona Staffing • Phoenix, AZ, United States

1 day ago

Job type

Full-time

Job description

Principal Site Reliability Engineer (SRE)

Oracle Health (OHAI) is a leader in generative AI for healthcare, focusing on cutting-edge cloud services that streamline healthcare operations. Our EHR and Clinical AI Agent platforms help healthcare providers reduce manual tasks and improve patient care. We are expanding our OCI Cloud Operations team and are seeking a Principal Site Reliability Engineer (SRE) to ensure the reliability, performance, and scalability of these services in a production environment.

Role Overview

As a key member of Oracle Health's Cloud Operations team, you will be responsible for the operational health, reliability, and performance of our EHR and Clinical AI Agent services. You will ensure these customer-facing platforms meet the highest standards of scalability, availability, and security, while developing and executing strategies for continuous optimization. Your focus will include cloud service performance monitoring, incident management, and operational efficiency across Kubernetes-based environments. In this role, you will serve as a technical team lead, providing mentorship to Site Reliability Engineers (SREs).

Key Responsibilities

Service Ownership & Leadership : Own operational aspects of the EHR and Clinical AI Agent cloud services, ensuring their reliability and performance. Serve as a technical lead within the Site Reliability Engineering (SRE) team, promoting adherence to best practices in incident management, service design, operational excellence, and automation. Mentor and support team members, fostering professional growth, technical proficiency, and a culture of accountability and continuous improvement in service operations.

Operations Engineering : Oversee deployment and operations of cloud services across commercial and government data centers, ensuring compliance with corporate and regulatory standards. Monitor and optimize resource utilization, performance, and scalability to maintain high availability and reliability. Ensure security and compliance of all services in alignment with organizational and governmental requirements. Manage incidents proactively, identifying and resolving issues in real time to minimize downtime and maintain system stability. Lead critical incident response efforts, collaborating with development teams to implement corrective and preventive measures.

Service Design & Optimization : Design and implement zero-downtime deployment strategies for software and security updates. Work with cross-functional teams to enhance service stability and ensure predictable, efficient operation. Implement proactive measures for system failure analysis and rapid issue resolution.

Automation & Continuous Improvement : Lead automation initiatives to streamline operational tasks and reduce manual intervention. Drive improvements to monitoring and alerting frameworks using tools like Prometheus and Grafana. Implement Infrastructure as Code (IaC) practices using Terraform and Shepherd. Assist in the optimization of cloud-based services, ensuring smooth performance, scalability, and operational efficiency.

Qualifications

Experience : 8+ years in Site Reliability Engineering, DevOps, or Cloud Operations. Experienced in managing customer-facing, Kubernetes-based Cloud services.

Cloud & Container Technologies : Experience with OCI, Kubernetes, Docker, Prometheus, Grafana, and cloud-native solutions.

Scripting & Automation : Proficiency in Python, Perl, Shell Scripting, and tools like Terraform.

Incident Management : Strong troubleshooting skills for resolving complex issues in production systems.

Cloud Platforms : Experience with OCI, AWS, GCP, or Azure.

Version Control : Familiarity with Git.

Operating Systems : Extensive experience with Linux / Unix environments in a Cloud Production environment.

Security & Compliance : Knowledge of cloud security best practices, particularly in regulated industries like healthcare.

US citizenship is required, and candidates must be located on US soil. This position requires you to be eligible to receive a federal security clearance, which requires US citizenship.

Disclaimer

Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates.

Range and benefit information provided in this posting is specific to the stated locations only.

US Hiring Range

Hiring range in USD : $86,400 to $199,500 per annum. May be eligible for bonus and equity.

Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle's differing products, industries and lines of business. Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.

Oracle US offers a comprehensive benefits package which includes the following :

Medical, dental, and vision insurance, including expert medical opinion
Short term disability and long term disability
Life insurance and AD&D
Supplemental life insurance (Employee / Spouse / Child)
Health care and dependent care Flexible Spending Accounts
Pre-tax commuter and parking benefits
401(k) Savings and Investment Plan with company match
Paid time off : Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
11 paid holidays
Paid sick leave : 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
Paid parental leave
Adoption assistance
Employee Stock Purchase Plan
Financial planning and group legal
Voluntary benefits including auto, homeowner and pet insurance

The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.

Career Level - IC4
