Talent.com
Lead Site Reliability Engineer - Cloud Operations - US Citizen Required
Arizona Staffing • Phoenix, AZ, United States
1 day ago
Job type
  • Full-time
Job description

Principal Site Reliability Engineer (SRE)

Oracle Health (OHAI) is a leader in generative AI for healthcare, focusing on cutting-edge cloud services that streamline healthcare operations. Our EHR and Clinical AI Agent platforms help healthcare providers reduce manual tasks and improve patient care. We are expanding our OCI Cloud Operations team and are seeking a Principal Site Reliability Engineer (SRE) to ensure the reliability, performance, and scalability of these services in a production environment.

Role Overview

As a key member of Oracle Health's Cloud Operations team, you will be responsible for the operational health, reliability, and performance of our EHR and Clinical AI Agent services. You will ensure these customer-facing platforms meet the highest standards of scalability, availability, and security, while developing and executing strategies for continuous optimization. Your focus will include cloud service performance monitoring, incident management, and operational efficiency across Kubernetes-based environments. In this role, you will serve as a technical team lead, providing mentorship to Site Reliability Engineers (SREs).

Key Responsibilities

Service Ownership & Leadership : Own operational aspects of the EHR and Clinical AI Agent cloud services, ensuring their reliability and performance. Serve as a technical lead within the Site Reliability Engineering (SRE) team, promoting adherence to best practices in incident management, service design, operational excellence, and automation. Mentor and support team members, fostering professional growth, technical proficiency, and a culture of accountability and continuous improvement in service operations.

Operations Engineering : Oversee deployment and operations of cloud services across commercial and government data centers, ensuring compliance with corporate and regulatory standards. Monitor and optimize resource utilization, performance, and scalability to maintain high availability and reliability. Ensure security and compliance of all services in alignment with organizational and governmental requirements. Manage incidents proactively, identifying and resolving issues in real time to minimize downtime and maintain system stability. Lead critical incident response efforts, collaborating with development teams to implement corrective and preventive measures.

Service Design & Optimization : Design and implement zero-downtime deployment strategies for software and security updates. Work with cross-functional teams to enhance service stability and ensure predictable, efficient operation. Implement proactive measures for system failure analysis and rapid issue resolution.

Automation & Continuous Improvement : Lead automation initiatives to streamline operational tasks and reduce manual intervention. Drive improvements to monitoring and alerting frameworks using tools like Prometheus and Grafana. Implement Infrastructure as Code (IaC) practices using Terraform and Shepherd. Assist in the optimization of cloud-based services, ensuring smooth performance, scalability, and operational efficiency.

Qualifications

Experience : 8+ years in Site Reliability Engineering, DevOps, or Cloud Operations. Experienced in managing customer-facing, Kubernetes-based Cloud services.

Cloud & Container Technologies : Experience with OCI, Kubernetes, Docker, Prometheus, Grafana, and cloud-native solutions.

Scripting & Automation : Proficiency in Python, Perl, Shell Scripting, and tools like Terraform.

Incident Management : Strong troubleshooting skills for resolving complex issues in production systems.

Cloud Platforms : Experience with OCI, AWS, GCP, or Azure.

Version Control : Familiarity with Git.

Operating Systems : Extensive experience with Linux / Unix environments in a Cloud Production environment.

Security & Compliance : Knowledge of cloud security best practices, particularly in regulated industries like healthcare.

You must be a US citizen located on US soil. This position requires eligibility for a federal security clearance, which requires US citizenship.

Disclaimer

Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates.

Range and benefit information provided in this posting is specific to the stated locations only.

US Hiring Range

Hiring Range in USD from : $86,400 to $199,500 per annum. May be eligible for bonus and equity.

Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle's differing products, industries and lines of business. Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.

Oracle US offers a comprehensive benefits package which includes the following :

  • Medical, dental, and vision insurance, including expert medical opinion
  • Short term disability and long term disability
  • Life insurance and AD&D
  • Supplemental life insurance (Employee / Spouse / Child)
  • Health care and dependent care Flexible Spending Accounts
  • Pre-tax commuter and parking benefits
  • 401(k) Savings and Investment Plan with company match
  • Paid time off : Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
  • 11 paid holidays
  • Paid sick leave : 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
  • Paid parental leave
  • Adoption assistance
  • Employee Stock Purchase Plan
  • Financial planning and group legal
  • Voluntary benefits including auto, homeowner and pet insurance

The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.

Career Level - IC4
