Platform Reliability Engineer

Hearst
Louisville, KY, United States
14 days ago
Job type
  • Full-time
  • Part-time
  • Permanent
Job description

Platform Reliability Engineers (PREs) at Homecare Homebase ensure that our most critical healthcare services remain reliable, resilient, and high-performing at scale. Blending software engineering with systems operations, PREs focus on automation, observability, incident response, and the continuous reduction of toil across complex distributed platforms.

This role calls for confident execution in high-stakes, high-visibility scenarios—particularly during major incidents—alongside proactive efforts to harden existing systems and improve service health over time. Ideal candidates are those who thrive in complex environments, take ownership of production reliability, and find purpose in creating systems that recover gracefully and support exceptional care delivery.

Platform Reliability Engineers work closely with HCHB’s Architects, Product & Development teams, System Administrators, Platform Engineers, DBAs, and Product Support in the execution of their responsibilities.

RESPONSIBILITIES

  • Deliver solutions that enhance the overall reliability of the platform and/or reduce toil.
  • Establish and implement modern observability patterns.
  • Monitor overall platform health and manage uptime and availability.
  • Operationalize services, including system testing, instrumentation, monitoring, capacity model development, training, and transition to operations teams.
  • Manage deployments of major releases.
  • Lead and coordinate resolution efforts during major incidents by serving as the incident commander.
  • Participate in an equitable 24×7 on-call rotation—serving as first responder for production alerts and escalation point for other teams.

MINIMUM QUALIFICATIONS

  • Bachelor’s degree in Computer Science, Systems Engineering, Math, or a related field required (equivalent experience considered).
  • 3+ years of experience in a 24×7 production enterprise-class environment as an SRE or in a comparable role.
  • 1+ years of Kubernetes administration/support in a production environment.
  • 1+ years of Azure or comparable cloud PaaS, IaaS, and resource administration/support in a production environment.
  • Demonstrated composure and effectiveness in situations requiring rapid analysis, clear prioritization, and decisive action, particularly in incidents with significant business or customer impact.
  • Excellent problem-solving and analytical skills, with attention to detail and the ability to drive issues to resolution.
  • Experience solving problems via automation using orchestration platforms such as Ansible, Azure Automation, and ServiceNow Flows.
  • Proficient with scripting languages (multiple preferred): Bash, PowerShell, Python, and JavaScript.
  • Proficient with data tier languages: T-SQL.
  • Proficient with the following monitoring solutions (multiple preferred): Splunk, Prometheus/Grafana, ThousandEyes, Application Insights, Azure Monitor, and Microsoft SCOM.
  • Proficient with modern SRE and observability concepts (e.g., OTel, service level management).

PREFERRED QUALIFICATIONS

  • Academic coursework in Algorithms, Data Structures, Distributed Systems, and Information Security.
  • 1+ years serving as incident commander for major incidents.
  • Proficient with networking and troubleshooting (e.g., addressing, routing, DNS, load balancing, mesh networking).
  • Ability to debug and optimize infrastructure-as-code pipelines using Ansible, Terraform, and Azure ARM.
  • Proficient with ITSM/ITIL practices such as service management, change management, incident management, and problem management, particularly in ServiceNow.
  • Experience designing large-scale distributed systems.
  • Experience designing and developing software oriented towards systems or network automation.
  • Proficient with administration, automation, and orchestration of large-scale Windows and Linux environments using configuration management solutions such as DSC and Ansible.
  • Experience operating in large SQL databases with complex business logic.
  • Experience using ML/AI technologies to accelerate your work.
  • Experience with healthcare industry HIPAA regulations (experience in similarly regulated industries, e.g., PCI or SOX, considered).
  • Experience working in an Agile and/or SAFe environment.

CERTIFICATION / TRAINING

  • Candidates with relevant certifications are preferred, including but not limited to the following:
  • ITIL Foundations
  • Configuration: RHCE-Ansible
  • Kubernetes: CKA, KCSP
  • Linux: RHCE, CompTIA Linux+, GCUX, LPI
  • Microsoft: Azure Administrator, Azure DevOps Engineer, MCSE

About Us

    Founded in 1999, Homecare Homebase, a subsidiary of Hearst Corporation, is a market leader in healthcare software development, providing mobile cloud-based solutions for clinical, operational, and financial improvement of home-based care throughout the United States. Our software enables real-time solutions for wireless information exchange and communication between the office and clinicians in the field.

    Our success is fueled by our talented teams that are driven by their passion to make a difference in patient care. Our employees work in a culture that is guided by our CARES values: Care, Act, Respect, Excel, and Smile (a positive attitude). If you want to work in a role where your skills have a direct influence on empowering patient care, Homecare Homebase is the next step in your career.

    What You Can Expect from Us

    At Homecare Homebase, we don't just help our clients succeed; we help our employees succeed. Competitive pay, robust benefits, and professional development opportunities are a few of the many reasons that Homecare Homebase is a great place to build your career.

    Our Team Members Also Enjoy

    Meaningful work. Our employees often tell us that their work gives them a sense of purpose because it makes a difference in the lives of clinicians and home-based care staff, as well as the patients they serve.

    Leaders who care. President Luke Rutledge has continued the mission to create a culture that cares – one that appreciates and looks after its people. As a result, being an employee of HCHB feels like being a member of the family.

    Flexibility. We value work-life balance because we know that happy employees create happy clients. That's why Homecare Homebase offers both full and part-time career opportunities to fit life's unique demands.

    A company that gives back. Every year, Homecare Homebase proudly supports numerous charitable fundraising initiatives that align with our mission of empowering exceptional care and helping others in need.

    Sound like a good fit? We’d love to hear from you.

    This position does not provide sponsorship. All applicants should either be US Citizens or Permanent Residents eligible to work in the US without immigration restrictions.

