Site Reliability Engineer

Signify Technology • Palo Alto, CA, United States

7 days ago

Job type

Full-time

Job description

Job title : Site Reliability Engineer

Job type : Full-time

Rate : Competitive, based on experience

Role Location : On-Site, Palo Alto

About the Role :

We are a technology startup advancing healthcare with a safety-focused AI platform that assists medical professionals by managing patient communications, including check-ins, reminders, and follow-up care. We are seeking a highly skilled Senior Site Reliability Engineer to join our team. In this role, your responsibilities will include designing and implementing infrastructure automation and continuous integration and delivery pipelines, as well as monitoring and scaling the infrastructure that powers our healthcare AI platform. You will work closely with software engineers, research scientists, and other cross-functional teams to develop and maintain reliable, scalable infrastructure that enables rapid iteration and deployment of our products.

Key Responsibilities :

Design and implement infrastructure automation and deployment pipelines using tools such as Terraform
Implement and maintain monitoring and logging systems to ensure the reliability and performance of our healthcare AI platform
Work closely with software engineers to design and deploy scalable, fault-tolerant, and secure production systems on cloud platforms such as AWS, GCP, or Azure
Develop and maintain security and compliance policies and procedures for our healthcare AI platform
Collaborate with cross-functional teams to troubleshoot and resolve complex issues related to infrastructure, deployment, and operations
Implement and maintain disaster recovery and business continuity plans
Develop and maintain documentation related to infrastructure, deployment, and operations
Mentor and provide technical guidance to junior engineers

Qualifications :

Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field

At least 5 years of professional experience as an SRE

Strong skills in building cloud infrastructure orchestration systems (Operators) using Python, ideally with some expertise in Go

Expertise in infrastructure automation and deployment tools such as Terraform or GitLab CI/CD

Experience with cloud platforms such as AWS, GCP, or Azure

Strong knowledge of containerization technologies such as Docker and Kubernetes

Experience with monitoring and logging tools such as ELK, Grafana, or Datadog

Familiarity with security and compliance best practices and tools such as HashiCorp Vault, AWS KMS, or Azure Key Vault

Strong problem-solving skills and ability to work independently and collaboratively in a team environment

Excellent communication and interpersonal skills

Preferred :

Experience implementing HIPAA and SOC2 compliance is a plus

Experience working in an HPC environment is a plus

Accessibility Statement :

Read and apply for this role in the way that works for you by using our Recite Me assistive technology tool. Click the circle at the bottom right side of the screen and select your preferences. We make an active choice to be inclusive towards everyone every day. Please let us know if you require any accessibility adjustments through the application or interview process.

Our Commitment to Diversity, Equity, and Inclusion :

Signify’s mission is to empower every person, regardless of their background or circumstances, with an equitable chance to achieve the careers they deserve. Building a diverse future, one placement at a time.
