About Us
Would you like to join one of the fastest-growing organizations, one whose goal is to use the latest AI, GenAI, LLM, Cloud, and Digital Technologies to advance drug development and improve patient care pathways? WriteMed.AI helps Biopharma and Life Sciences companies reduce the time it takes to write medical publications and regulatory paperwork.
Site Reliability Engineer
Location: Atlanta, GA; Miami, FL; Cambridge, MA; San Francisco, CA; Towson, MD
Role Overview
Our technical team supports our customers’ missions with a spirit of innovation across all technologies, including AI, GenAI, LLM, Compute, Storage, Database, Big Data, Application-level Services, Networking, Serverless, Deployment, Security, and more. This is an opportunity to partner with our principal AI Architects, Data Scientists, and Engineers to maintain a robust and secure technical foundation for our customers, ranging from small Biotech companies to large Pharmaceutical firms.
Qualifications
- Passionate about learning and evolving with current technological trends
- Engineering degree or related technical discipline, or equivalent work experience
- Experience coding in higher-level languages (e.g., Python, JavaScript, C++, or Java)
- Knowledge of Cloud-based applications & Containerization Technologies
- Understanding of metric generation, log aggregation, time-series databases, and distributed tracing
- Experience with industry-standard tools such as Terraform and Ansible
- Fundamentals in Network Design, Cloud architecture, Security, or Computer Science
- At least 5 years of hands-on experience in Engineering or Cloud
- Minimum 5 years of experience with cloud platforms (e.g., GCP, AWS, Azure)
- At least 3 years of experience in configuration and maintenance of applications or systems infrastructure for large-scale customer-facing companies
- Experience with distributed system design and architecture
Responsibilities
- Develop software solutions to support service delivery processes
- Build and manage CI/CD pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering, and auto-remediation
- Innovate relentlessly to ensure a flawless customer experience
- Engage in the lifecycle of services from conception to EOL, including system design
- Provide consulting and capacity planning
- Define and deploy standards related to System Architecture, Service Delivery, metrics, and operational automation
- Support services, product, and engineering teams with tooling and frameworks to improve availability and incident response
- Improve system performance and efficiency through automation and process refinement
- Collaborate with engineering teams to deliver reliable systems
- Increase operational efficiency and quality by treating operational challenges as software engineering problems
- Mentor junior team members and champion Site Reliability Engineering
- Participate in incident response, including on-call duties
- Partner with stakeholders to influence technical and business outcomes
Benefits
- Comprehensive benefits supporting your personal and professional growth, including wellness programs, tuition reimbursement, expense programs, student loan repayment, childcare, and pet insurance
- Inclusive culture with active employee resource groups and supportive leadership
- Salary range: $140,300 to $191,550, with variations based on skills, experience, and location
- Eligibility for short-term and long-term incentives as part of total compensation