We are seeking a highly skilled and experienced Software Systems Engineer who will be at the forefront of building our automation platform ecosystem – transforming the way we deliver IT infrastructure and services.
The successful candidate will be responsible for designing, implementing, and maintaining our automation and orchestration platforms, ensuring their optimal performance, scalability, and reliability in a dynamic and fast-paced environment.
The candidate will also be a member of a larger platform team and will assist with managing and troubleshooting infrastructure issues related to server OS, virtualization, and container orchestration platforms.
This role is ideal for someone who thrives on building systems from the ground up, enjoys solving complex operational challenges, and has a passion for enabling others through automation.
Key Responsibilities
- Design, deploy, and administer automation platforms, including but not limited to Terraform Enterprise, Ansible Automation Platform, Vault, and Packer.
- Collaborate with development, operations, security, and COE teams to ensure seamless integration and secure, consistent automation practices.
- Establish and develop operational standards, documentation, and lifecycle management processes.
- Integrate self-service, CMDB, platform security, secrets management, observability, and other solutions.
- Monitor system performance, troubleshoot issues, and optimize the platform for high availability and resilience.
- Implement and manage CI/CD pipelines and GitOps workflows using tools such as GitLab and Jenkins.
- Provide guidance and training to other engineers on automation platforms and related technologies and develop related documentation.
- Stay current with industry trends, emerging technologies, and best practices related to automation platforms, VMs, containerization, and cloud-native architectures.
- Provide supplemental VMware and Kubernetes/container support: troubleshooting issues, deployment and configuration, storage and performance monitoring, and performing security updates.
- Participate in a 24/7 on-call rotation and respond to issues with systems and technologies supported by the team.
Required Skills
- Proven expertise in automation platform deployment and administration (Terraform, Ansible, Packer, Vault, etc.).
- Strong understanding of platform automation architecture, components, and ecosystem, including hands-on experience.
- Automation pipeline development and CI/CD integration.
- Scripting and troubleshooting proficiency (Python, PowerShell, Bash, etc.).
- System reliability and observability (Prometheus, Grafana, etc.).
- Security and access management (SSO, RBAC, PKI).
- Strong problem-solving skills, with a proactive and collaborative approach to troubleshooting and issue resolution.
- Background in infrastructure lifecycle management and capacity planning.
- Solid foundation in infrastructure, including understanding of database, networking, DNS, load balancing, storage, and backup concepts and solutions.
- Excellent interpersonal, communication, organizational, and technical leadership skills.
Required Years of Experience
- Minimum of 5-8 years of experience in software systems engineering, with a focus on infrastructure engineering, DevOps, or platform operations.
- Minimum of 2 years of hands-on experience administering automation or IaC platforms (Terraform, Ansible, etc.).