About the Role
We are seeking a skilled DevOps Engineer with strong Site Reliability Engineering (SRE) capabilities to design, build, and maintain scalable infrastructure, optimize CI/CD pipelines, and ensure reliability across critical systems. This role requires both hands-on technical expertise and strong problem-solving, collaboration, and communication skills.
Responsibilities
- Design, implement, and manage CI/CD pipelines.
- Develop and maintain infrastructure on Azure DevOps, AWS (EC2, S3, Lambda, RDS, IAM), and Kubernetes.
- Automate system management using Python, PowerShell, and Ansible.
- Manage containerized environments (Docker) and optimize cluster operations.
- Implement application performance monitoring using AppDynamics, Grafana, Zabbix, Datadog, or Dynatrace.
- Configure and monitor logging and observability tools (ELK, Splunk, Prometheus, CloudWatch).
- Ensure secure software delivery via SonarQube and JFrog Artifactory.
- Collaborate with developers to review code, troubleshoot performance issues, and enforce best practices.
- Proactively identify bottlenecks, scalability issues, and reliability risks.
- Document systems, processes, and post-mortem learnings.
Required Skills
- Infrastructure & Cloud: Azure DevOps, AWS (E2+), Kubernetes, Docker.
- Automation & Scripting: Python, PowerShell, Ansible, Core Java.
- CI/CD & Version Control: End-to-end pipeline design & optimization.
- Monitoring & Observability: AppDynamics, Grafana, Zabbix, Datadog, Dynatrace, ELK, Splunk, Prometheus.
- Security & Quality Tools: JFrog Artifactory, SonarQube.
Professional Competencies
- Strong root cause analysis and incident response skills.
- Capacity planning and system scalability expertise.
- Effective communicator with both technical and non-technical stakeholders.
- Self-motivated, proactive, and resourceful.
- Quality-focused, delivering work to high standards with minimal rework.
- Continuous learner who shares knowledge and mentors others.