We are seeking an experienced PostgreSQL DBA to lead the migration and ongoing management of our mission-critical database environment. Currently running on self-hosted Postgres with high availability, our large-scale (80+ TB) PostgreSQL instance supports an application that manages a fleet of autonomous robots. This role will be pivotal in maintaining our database in a Kubernetes environment using the CrunchyData PostgreSQL Operator, ensuring seamless operation and continuous availability.
Key Responsibilities
- Postgres Instance Management: Oversee the management, configuration, and operation of CrunchyData PostgreSQL Operator instances hosted on AWS EKS or Azure AKS.
- Performance Optimization & Management: Optimize and fine-tune a high-volume PostgreSQL database (80+ TB) for maximum efficiency and uptime. Develop and implement performance monitoring, alerting, and troubleshooting strategies.
- High Availability & Disaster Recovery: Architect and manage high-availability solutions, including automatic failover and backup strategies. Implement and maintain auto-expanding storage using LVM and EBS volumes to meet growing data demands.
- Collaboration & Process Improvement: Work closely with DevOps, infrastructure, and application teams to ensure seamless integration and robust performance. Develop documentation, procedures, and runbooks for backup, recovery, and routine maintenance.
- Systems Administration: Manage underlying Linux systems supporting the PostgreSQL deployment. Ensure system-level security, stability, and scalability in a dynamic cloud environment.
Required Qualifications
- PostgreSQL Expertise: Extensive hands-on experience with PostgreSQL in high-demand, mission-critical environments. Proven track record of managing large-scale databases (80+ TB or similar scale).
- Kubernetes Proficiency: Strong experience with Kubernetes deployments, ideally within AWS EKS. Familiarity with containerized environments and orchestration best practices.
- Linux Administration: Solid background in Linux system administration, performance tuning, and security best practices.
- AWS & Storage Solutions: Experience with AWS services, especially relating to storage (e.g., EBS) and cloud networking. Practical knowledge of using LVM for managing expandable storage solutions.
- High Availability & DR: Demonstrated expertise in designing and managing high-availability database systems with robust disaster recovery plans.
Bonus Qualifications
- Direct experience with the CrunchyData PostgreSQL Operator.
- Experience in environments where continuous uptime is critical (e.g., robotics, IoT).
- Familiarity with auto-scaling, container security, and CI/CD pipelines related to database deployments.
- Familiarity with Apache Kafka, Apache Pulsar, or other distributed event stream systems.
- Experience with systems deployed in Azure or AWS.