Senior Kubernetes Engineer
Do you want to tackle the biggest questions in finance with near-infinite compute power at your fingertips?
G-Research is a leading quantitative research and technology firm, with offices in London and Dallas.
We are proud to employ some of the best people in their field and to nurture their talent in a dynamic, flexible and highly stimulating culture where world-beating ideas are cultivated and rewarded.
This is a hybrid role based in our new Dallas infrastructure hub where we work on the latest technologies in a cutting-edge environment.
The Role
We are seeking a highly skilled Senior Kubernetes Engineer to join our Platform Engineering function in Dallas. In this role, you will design, implement, and optimise GPU-accelerated container platforms at scale, enabling high-performance workloads (AI/ML, HPC, LLM training) across hybrid or on-prem environments. You will have deep expertise in both the NVIDIA and Kubernetes ecosystems, including GPU scheduling, device plugins and custom operators.
Key responsibilities of the role include:
- Architecting and operating Kubernetes clusters optimised for GPU workloads, leveraging NVIDIA GPU Operator, Network Operator and DCGM
- Developing, deploying and maintaining custom Kubernetes operators and controllers to automate infrastructure services
- Integrating NVIDIA device plugins, Multi-Instance GPU (MIG) and GPU sharing features into the scheduling layer
- Optimising GPU utilisation and job placement through scheduler extensions, such as kube-scheduler plugins, Slurm and Volcano
- Collaborating with HPC, ML and DevOps teams to ensure multi-tenant, high-throughput cluster performance
- Driving observability and telemetry integrations using Prometheus, Grafana, DCGM Exporter and OpenTelemetry
- Implementing secure multi-user and multi-namespace GPU isolation, with RBAC and policy enforcement, such as OPA or Gatekeeper
- Maintaining CI/CD pipelines for Kubernetes infrastructure using GitOps, ArgoCD and FluxCD
- Contributing to infrastructure-as-code, using Terraform, Helm, and Kustomize
- Participating in performance tuning, incident response and production readiness reviews
Who Are We Looking For?
The ideal candidate will have the following skills and experience:
- Extensive experience with Kubernetes in production-grade environments, and with the NVIDIA stack on Kubernetes, including GPU Operator, device plugin, NVML, MIG and DCGM
- Proficiency in Go or Python for operator development and Kubernetes controller logic
- Deep understanding of Kubernetes internals, including CRDs, RBAC, custom controllers and scheduler extensions
- Experience with GPU-intensive workloads, for example LLMs, training pipelines and scientific computing
- Hands-on experience with Helm, Kustomize and GitOps workflows
- Familiarity with CNI plugins, especially NVIDIA CNI and Multus
- Experience with monitoring GPU metrics and cluster health using Prometheus and DCGM Exporter
The following is beneficial:
- Knowledge of container runtimes, including CRI-O, containerd and NVIDIA Container Toolkit
- Contributions to open-source projects in the Kubernetes or NVIDIA ecosystem
- Experience working with Cilium or other CNI plugins
Why Should You Apply?
- Market-leading compensation plus annual discretionary bonus
- Lunch provided in the office (via GrubHub)
- Informal dress code and excellent work/life balance
- Excellent paid time off allowance of 25 days
- Sick days, military leave, and family and medical leave
- Generous 401(k) plan
- 16 weeks' fully paid parental leave
- Medical and Prescription, Dental, and Vision insurance
- Life and Accidental Death & Dismemberment (AD&D) insurance
- Employee Assistance and Wellness programs
- Generous relocation allowance and support
- Great selection of office snacks, and hot and cold drinks
- On-site gym and car parking
This role is employed through our US affiliate.
G-Research is committed to cultivating and preserving an inclusive work environment. We are an ideas-driven business and we place great value on diversity of experience and opinions. We want to ensure that applicants receive a recruitment experience that enables them to perform at their best. If you have a disability or special need that requires accommodation, please let us know in the relevant section.