About HUD
HUD (YC W25) is developing agentic evals for Computer Use Agents (CUAs) that browse the web. Our CUA Evals framework is the first comprehensive evaluation tool for CUAs.
The Role
We are looking for a systems / full-stack engineer to help build out the technical infrastructure that enables comprehensive CUA testing at scale.
Responsibilities
- Build out HUD's existing CUA evaluation framework
- Optimize our evaluation infrastructure at scale
Experience & Technical Skills
- Experience with AWS, Kubernetes, Docker, Redis, Linux, Python, PostgreSQL
- Systems design, performance, security, and CI/CD management experience preferred
You May Be a Good Fit If You
- Have hands-on experience with scalable infrastructure design and implementation
- Have contributed to large-scale system architecture projects
- Have built reliable, high-performance distributed systems
- Have worked with containerized applications and orchestration platforms
Strong Candidates May Have
- Startup experience in early-stage technology companies, with the ability to work independently in fast-paced environments
- Strong communication skills for remote collaboration across time zones
- Familiarity with current AI tools and LLM capabilities
- Understanding of LLM evaluation frameworks and methodologies
- Evidence of rapid learning and adaptability in technical environments
Team & Company Details
- Team size: ~15 people currently, mostly full-time in-person, with some remote.
- Our team includes 4 international Olympiad medallists (IOI, ILO, IPhO), serial AI startup founders, and researchers with publications at ICLR, NeurIPS, etc.
- Company stage: We have received $2 million in seed funding and are seeing strong demand and revenue growth. We are scaling profitably and fast to meet demand.
Logistics
- Employment: Full-time.
- Location: On-site only, in the San Francisco Bay Area or Singapore (offices).
- Visa sponsorship: We provide support for relocation and visas for strong full-time candidates to the USA or Singapore. For part-time / contract / internship arrangements, we will work fully remotely.
- Timeline: Applications are rolling. The process should involve 2 technical interviews and a 1-week work trial.
- Applications: For inquiries, please contact recruiting@hud.so