A fast-growing, deeply technical AI company is looking for a Research Engineer to join a small, high-performing team building the next generation of agentic LLM systems for software development. This is an opportunity to work at the frontier of AI, helping design and evaluate models that can understand, write, and reason about complex codebases.
We are looking for someone who is passionate about AI for code, obsessed with understanding how LLMs behave, and excited to help push the boundaries of autonomous software engineering.
What You’ll Work On
- Design and run large-scale LLM evaluations, benchmarks, and experiments, particularly for code understanding and generation tasks.
- Build tools, datasets, and research frameworks that measure LLM reasoning, reliability, grounding, and hallucination reduction.
- Collaborate closely with product and engineering teams to improve model performance across real-world developer workflows.
- Work with frontier LLMs (across multiple providers) to analyze behavior, provide structured feedback, and drive improvements in accuracy, reliability, and autonomy.
- Prototype new capabilities for long-horizon agentic systems, including retrieval, planning, debugging, and automated coding workflows.
- Contribute to model fine-tuning and alignment work, including SFT, LoRA, DPO, or RLHF pipelines.
- Build production-quality research tools and evaluation pipelines that scale to real engineering environments.
What We’re Looking For
- 3+ years of experience in AI/ML engineering, research engineering, or similar fields.
- Strong coding fundamentals and comfort working in production environments (Python preferred).
- Hands-on experience with LLMs (training, fine-tuning, benchmarking, or evaluation).
- Experience with transformer architectures, retrieval-augmented generation (RAG), or reasoning-heavy LLM workflows.
- Deep interest in code generation, AI developer tools, or program synthesis (professional or personal projects).
- Ability to think rigorously about model behavior, failure cases, and reliability.
- Comfortable working in a fast-paced, highly collaborative environment with end-to-end ownership.
Bonus Points
- Experience with AI systems for code: program synthesis, automated debugging, static analysis, code search, etc.
- Prior work with agentic or multi-step reasoning systems.
- Contributions to evaluation frameworks, benchmarks, or open-source model tooling.
- Background in compilers, programming languages, distributed systems, or developer tooling.
- Research experience at top universities, labs, or advanced ML teams.
Why This Role Is Exciting
- Work directly with cutting-edge LLMs and agentic tooling.
- Influence the direction of an emerging AI product with real-world impact.
- Solve some of the hardest problems in AI today: evaluation, grounding, reasoning, and autonomous coding.
- Join an elite team with deep experience in LLMs, distributed systems, and developer productivity tools.
- Massive opportunity for ownership, impact, and career growth.
If you’re passionate about AI for software engineering, love solving deep technical problems, and want to work at the frontier of LLM research, we’d love to hear from you.