Role: LLM Engineer
Location: Hybrid in San Jose (2 days/week on client site)
Duration: 2 months
Responsibilities:
- Model Development & Optimization: Design, train, fine-tune, and evaluate large language models (LLMs) for performance, efficiency, and alignment with product or research goals (an illustrative fine-tuning sketch follows the list).
- Systems Integration & Deployment: Implement scalable inference pipelines, optimize serving infrastructure (e.g., quantization, caching, distillation), and integrate models into applications or APIs (an illustrative serving sketch follows the list).
- Research & Cross-Functional Collaboration: Lead experimentation with new architectures, prompt-engineering techniques, or retrieval systems, and collaborate with product, data, and ML operations teams to translate research into production features.
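To illustrate the model-development responsibility, here is a minimal sketch of parameter-efficient fine-tuning with Hugging Face Transformers and PEFT (LoRA). The base model, dataset, and hyperparameters are placeholder assumptions, not client specifics; an actual engagement would swap in the task corpus and a larger model.

```python
# Minimal LoRA fine-tuning sketch (placeholder model, dataset, and hyperparameters).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_model = "gpt2"  # stand-in for the actual production LLM
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Attach low-rank adapters so only a small fraction of parameters are trained.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                                         task_type="CAUSAL_LM"))

# Placeholder corpus; replace with the task-specific dataset.
data = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
data = data.filter(lambda ex: len(ex["text"].strip()) > 0)
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=2,
                           num_train_epochs=1, logging_steps=50),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```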
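For the systems-integration bullet, a comparably hedged sketch of a small inference endpoint with exact-match response caching and CPU dynamic quantization (a lightweight stand-in for production-grade quantization such as GPTQ/AWQ or bitsandbytes). The model name, route, and settings are assumptions; real serving would normally run on a dedicated stack (e.g., vLLM or TGI) behind an API gateway.

```python
# Minimal cached, quantized inference endpoint (placeholder model and route).
# Run with: uvicorn serve_sketch:app  (assuming this file is saved as serve_sketch.py)
from functools import lru_cache

import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # stand-in for the actual production LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# CPU dynamic quantization of the linear layers; a GPU deployment would more
# likely use 8-bit/4-bit weight quantization inside the serving stack.
model = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

app = FastAPI()

class Prompt(BaseModel):
    text: str
    max_new_tokens: int = 64

@lru_cache(maxsize=1024)  # naive exact-match cache for repeated prompts
def generate(text: str, max_new_tokens: int) -> str:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

@app.post("/generate")
def handle(prompt: Prompt) -> dict:
    return {"completion": generate(prompt.text, prompt.max_new_tokens)}
```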