Title : Sr. Gen AI MLOps Software Developer
Location : San Jose, CA
Schedule : On-site / In-office
Terms : 6 Months – Potential Extension
Top Must-Have Skills for this role :
- Strong skills in Python and data analysis libraries (Pandas, NumPy, SQL).
- Demonstrable experience or strong projects in LLM / RAG development.
- Strong proficiency in agentic LLM Libraries / Technologies like LangChain, LangGraph, AutoGen, CrewAI, etc.
- Familiarity with RL, techniques for fine-tuning LLMs (e.g., LoRA), and other emerging ML methodologies.
Responsibilities
- Play a key role in the design, development, and deployment of large-scale, high-performance, enterprise-ready agent frameworks and tools.
- Collaborate with the engineering team to understand the specific needs and challenges of chip design, and ensure our agent platform is well suited to them.
- Develop and optimize retrieval and generation algorithms for enterprise data (text, code, and images) to build advanced AI applications.
- Design, implement, test, and continuously optimize end-to-end RAG pipelines, including data parsing, ingestion, prompt engineering, and chunking strategies.
- Collect and organize training / fine-tuning data, and help build domain-specific large language models.
- Optimize infrastructure for performance, scalability, and reliability, and ensure secure and efficient management of data.
- Stay current with the latest advancements in machine learning and AI to create state-of-the-art solutions.
Qualifications
- BS or MS degree in Electrical Engineering, Computer Science / Engineering, or a related discipline (or equivalent experience).
- 5+ years of proven industry experience.
- Skilled at rapidly taking products from concept to launch and scaling them massively through performance tuning and optimization of complex, globally distributed systems.
- Demonstrable experience or strong projects in LLM / RAG development.
- Strong skills in Python and data analysis libraries (Pandas, NumPy, SQL).
- Strong proficiency in agentic LLM libraries / technologies such as LangChain, LangGraph, AutoGen, and CrewAI.
- Familiarity with RL, techniques for fine-tuning LLMs (e.g., LoRA), and other emerging ML methodologies.
- Experience optimizing inference and infrastructure for low-latency, cost-effective operation (vLLM / TGI / Triton, batching, caching, quantization) on GPUs / accelerators, including on-prem / VPC deployments with enterprise security controls.
- A proactive approach to problem-solving and a willingness to acquire new skills and knowledge as needed to achieve results.