Job Description: AI Architect with RAG
Location: Plano, TX (Hybrid)
Experience Level: 8+ years (including at least 2 years in GenAI / LLM-based solution design)
About the Role
We are seeking a RAG Architect to lead the design, development, and optimization of Retrieval-Augmented Generation (RAG) systems that integrate large language models (LLMs) with enterprise data sources. The ideal candidate will combine expertise in AI architecture, data retrieval, vector databases, and LLM integration to build scalable, secure, and high-performing GenAI solutions.
Key Responsibilities
Design and architect end-to-end RAG systems, integrating LLMs with structured and unstructured data sources.
Define and implement document ingestion, chunking, embedding, and retrieval pipelines.
Evaluate and select vector databases (e.g., Pinecone, Milvus, FAISS, Chroma, Weaviate) and optimize retrieval performance.
Collaborate with data engineering and ML teams to design data indexing, caching, and query optimization strategies.
Integrate LLMs (OpenAI, Anthropic, Gemini, Llama, etc.) with enterprise backends through APIs or custom frameworks.
Implement prompt engineering, context management, and grounding techniques for accuracy and reliability.
Ensure compliance with data governance, privacy, and security standards.
Lead PoCs and pilots for RAG use cases such as chatbots, document summarization, knowledge assistants, and search systems.
Define MLOps/LLMOps practices for monitoring, evaluation, and model lifecycle management.
Stay current with advancements in GenAI frameworks (LangChain, LlamaIndex, Haystack, etc.) and emerging best practices.
Required Skills & Experience
Proven experience designing or deploying RAG or LLM-based applications.
Strong proficiency in Python, with experience in libraries such as LangChain, LlamaIndex, or Haystack, or in other semantic search frameworks.
Deep understanding of vector embeddings, semantic search, and information retrieval principles.
Experience with cloud AI services (Azure OpenAI, AWS Bedrock, Google Vertex AI, etc.).
Familiarity with data processing pipelines and API integration (REST, GraphQL).
Good understanding of prompt engineering, fine-tuning, and model evaluation techniques.
Knowledge of Docker/Kubernetes, Git, and CI/CD pipelines.
Excellent communication and documentation skills to explain technical concepts to business stakeholders.
Preferred Qualifications
Experience with enterprise-scale AI/ML system design.
Knowledge of data security frameworks (PII masking, RBAC, encryption).
Exposure to multi-modal RAG (text + image + audio retrieval).
Contributions to open-source RAG or GenAI frameworks.
Education
Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Data Engineering, or a related field.
Note: Momento USA is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, protected veteran status, or disability status.