Role : Technical Architect AI
Location : San Jose, CA (Onsite)
Full-Time
Key technical skills :
As a Technical Architect specializing in LLMs and Agentic AI, you will own the architecture, strategy, and delivery of enterprise-grade AI solutions. You will work with cross-functional teams and customers to define the AI roadmap, design scalable solutions, and ensure responsible deployment of Generative AI across the organization.
Primary Responsibilities :
Architect scalable and secure AI / ML / LLM platform solutions including data, model, and inference pipelines.
Establish enterprise reference architectures, reusable components, best practices, and governance standards for AI adoption.
Integrate cloud-native, open-source, and enterprise tools such as vector databases, feature stores, registries, and orchestration frameworks.
Implement automated MLOps / LLMOps workflows covering deployment, monitoring, observability, compliance, and performance optimization.
Collaborate with cross-functional teams (engineering, data science, security, and product) to align platform capabilities with business goals and drive adoption.
Secondary Responsibilities :
Support GenAI and AI application teams by providing platform enablement, solution advisory, and architecture reviews.
Conduct technology research, PoCs, and benchmarking, and evaluate emerging AI tools, frameworks, and deployment patterns.
Drive knowledge sharing through documentation, workshops, training sessions, and internal community-building initiatives.
Provide guidance on cost estimation, usage monitoring, FinOps optimization, and capacity planning.
Partner with security, compliance, and cloud teams to ensure alignment with regulatory, data privacy, and policy frameworks.
Primary Skills :
6-10 years of experience designing and implementing large-scale distributed systems, microservices, serverless, and event-driven architectures.
5-8 years of experience in cloud-native architecture on Azure / AWS / GCP, including networking, storage, compute scaling, GPU workloads, and managed AI services.
5-8 years of experience with platform components, API design, integration patterns, and high-performance compute architecture.
4-7 years of experience building or integrating AI / ML platforms, pipelines, model lifecycle components, inference gateways, and / or enterprise GenAI frameworks.
3-6 years of experience using AI platform tools such as Databricks, Vertex AI, Azure AI Studio, AWS Bedrock, LangChain, PromptFlow, Ray, Kubeflow, MLflow, Airflow, Kafka, etc.
2-5 years of experience in designing and integrating vector database solutions such as Pinecone, Weaviate, FAISS, Milvus, Qdrant, Elastic, OpenSearch, CosmosDB Vector.
2-3 years of experience in LLM architectures, embeddings, tokenization, prompt engineering, evaluation strategies, hallucination reduction, and RAG patterns.
2-3 years of experience building GenAI applications, agent workflows, or knowledge retrieval systems using frameworks like LangChain, LlamaIndex, GraphRAG, or custom implementations.