Staff Machine Learning Engineer, LLM Fine-Tuning (Verilog/RTL Applications)
HIGHLIGHTS:
- Location: San Jose, CA (Onsite/Hybrid)
- Schedule: Full Time
- Position Type: Contract
- Hourly: BOE (based on experience)
Overview:
Our client is building privacy-preserving LLM capabilities that help hardware design teams reason over Verilog/SystemVerilog and RTL artifacts: code generation, refactoring, lint explanation, constraint translation, and spec-to-RTL assistance. They are looking for a Staff-level engineer to technically lead a small, high-leverage team that fine-tunes and productizes LLMs for these workflows in a strict enterprise data-privacy environment.
You don't need to be a Verilog/RTL expert to start; curiosity, drive, and deep LLM craftsmanship matter most. Any HDL/EDA fluency is a strong plus.
What you'll do (Responsibilities):
- Own the technical roadmap for Verilog/RTL-focused LLM capabilities, from model selection and adaptation to evaluation, deployment, and continuous improvement.
- Lead a hands-on team of applied scientists/engineers: set direction, unblock technically, review designs/code, and raise the bar on experimentation velocity and reliability.
- Fine-tune and customize models using state-of-the-art techniques (LoRA/QLoRA, PEFT, instruction tuning, preference optimization/RLAIF) with robust HDL-specific evals: compile/lint/simulate-based pass rates, pass@k for code generation, constrained decoding to enforce syntax, and "does it synthesize" checks (see the pass@k sketch after this list).
- Design privacy-first ML pipelines on AWS: training/customization and hosting using Amazon Bedrock (including Anthropic models) where appropriate; SageMaker (or EKS + KServe/Triton/DJL) for bespoke training needs. Artifacts in S3 with KMS CMKs; isolated VPC subnets and PrivateLink (including Bedrock VPC endpoints); IAM least privilege, CloudTrail auditing, and Secrets Manager for credentials. Enforce encryption in transit and at rest, data minimization, and no public egress for customer/RTL corpora.
- Stand up dependable model serving: Bedrock model invocation where it fits, and/or low-latency self-hosted inference (vLLM/TensorRT-LLM), autoscaling, and canary/blue-green rollouts.
- Build an evaluation culture: automatic regression suites that run HDL compilers/simulators, measure behavioral fidelity, and detect hallucinations and constraint violations; model cards and experiment tracking (MLflow/Weights & Biases).
- Partner deeply with hardware design, CAD/EDA, Security, and Legal to source and prepare datasets (anonymization, redaction, licensing), define acceptance gates, and meet compliance requirements.
- Drive productization: integrate LLMs with internal developer tools (IDEs/plug-ins, code-review bots, CI), retrieval (RAG) over internal HDL repos/specs, and safe tool use/function calling.
- Mentor & uplevel: coach ICs on LLM best practices, reproducible training, critical paper reading, and building secure-by-default systems.
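To make the pass@k eval concrete: below is a minimal sketch of the standard unbiased pass@k estimator (Chen et al., 2021). The counts `n` and `c` are assumed to come from whatever compile/lint/simulate harness the team builds; the numbers in the usage line are illustrative only.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: total candidate generations sampled per problem
    c: generations that passed the harness (e.g., compiled and simulated clean)
    k: attempt budget being scored
    """
    if n - c < k:
        return 1.0  # every size-k draw must contain at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative usage: 200 samples per problem, 37 compile-clean, score pass@10
print(pass_at_k(200, 37, 10))  # ~0.88
```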
What you'll bring (Minimum qualifications):
- 10+ years total engineering experience, with 5+ years in ML/AI or large-scale distributed systems and 3+ years working directly with transformers/LLMs.
- Proven track record shipping LLM-powered features in production and leading ambiguous, cross-functional initiatives at the Staff level.
- Deep hands-on skill with PyTorch, Hugging Face Transformers/PEFT/TRL, distributed training (DeepSpeed/FSDP), quantization-aware fine-tuning (LoRA/QLoRA), and constrained/grammar-guided decoding (a minimal LoRA setup is sketched after this list).
- AWS expertise to design and defend secure enterprise deployments, including:
  - Amazon Bedrock (model selection, Anthropic model usage, model customization, Guardrails, Knowledge Bases, Bedrock runtime APIs, VPC endpoints)
  - SageMaker (Training, Inference, Pipelines), S3, EC2/EKS/ECR, VPC/Subnets/Security Groups, IAM, KMS, PrivateLink, CloudWatch/CloudTrail, Step Functions, Batch, Secrets Manager
- Strong software engineering fundamentals: testing, CI/CD, observability, performance tuning; Python is a must (bonus for Go/Java/C++).
- Demonstrated ability to set technical vision and influence across teams; excellent written and verbal communication for execs and engineers.
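For orientation on the fine-tuning stack named above, here is a minimal LoRA adapter setup with Hugging Face PEFT. The base model name and every hyperparameter are illustrative assumptions, not choices prescribed by the role.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "codellama/CodeLlama-7b-hf"  # illustrative base code model, not prescribed
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

lora_config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # LoRA scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of weights train
# From here, training proceeds with the usual Trainer/TRL loop over HDL data.
```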
Nice to have (Preferred qualifications):
- Familiarity with Verilog/SystemVerilog/RTL workflows: lint, synthesis, timing closure, simulation, formal, test benches, and EDA tools (Synopsys/Cadence/Mentor).
- Experience integrating static analysis / AST-aware tokenization for code models, or grammar-constrained decoding (a minimal decoding hook is sketched after this list).
- RAG at scale over code/specs (vector stores, chunking strategies) and tool use/function calling for code transformation.
- Inference optimization: TensorRT-LLM, KV-cache optimization, speculative decoding; throughput/latency trade-offs at batch and token levels.
- Model governance and safety in the enterprise: model cards, red teaming, secure eval-data handling; exposure to SOC 2 / ISO 27001 / NIST frameworks.
- Data anonymization, DLP scanning, and code de-identification to protect IP.
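On grammar-constrained decoding: in the transformers generation loop, the hook is a LogitsProcessor. The sketch below is deliberately a toy that masks everything outside a fixed allowed set; a real HDL constraint would recompute the allowed set each step from an incremental parser state (libraries such as outlines take this approach).

```python
import torch
from transformers import LogitsProcessor

class AllowedTokensProcessor(LogitsProcessor):
    """Toy constrained decoding: forbid every token outside `allowed_ids`.

    A real HDL grammar constraint would derive the allowed set per step from
    a parser state over the partial output; this only shows where the hook lives.
    """

    def __init__(self, allowed_ids: list[int]):
        self.allowed_ids = torch.tensor(allowed_ids)

    def __call__(self, input_ids: torch.LongTensor,
                 scores: torch.FloatTensor) -> torch.FloatTensor:
        mask = torch.full_like(scores, float("-inf"))
        mask[:, self.allowed_ids] = 0.0  # leave allowed tokens untouched
        return scores + mask

# Plugged into generation via:
# model.generate(..., logits_processor=LogitsProcessorList([AllowedTokensProcessor(ids)]))
```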
What success looks like:
90 days:
- Baseline an HDL-aware eval harness that compiles/simulates (a minimal compile gate is sketched after these milestones); establish secure AWS training & serving environments (VPC-only, KMS-backed, no public egress).
- Ship an initial fine-tuned/customized model with measurable gains vs. base (e.g., +X% compile pass rate, Y% lint findings per K LOC generated).
180 days:
- Expand customization/training coverage (Bedrock for managed FMs including Anthropic; SageMaker/EKS for bespoke/open models).
- Add constrained decoding + retrieval over internal design specs; productionize inference with SLOs (p95 latency, availability) and audited rollout to pilot hardware teams.
12 months:
- Demonstrably reduce review/iteration cycles for RTL tasks with clear metrics (defect reduction, time to lint-clean, % auto-fix suggestions accepted), and a stable MLOps path for continuous improvement.
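As one concrete notion of what the 90-day "compiles/simulates" gate could look like, here is a minimal compile check. It assumes Icarus Verilog (`iverilog`) is on PATH purely for illustration; a production harness would drive the team's actual EDA toolchain and add lint and simulation stages behind the same interface.

```python
import pathlib
import subprocess
import tempfile

def compiles(verilog_src: str, timeout_s: int = 30) -> bool:
    """Eval-harness gate: does a generated Verilog module at least compile?

    Assumes Icarus Verilog (`iverilog`) is installed, purely for illustration.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "dut.v"
        src.write_text(verilog_src)
        try:
            result = subprocess.run(
                ["iverilog", "-o", str(pathlib.Path(tmp) / "dut.out"), str(src)],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False
        return result.returncode == 0

# compiles("module t; endmodule")  -> True on a machine with iverilog installed
```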
Security & privacy by design:
- Customer and internal design data remain within private AWS VPCs; access is via IAM roles and audited by CloudTrail; all artifacts are encrypted with KMS (an S3/KMS upload sketch follows this list).
- No public internet calls for sensitive workloads; Bedrock access via VPC interface endpoints/PrivateLink with endpoint policies; SageMaker and/or EKS run in private subnets.
- Data pipelines enforce minimization, tagging, retention windows, and reproducibility; DLP scanning and redaction are first-class steps.
- We produce model cards, data lineage, and evaluation artifacts for every release.
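A minimal boto3 sketch of the "artifacts in S3 with KMS CMKs" posture described above. The bucket name, object key, local filename, and CMK ARN are all placeholders; in the setup described here, the client would run inside the VPC and reach S3 through a VPC endpoint.

```python
import boto3

s3 = boto3.client("s3")  # inside the VPC, this resolves via the S3 endpoint

# Placeholder bucket/key/CMK ARN -- every value here is illustrative.
with open("adapter-v1.safetensors", "rb") as artifact:
    s3.put_object(
        Bucket="example-model-artifacts",
        Key="checkpoints/adapter-v1.safetensors",
        Body=artifact,
        ServerSideEncryption="aws:kms",  # encrypt at rest with a KMS CMK
        SSEKMSKeyId="arn:aws:kms:us-west-2:123456789012:key/EXAMPLE-KEY-ID",
    )
```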
Tech you'll touch:
- Modeling: PyTorch, HF Transformers/PEFT/TRL, DeepSpeed/FSDP, vLLM, TensorRT-LLM
- AWS & MLOps: Amazon Bedrock (Anthropic and other FMs, Guardrails, Knowledge Bases, Runtime APIs), SageMaker (Training/Inference/Pipelines), MLflow/W&B, ECR, EKS/KServe/Triton, Step Functions
- Platform/Security: S3 + KMS, IAM, VPC/PrivateLink (incl. Bedrock), CloudWatch/CloudTrail, Secrets Manager
- Tooling (nice to have): HDL toolchains for compile/simulate/lint, vector stores (pgvector/OpenSearch), GitHub/GitLab CI

"We are GTN, The Go To Network."