Machine Learning Engineer

AlphaX Inc., San Francisco, California, United States

Posted 20 hours ago

Employment type

Full-time

Job description

Overview

Machine Learning Engineer — Post-Training, Evaluation & Continuous Improvement

Location: Bay Area preferred (remote-first; ~10% in-office for workshops / on-sites)

Employment: Full-time (new grads welcome)

Internship / Co-op options available

About AlphaX

AlphaX builds financial reasoning models and agent workflows for professional investment research. Our stack spans RLHF / DPO post-training, reasoning workflow generation, agent tooling & model routing, and a financial data lake (prices, filings, transcripts, news).

The Role

Own the complete post-training lifecycle for AlphaX’s financial reasoning model, from data curation through human-in-the-loop feedback, evaluation, regression gating, and continuous improvement. You’ll stand up and maintain the training / eval pipelines that turn research ideas into measurable production gains across our analyst-style task suite (e.g., earnings analysis, risk scoring, forecast memos).

What You’ll Do

Post-training & alignment

Run and refine SFT and preference-based training (RLHF, DPO, KTO or similar) on finance-specific datasets.

Train and version reward models and rubric-based scorers for reasoning quality, factuality, and safety.

Build ingestion & cleaning for filings, transcripts, market data, and analyst workflows; dedup, redact, and split with leakage controls.

Run human-in-the-loop feedback programs (experts, students, crowd) with clear rubrics and QA.

Design a multi-layer eval harness: unit tests for tools / prompts, scenario suites for research tasks, red-team probes, and latency / cost tracking.

Implement automated A / B and canary gating with statistically sound decision rules and regression alerts.

Instrument chain-of-thought-free scoring proxies, tool-use success rates, and multi-step task completion.

Tune prompt policies, tool-calling strategies, and model routing (OpenAI / Claude / Gemini, etc.) behind a consistent interface.

Ship pipelines on GPUs, schedule jobs, track experiments, and maintain reproducible artifacts and datasets.

Add observability for drift, outliers, and PII / financial compliance checks.

Close the loop: mine failures from production, generate counter-examples, synthesize new training / eval data, and re-train with tight feedback cycles.

Qualifications

BS / MS (or rising senior) in CS / EE / Math or equivalent with hands-on experience shipping post-training pipelines.

Practical experience with one or more of: RLHF / DPO / KTO, reward modeling, or structured preference data.

Experience building training and evaluation pipelines end-to-end (data → train → eval → release), including experiment tracking (e.g., MLflow / W&B) and artifact / version control (e.g., DVC, Git-LFS).

Comfort reading financial text and reasoning about factuality & compliance (you don’t need to be an investor, just curious and precise).

Tech You’ll Touch

Python, PyTorch, Ray, MLflow / W&B, Airflow / Prefect, GPUs, Postgres / BigQuery, vector DBs, Docker / K8s, and major model APIs (OpenAI / Claude / Gemini).

Work Setup

Remote-first with ~10% in-office (Bay Area) for collaboration sprints, model & agent jam sessions, and eval workshops.
