Principal AI / ML Operations Engineer

Blackline Systems Inc
Pleasanton, CA, United States
2 days ago
Job type
  • Full-time
Job description

Get to Know Us :

It's fun to work in a company where people truly believe in what they're doing!

At BlackLine, we're committed to bringing passion and customer focus to the business of enterprise applications.

Since being founded in 2001, BlackLine has become a leading provider of cloud software that automates and controls the entire financial close process. Our vision is to modernize the finance and accounting function to enable greater operational effectiveness and agility, and we are committed to delivering innovative solutions and services to empower accounting and finance leaders around the world to achieve Modern Finance.

As a best-in-class SaaS company, we understand that bringing in new ideas and innovative technology is mission critical. At BlackLine, we are always working with new, cutting-edge technology that encourages our teams to learn, expand their creativity and technical skill sets, and accelerate their careers.

Work, Play and Grow at BlackLine!

Make Your Mark :

The Principal AI / ML Operations Engineer leads the architecture, automation, and operationalization of machine learning and AI systems at scale. This role defines the strategy and technical standards for ML-Ops and AIOps across the organization, ensuring models and agents are evaluated, deployed, governed, and monitored with reliability, efficiency, and compliance. The candidate will collaborate across AI, data, and product engineering teams to drive best practices for serving, observability, automated retraining, evaluation flywheels, and operational guardrails for AI systems in production.

You'll Get To :

Leadership and Strategy

  • Define enterprise-level standards and reference architectures for ML-Ops and AIOps systems.
  • Partner with data science, security, and product teams to set evaluation and governance standards (Guardrails, Bias, Drift, Latency SLAs).
  • Mentor senior engineers and drive design reviews for ML pipelines, model registries, and agentic runtime environments.
  • Lead incident response and reliability strategies for ML / AI systems.

AI System Deployment and Integration :

  • Lead the deployment of AI models and systems in various environments.
  • Collaborate with development teams to integrate AI solutions into existing workflows and applications.
  • Ensure seamless integration with different platforms and technologies.
  • Define and manage an MCP Registry for agentic component onboarding, lifecycle versioning, and dependency governance.
  • Build CI / CD pipelines automating LLM agent deployment, policy validation, and prompt evaluation workflows.
  • Develop and operationalize experimentation frameworks for agent evaluations, scenario regression, and performance analytics.
  • Implement logging, metering, and auditing for agent behavior, function calls, and compliance alignment.
  • Create scalable observability systems that track conversation outcomes, factual accuracy, latency, escalation patterns, and safety events.
  • Architect end-to-end guardrails for AI agents including prompt injection protection, identity-aware routing, and tool usage authorization.
  • Collaborate cross-functionally to standardize authentication, authorization, and session governance for multi-agent runtimes.

Model Deployment and Integration :

  • Architect and standardize model registries and feature stores to support version tracking, lineage, and reproducibility across environments.
  • Lead the deployment of machine learning models into production environments, ensuring scalability, reliability, and efficiency.
  • Collaborate with software engineers to integrate machine learning models into existing applications and systems.
  • Implement and maintain APIs for model inference.

Infrastructure and Environment Management :

  • Design and manage training infrastructure including distributed training orchestration, GPU / TPU resource allocation, and automatic scaling.
  • Implement CI / CD for model workflows using pipelines integrated with model validation, bias checks, and rollback automation.
  • Build standardized experimentation frameworks for reproducible training, tuning, and deployment cycles (MLflow, W&B, Kubeflow).
  • Manage and optimize the infrastructure required for machine learning operations in the cloud.
  • Work closely with other teams to ensure the availability, security, and performance of machine learning systems.

Monitoring and Maintenance :

  • Implement robust monitoring solutions for deployed machine learning models to detect issues and ensure performance.
  • Collaborate with data scientists and engineers to address and resolve model performance and data quality issues.
  • Conduct regular system maintenance, updates, and optimizations to ensure optimal performance of machine learning solutions.

Automation and Orchestration :

  • Develop and maintain automation scripts and tools for managing machine learning workflows.
  • Implement orchestration systems to streamline the end-to-end machine learning lifecycle, from data preparation to model deployment.

Collaboration with Data Science Teams :

  • Collaborate with data scientists to understand model requirements and constraints for deployment.
  • Facilitate the transition of machine learning models from research to production, ensuring scalability and efficiency.

Performance Optimization :

  • Identify and implement optimizations to enhance the performance and efficiency of machine learning models in production.
  • Conduct performance analysis and implement improvements based on resource utilization metrics.

Security and Compliance :

  • Implement security measures to protect machine learning systems and data.
  • Ensure compliance with regulatory requirements and industry standards related to machine learning and data privacy.
  • Integrate audit controls, metadata storage, and lineage tracking across ML and AI workflows.
  • Ensure complete monitoring and feedback loops including event logs, evaluations, and automated retraining triggers.
  • Enforce secure deployment patterns with Infrastructure-as-Code and cloud-native secrets management.
  • Define SLAs, error budgets, and compliance reporting mechanisms for ML and AI systems.

What You'll Bring :

Education and Experience :

  • Bachelor's or Master's degree in Computer Science, Machine Learning, Data Science, or a related field.
  • 10+ years in ML infrastructure, DevOps, and software system architecture; 4+ years leading MLOps or AIOps platforms.

Technical Skills :

  • Strong programming skills in languages such as Python, Java, or Scala.
  • Expertise in ML frameworks (TensorFlow, PyTorch, scikit-learn) and orchestration tools (Airflow, Kubeflow, Vertex AI, MLflow).
  • Proven experience operating production pipelines for ML and LLM-based systems across cloud ecosystems (GCP, AWS, Azure).
  • Deep familiarity with LangChain, LangGraph, ADK, or similar frameworks for agentic system runtime management.
  • Strong competencies in CI / CD, IaC, and DevSecOps pipelines integrating testing, compliance, and deployment automation.
  • Hands-on experience with observability stacks (Prometheus, Grafana, New Relic) for model and agent performance tracking.
  • Understanding of governance frameworks for Responsible AI, auditability, and cost metering across training and inference workloads.
  • Proficiency in containerization technologies (e.g., Docker, Kubernetes).

Operations and Infrastructure :

  • Proficient in scripting languages (e.g., Bash, Python) for automation.
  • Experience with workflow orchestration tools (e.g., Apache Airflow).
  • Expertise in managing and optimizing cloud-based infrastructure.
  • Familiarity with DevOps practices and tools for automated deployment.
  • Understanding of network configurations and security protocols.

Problem-solving and Critical Thinking :

  • Ability to define problems, collect and analyze data, and propose innovative solutions. Strong critical thinking skills to evaluate models, identify limitations, and

Adaptability and Learning Agility :

  • Comfortable working in a fast-paced, rapidly evolving environment. Proactive in staying up to date with the latest trends, techniques, and technologies in AI / data science.

Thrive at BlackLine Because You Are Joining :

  • A technology-based company with a sense of adventure and a vision for the future. Every door at BlackLine is open. Just bring your brains, your problem-solving skills, and be part of a winning team at the world's most trusted name in Finance Automation!
  • A culture that is kind, open, and accepting. It's a place where people can embrace what makes them unique, and the mix of cultural backgrounds and varying interests cultivates diverse thought and perspectives.
  • A culture where BlackLiners' continued growth and learning are empowered. BlackLine offers a wide variety of professional development seminars and inclusive affinity groups to celebrate and support our diversity.

    BlackLine is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity or expression, race, ethnicity, age, religious creed, national origin, physical or mental disability, ancestry, color, marital status, sexual orientation, military or veteran status, status as a victim of domestic violence, sexual assault or stalking, medical condition, genetic information, or any other protected class or category recognized by applicable equal employment opportunity or other similar laws.

    BlackLine recognizes that the ways we work and the workplace itself have shifted. We innovate in a workplace that optimizes a combination of virtual and in-person interactions to maximize collaboration and nurture our culture. Candidates who live within a reasonable commute to one of our offices will work in the office at least 2 days a week.

    Salary Range :

    USD $276,000.00 / Yr. - USD $346,000.00 / Yr.

    Pay Transparency Statement :

    Placement within this range depends upon several factors, including the applicant's prior relevant job experience, skill set, and geographic location. In addition to base pay, BlackLine also offers short-term and long-term incentive programs, based on eligibility, along with a robust offering of benefit and wellness plans.

    Accommodations :

    BlackLine is committed to creating an inclusive and accessible experience for all candidates. If you require a reasonable accommodation that would better enable your success during the application or interview process, please complete this form.
