• Scale AI, Inc. (San Francisco, CA)
    Machine Learning Engineer - Model Evaluations, Public Sector San Francisco, CA; St. Louis, MO; New York, NY; Washington, DC Ready to Apply? Join the team ... shaping the future of AI at Scale. Machine Learning Engineer - Model Evaluations, Public...simulation environments, or automated evaluation systems. Ability to convert research insights into measurable evaluation criteria. Nice to haves:… more
    job goal (01/14/26)
  • Scale AI (New York, NY)
    Machine Learning Engineer - Model Evaluations, Public Sector San Francisco, CA; St. Louis, MO; New York, NY; Washington, DC The Public Sector ML team at ... operate reliably, safely, and effectively under real-world constraints. As an ML Engineer, you will design, implement, and scale automated evaluation pipelines that… more
    job goal (01/14/26)
  • Scale AI, Inc. (Washington, DC)
    Machine Learning Engineer - Model Evaluations, Public Sector The Public Sector ML team at Scale deploys advanced AI systems-including LLMs, agentic models, ... operate reliably, safely, and effectively under real‑world constraints. As an ML Engineer, you will design, implement, and scale automated evaluation pipelines that… more
    job goal (01/14/26)
  • Waymo (San Francisco, CA)
    …and training the Waymo Driver. We use modern machine learning techniques to model the complexities of the real world, including the behavior of diverse agents ... platforms and pipelines for measuring simulation realism across massive datasets. Research and implement systems that can guide simulator development priorities.… more
    job goal (01/14/26)
  • Menlo Ventures (San Francisco, CA)
    …systems that push the boundaries of what AI can accomplish. About the Role As a Research Engineer on the Horizons team, you will collaborate with a diverse group ... AI systems. About Horizons The Horizons team leads Anthropic's reinforcement learning research and development, playing a critical role in advancing our AI systems.… more
    job goal (01/14/26)
  • Scale (San Francisco, CA)
    …public and private evaluations. About This Role We're looking for a researcher/engineer who can turn research ideas into working prototypes and solve complex ... Machine Learning Research Scientist / Engineer, Reasoning About...Artificial General Intelligence (AGI). Building on our history of model evaluation with enterprise and government customers, we are… more
    job goal (01/14/26)
  • Scale AI (San Francisco, CA)
    Machine Learning Research Scientist / Engineer, Reasoning About Scale At Scale AI, our mission is to accelerate the development of AI applications. For 8 years, ... progress toward Artificial General Intelligence (AGI). Building on our history of model evaluation with enterprise and government customers, we are expanding our… more
    job goal (01/14/26)
  • Scale (San Francisco, CA)
    Machine Learning Research Engineer - Robotics Scale's Robotics business unit is dedicated to solving the data bottleneck in Physical AI. This position will be a ... key contributor in conducting applied research in Robotics and developing ML pipelines for training...offerings, and expand the frontier of Robotics data and model evaluation. You will: Collaborate closely with Robotics customers… more
    job goal (01/14/26)
  • Scale AI (San Francisco, CA)
    Machine Learning Research Engineer, GenAI Applied ML Join to apply for the Machine Learning Research Engineer, GenAI Applied ML role at Scale AI About ... Artificial General Intelligence (AGI), and building upon our prior model evaluation work with enterprise customers and governments to...our capabilities and offerings for both public and private evaluations. About This Role This role will lead the… more
    job goal (01/14/26)
  • Prima Mente (San Francisco, CA)
    …develop, and maintain robust experimentation pipelines enabling rapid iteration, precise evaluations, and reproducible research outcomes Refactor and scale ... data, building brain foundation models, and translating discovery to real clinical and research impact. Role focus - Foundation Models for Biology You will play a… more
    job goal (01/14/26)
  • Scale AI (San Francisco, CA)
    …road to Artificial General Intelligence (AGI), and building upon our prior model evaluation work with enterprise customers and governments, to deepen our ... capabilities and offerings for both public and private evaluations. About This Role Ideally you'd have: Practical experience working with LLMs, with proficiency in… more
    job goal (01/14/26)
  • Scale AI, Inc. (San Francisco, CA)
    …road to Artificial General Intelligence (AGI), and building upon our prior model evaluation work with enterprise customers and governments to deepen our capabilities ... and offerings for both public and private evaluations. About This Role This role will lead the...open‑source LLM fine‑tuning efforts or internal LLM alignment projects Research or published work in top ML venues (eg,… more
    job goal (01/13/26)
  • Second Renaissance (Palo Alto, CA)
    …partnership with Stanford University, UCSF, and UC Berkeley. While the prevailing university research model has yielded many tremendous successes, we believe in ... the position We are searching for an experienced and collaborative machine learning research engineer focused on advancing the frontiers of biological foundation… more
    job goal (01/14/26)
  • Arc Institute (Palo Alto, CA)
    …partnership with Stanford University, UCSF, and UC Berkeley. While the prevailing university research model has yielded many tremendous successes, we believe in ... the position We are searching for an experienced and collaborative machine learning research engineer focused on advancing the frontiers of biological foundation… more
    job goal (01/14/26)
  • LMArena (San Francisco, CA)
    …perform - and we use our community's feedback to build transparent, rigorous, and human-centered model evaluations. Leading enterprises and AI labs rely on our ... Machine Learning Engineer - LMArena Join to apply for the Machine...evaluations to understand real-world reliability, alignment, and impact. Our leaderboards… more
    job goal (01/14/26)
  • Snorkel AI (San Francisco, CA)
    Senior Software Engineer - AI/ML Join to apply for the...Snorkel, we believe meaningful AI doesn't start with the model, it starts with the data. We're on a ... Senior Software Engineer - AI/ML role at Snorkel AI. About Snorkel...incredible changes between 2015, when Snorkel started as a research project in the Stanford AI Lab, to the… more
    job goal (01/14/26)
  • SOLANA FOUNDATION (San Francisco, CA)
    …pipelines and working with large datasets Track record of creating benchmarks and evaluations Ability to take research techniques and apply them to production ... improving the core ML systems that power our custom model training platform, while also applying these systems directly...customers. Your role sits at the intersection of applied research and production engineering. You'll lead projects from data… more
    job goal (01/13/26)
  • Turing (San Francisco, CA)
    …across all evaluations Identify and document edge cases or unusual model behavior Collaborate with the team to improve evaluation processes and identify ... to write and understand code. Key Responsibilities Review and compare 3-4 model-generated code responses for each task using a structured ranking framework Assess… more
    job goal (01/14/26)
  • Bedrock Robotics (San Francisco, CA)
    The Role Machine Learning Evaluation Engineer: Bedrock is bringing autonomy to the construction industry! We're a group of veterans from the autonomous vehicle ... industry currently underserved by the market. We're looking for a highly motivated engineer with experience evaluating complex ML systems deployed in the real world.… more
    job goal (01/14/26)
  • Snorkel AI (San Francisco, CA)
    About Snorkel At Snorkel, we believe meaningful AI doesn't start with the model, it starts with the data. We're on a mission to help enterprises transform expert ... incredible changes between 2015, when Snorkel started as a research project in the Stanford AI Lab, to the...Apply to be the newest Snorkeler! As a Software Engineer on the Evaluation Engineering team, you'll build systems… more
    job goal (01/14/26)