- Scale AI, Inc. (San Francisco, CA)
- Machine Learning Engineer - Model Evaluations, Public Sector San Francisco, CA; St. Louis, MO; New York, NY; Washington, DC Ready to Apply? Join the team ... shaping the future of AI at Scale. Machine Learning Engineer - Model Evaluations, Public...simulation environments, or automated evaluation systems. Ability to convert research insights into measurable evaluation criteria. Nice to haves:…
- Scale AI (New York, NY)
- Machine Learning Engineer - Model Evaluations, Public Sector San Francisco, CA; St. Louis, MO; New York, NY; Washington, DC The Public Sector ML team at ... operate reliably, safely, and effectively under real-world constraints. As an ML Engineer, you will design, implement, and scale automated evaluation pipelines that…
- Scale AI, Inc. (Washington, DC)
- Machine Learning Engineer - Model Evaluations, Public Sector The Public Sector ML team at Scale deploys advanced AI systems, including LLMs, agentic models, ... operate reliably, safely, and effectively under real-world constraints. As an ML Engineer, you will design, implement, and scale automated evaluation pipelines that…
- Waymo (San Francisco, CA)
- …and training the Waymo Driver. We use modern machine learning techniques to model the complexities of the real world, including the behavior of diverse agents ... platforms and pipelines for measuring simulation realism across massive datasets. Research and implement systems that can guide simulator development priorities.…
- Menlo Ventures (San Francisco, CA)
- …systems that push the boundaries of what AI can accomplish. About the Role As a Research Engineer on the Horizons team, you will collaborate with a diverse group ... AI systems. About Horizons The Horizons team leads Anthropic's reinforcement learning research and development, playing a critical role in advancing our AI systems.…
- Scale (San Francisco, CA)
- …public and private evaluations. About This Role We're looking for a researcher/engineer who can turn research ideas into working prototypes and solve complex ... Machine Learning Research Scientist / Engineer, Reasoning About...Artificial General Intelligence (AGI). Building on our history of model evaluation with enterprise and government customers, we are…
- Scale AI (San Francisco, CA)
- Machine Learning Research Scientist / Engineer, Reasoning About Scale At Scale AI, our mission is to accelerate the development of AI applications. For 8 years, ... progress toward Artificial General Intelligence (AGI). Building on our history of model evaluation with enterprise and government customers, we are expanding our…
- Scale (San Francisco, CA)
- Machine Learning Research Engineer - Robotics Scale's Robotics business unit is dedicated to solving the data bottleneck in Physical AI. This position will be a ... key contributor in conducting applied research in Robotics and developing ML pipelines for training...offerings, and expand the frontier of Robotics data and model evaluation. You will: Collaborate closely with Robotics customers…
- Scale AI (San Francisco, CA)
- Machine Learning Research Engineer, GenAI Applied ML Join to apply for the Machine Learning Research Engineer, GenAI Applied ML role at Scale AI About ... Artificial General Intelligence (AGI), and building upon our prior model evaluation work with enterprise customers and governments to...our capabilities and offerings for both public and private evaluations. About This Role This role will lead the…
- Prima Mente (San Francisco, CA)
- …develop, and maintain robust experimentation pipelines enabling rapid iteration, precise evaluations, and reproducible research outcomes Refactor and scale ... data, building brain foundation models, and translating discovery to real clinical and research impact. Role focus - Foundation Models for Biology You will play a…
- Scale AI (San Francisco, CA)
- …road to Artificial General Intelligence (AGI), and building upon our prior model evaluation work with enterprise customers and governments, to deepen our ... capabilities and offerings for both public and private evaluations. About This Role Ideally you'd have: Practical experience working with LLMs, with proficiency in…
- Scale AI, Inc. (San Francisco, CA)
- …road to Artificial General Intelligence (AGI), and building upon our prior model evaluation work with enterprise customers and governments to deepen our capabilities ... and offerings for both public and private evaluations. About This Role This role will lead the...open-source LLM fine-tuning efforts or internal LLM alignment projects Research or published work in top ML venues (e.g.,…
- Second Renaissance (Palo Alto, CA)
- …partnership with Stanford University, UCSF, and UC Berkeley. While the prevailing university research model has yielded many tremendous successes, we believe in ... the position We are searching for an experienced and collaborative machine learning research engineer focused on advancing the frontiers of biological foundation…
- Arc Institute (Palo Alto, CA)
- …partnership with Stanford University, UCSF, and UC Berkeley. While the prevailing university research model has yielded many tremendous successes, we believe in ... the position We are searching for an experienced and collaborative machine learning research engineer focused on advancing the frontiers of biological foundation…
- LMArena (San Francisco, CA)
- …perform, and we use our community's feedback to build transparent, rigorous, and human-centered model evaluations. Leading enterprises and AI labs rely on our ... Machine Learning Engineer - LMArena Join to apply for the Machine...evaluations to understand real-world reliability, alignment, and impact. Our leaderboards…
- Snorkel AI (San Francisco, CA)
- Senior Software Engineer - AI/ML Join to apply for the...Snorkel, we believe meaningful AI doesn't start with the model, it starts with the data. We're on a ... Senior Software Engineer - AI/ML role at Snorkel AI. About Snorkel...incredible changes between 2015, when Snorkel started as a research project in the Stanford AI Lab, to the…
- Turing (San Francisco, CA)
- …across all evaluations Identify and document edge cases or unusual model behavior Collaborate with the team to improve evaluation processes and identify ... to write and understand code. Key Responsibilities Review and compare 3-4 model-generated code responses for each task using a structured ranking framework Assess…
- Bedrock Robotics (San Francisco, CA)
- The Role Machine Learning Evaluation Engineer: Bedrock is bringing autonomy to the construction industry! We're a group of veterans from the autonomous vehicle ... industry currently underserved by the market. We're looking for a highly motivated engineer with experience evaluating complex ML systems deployed in the real world.…
- Snorkel AI (San Francisco, CA)
- About Snorkel At Snorkel, we believe meaningful AI doesn't start with the model, it starts with the data. We're on a mission to help enterprises transform expert ... incredible changes between 2015, when Snorkel started as a research project in the Stanford AI Lab, to the...Apply to be the newest Snorkeler! As a Software Engineer on the Evaluation Engineering team, you'll build systems…
- Cartesia (San Francisco, CA)
- …representative datasets at scale. Your Impact Design and build large-scale datasets for model training. Build evaluations of speech models, both via manual ... video tokens, let alone do this on-device. We're pioneering the model architectures that will make this possible. Our founding...and cultures. We are searching for a Machine Learning Engineer to own the quality and coverage of the…