- TikTok (San Jose, CA)
- …models and end-to-end approaches. Qualifications Minimum Qualifications Proven experience in multimodal content understanding, with expertise in large language ... Machine Learning Engineer Graduate (TikTok Short Video Content Understanding/Multimodal Recommendation) - 2026 Start (BS/MS) Join to apply for the Machine Learning…
- Proximity Works (Los Angeles, CA)
- …Proficiency in Python, PyTorch/TensorFlow, and modern ML toolkits. Experience in multimodal AI (bridging text, vision, or speech with LLMs). Track record ... of AI. You will design, fine-tune, and optimize large-scale language and multimodal models, with a strong...engineering and product teams to build systems that combine language, vision, and retrieval modalities - powering…
- Attis (San Francisco, CA)
- …an SME in ML you will develop perception models and build multimodal pipelines that integrate visual, geometric and language‑based inputs to provide context ... Head of ML & Geometry (3D Simulation/Point Cloud) Head of...benefits. Working on site in SF. 401(k), Medical insurance, Vision insurance. Disclaimer: Attis Global Ltd is an equal…
- RoboForce (Milpitas, CA)
- …and communicate seamlessly with humans. Responsibilities Design and deploy vision-language(-action) models (VLM/VLA) for contextual understanding and generalized ... capabilities to achieve high‑precision robotic actions. Integrate multi‑modal data sources (vision, language, speech, etc.) to enable natural human‑robot…
- Toyota Research Institute (Los Altos, CA)
- …MLflow, or ClearML. Data Infrastructure: Build scalable pipelines for heterogeneous multimodal data (images, text, video, touch, depth, proprioception). Work with ... fundamental problems. Continuous integration Bonus Qualifications Familiarity with modern ML efficiency frameworks (e.g., FSDP, DeepSpeed, XLA, Ray, Hugging Face…
- Willing Tech (San Francisco, CA)
- …Apply the latest in generative AI, computer vision, and multimodal learning to scientific contexts. Collaborate with ML researchers, engineers, and product ... years of experience in machine learning, particularly in computer vision, NLP, or multimodal models...multimodal models. Hands-on experience building and deploying ML systems in production (preferably at a SaaS or…
- TwelveLabs (San Francisco, CA)
- …who gets excited by the prospect of advancing the State of the Art in vision‑language modeling by perfecting ML systems and infrastructure. In This Role, ... Who We Are At TwelveLabs, we are pioneering the development of frontier multimodal foundation models that can see, hear and understand the world as humans do. Our…
- TwelveLabs (San Francisco, CA)
- …who get excited by the prospect of advancing the State of the Art in vision-language modeling by perfecting ML systems and infrastructure! In This Role, ... incorporating already-great research into fault tolerant, low latency e2e systems Scale multimodal AI/ML systems for video understanding Deliver industry leading…
- TikTok (San Jose, CA)
- …NLP (Natural Language Processing), CV (Computer Vision), LLM (Large Language Models)/MLLM (Multimodal Language Models), Search, or Agentic AI. ... Our Focus Areas Include TikTok Intelligent Customer Service: Utilizing large language models to address intelligent customer service and commercial customer service…
- Unity (San Francisco, CA)
- …maintain, and reason over structured representations of complex game environments. Advance multimodal LLM architectures that integrate text, vision, and spatial ... Scientist to lead this work. In this role, you'll focus on advancing multimodal LLMs, strengthening spatial reasoning, and building world models that help creators…
- Unity3d (San Francisco, CA)
- …maintain, and reason over structured representations of complex game environments. Advance multimodal LLM architectures that integrate text, vision, and spatial ... Scientist to lead this work. In this role, you'll focus on advancing multimodal LLMs, strengthening spatial reasoning, and building world models that help creators…
- Unity Technologies (San Francisco, CA)
- …maintain, and reason over structured representations of complex game environments. Advance multimodal LLM architectures that integrate text, vision, and spatial ... Scientist to lead this work. In this role, you'll focus on advancing multimodal LLMs, strengthening spatial reasoning, and building world models that help creators…
- Xometry (Waltham, MA)
- …learning and generative AI capabilities, particularly for fine-tuning generative and language models, multimodal document understanding, and structured data ... objectives. Develop and deploy generative AI models and large language models (LLMs) for multimodal document processing,...machine learning, focusing on generative models, LLMs, or computer vision. Expertise in large-scale language and …
- Carlsbad Tech (San Francisco, CA)
- …who get excited by the prospect of advancing the State of the Art in vision-language modeling by perfecting ML systems and infrastructure! In This Role, ... prior work history with at least one statically typed language (we use Golang) Experience with modern ML...related discipline Acquiring, filtering, (re)labeling, or sanitizing large scale language or vision-language datasets for…
- Scale AI, Inc. (San Francisco, CA)
- …Practical experience with Multimodal AI, specifically integrating OCR and vision-language models for document intelligence and structured data extraction ... Agent Development, translating cutting-edge research in Generative AI, Large Language Models (LLMs), and Agentic Frameworks into robust, scalable, and high-impact…
- Scale AI (New York, NY)
- …Practical experience with Multimodal AI, specifically integrating OCR and vision‑language models for document intelligence and structured data extraction ... Research Agent Development, translating cutting‑edge research in Generative AI, Large Language Models (LLMs), and Agentic Frameworks into robust, scalable production…
- Stripe (San Francisco, CA)
- …our entire product suite. Our mission is to fundamentally transform how Stripe uses ML, leveraging our extensive and rich dataset to solve some of the most ... We process petabytes of financial data using our ML platform to build features, train models, and deploy...in merchant risk and understanding how we can align language to our immense ocean of payments data. Some…
- Takeda Pharmaceuticals (Boston, MA)
- …problems and domains, with depth in at least two (computer vision, natural language processing, geometric deep learning, timeseries, reinforcement learning, ... AI to join us in the ShinrAI Center for AI/ML at Takeda, based in Cambridge, MA. At the...and LLM based solutions. Experience in fine tuning large language models for domain specific applications. Experience in designing…
- Unity (San Francisco, CA)
- …maintain, and reason over structured representations of complex game environments. Advance multimodal LLM architectures that integrate text, vision, and spatial ... interactive creation tools. Prototype new AI‑assisted workflows, such as natural‑language scene editing, spatial instruction following, or agent behavior…
- TwelveLabs (San Francisco, CA)
- …- $182,000.00/yr At TwelveLabs, we are pioneering the development of cutting‑edge multimodal foundation models that have the ability to comprehend videos just like ... humans do. Our models have redefined the standards in video‑language modeling, empowering us with more intuitive and far‑reaching...world. Join us as we revolutionize video understanding and multimodal AI. About The Role As a Senior Product…