  • krea.ai (San Francisco, CA)
    …AI research, real‑time user experiences, and large‑scale model deployments. As a Distributed Systems Engineer, you will design, build, and maintain large‑scale ... tools to harness this medium. This job Robust, reliable, and scalable distributed systems form the backbone of Krea. These systems support the infrastructure… more
    job goal (01/13/26)
  • OpenAI (San Francisco, CA)
    Training Performance Engineer, you'll drive efficiency improvements across our distributed training stack. You'll analyze large-scale training runs, ... in Python and C++ (Rust or CUDA a plus). Have experience running distributed training jobs on multi‑GPU systems or HPC clusters. Enjoy debugging complex … more
    job goal (01/14/26)
  • DoorDash USA (Seattle, WA)
    Senior Software Engineer, ML Training Platform San Francisco, CA; Sunnyvale, CA; Seattle, WA About the Team DoorDash is building the world's most reliable ... and Search. About the Role As a Senior Software Engineer in the team, you will take ownership of...Training Platform-creating reliable, extensible solutions for data transformations, distributed model training, and rapid experimentation in… more
    job goal (01/14/26)
  • P-1 AI (San Francisco, CA)
    …and post‑training workflows Configure, launch, monitor, and debug multi‑node distributed training jobs using FSDP, DeepSpeed, or custom wrappers Contribute ... earth. About The Role We're looking for an experienced engineer to take ownership of LLM training ... training pipelines Deep familiarity with PyTorch, especially distributed training via FSDP, DeepSpeed, or DDP… more
    job goal (01/14/26)
  • LinkedIn (Mountain View, CA)
    …open‑ended problems. Designing, implementing, and optimizing the performance of large‑scale distributed training for personalized recommendation as well as large ... Principal Staff Software Engineer, AI Training Platform Full-time Workplace...leading / building deep learning systems Hands‑on experience developing distributed systems or other large‑scale systems Preferred Qualifications MS… more
    job goal (01/13/26)
  • Tempus AI (Redwood City, CA)
    …data processing workflows responsible for ingesting, processing, and preparing multimodal training data that seamlessly integrate with large-scale distributed ML ... Join to apply for the Staff Machine Learning Engineer role at Tempus AI Join to apply...generative AI models. Your work will directly enable the training and deployment of robust, production-ready multimodal systems that… more
    job goal (01/14/26)
  • Beautiful.ai (San Francisco, CA)
    …San Francisco, CA. San Francisco, CA $130,000.00-$165,000.00 1 month ago Software Engineer, HTML - AI Training (Freelance, Remote) San Francisco, CA ... ago San Francisco, CA $150,000.00-$190,000.00 2 months ago Coders - AI Training (Freelance, Remote) Software Engineer Internship (8 openings) San Francisco,… more
    job goal (01/14/26)
  • Jobright.ai (San Francisco, CA)
    Join to apply for the Site Reliability Engineer - Inference role at Jobright.ai 2 days ago Be among the first 25 applicants Join to apply for the Site Reliability ... Engineer - Inference role at Jobright.ai Get AI-powered advice...AI models and building a high-throughput, low-latency API for distributed systems. Responsibilities: * Work on our Inference service,… more
    job goal (01/14/26)
  • Ipro Networks Pte. Ltd. (San Francisco, CA)
    …clusters or cloud instances, such as AWS or Google Cloud, to support distributed training. Ensure that infrastructure can handle the resource-intensive tasks ... Overview Job Title: Machine Learning Engineer, Training Infrastructure | Position Type:...and Kubernetes required for deployments at scale. Understanding of distributed training techniques and how to scale… more
    job goal (01/14/26)
  • Hedra, Inc (San Francisco, CA)
    …clusters or cloud instances, such as AWS or Google Cloud, to support distributed training. Ensure that our infrastructure can handle the resource-intensive tasks ... whiteboard problem-solving. Overview We are looking for an ML Engineer with 3+ YOE in high-performance computing systems to...and Kubernetes required for deployments at scale. Understanding of distributed training techniques and how to scale… more
    job goal (01/14/26)
  • Stanford University School of Medicine (Palo Alto, CA)
    …of practical experience in implementing and optimizing machine learning algorithms with distributed training using common libraries (e.g., Ray, DeepSpeed, HF ... Machine Learning Research Engineer (1 Year Fixed Term) Join to apply...ML experiments, using the latest MLOps platforms Run large-scale distributed model training on high-performance computing clusters… more
    job goal (01/14/26)
  • SoFi (San Francisco, CA)
    Join to apply for the Senior Software Engineer, ML Platform role at SoFi 1 day ago Be among the first 25 applicants Join to apply for the Senior Software Engineer ... & Support Technology), we are seeking a Senior Software Engineer to join the Machine Learning Platform team. This...support the entire ML lifecycle, from feature generation to training pipelines, batch and online inference, CI/CD integration, and… more
    job goal (01/14/26)
  • Jobright.ai (San Francisco, CA)
    Join to apply for the Senior Software Engineer, Data Platform role at Jobright.ai 3 days ago Be among the first 25 applicants Join to apply for the Senior Software ... Engineer, Data Platform role at Jobright.ai Parafin is dedicated...with a strong background in data infrastructure, pipelines, and distributed systems. * Advanced proficiency in Python and SQL.… more
    job goal (01/14/26)
  • Adobe (San Jose, CA)
    Principal Machine Learning Engineer, Firefly Join to apply for the Principal Machine Learning Engineer, Firefly role at Adobe Principal Machine Learning ... among the first 25 applicants Join to apply for the Principal Machine Learning Engineer, Firefly role at Adobe Get AI-powered advice on this job and more exclusive… more
    job goal (01/14/26)
  • Gauss Labs (Palo Alto, CA)
    Join to apply for the AI Engineer - Machine Learning (US) role at Gauss Labs Continue with Google Continue with Google Join to apply for the AI Engineer - ... at Gauss Labs Gauss Labs is looking for a passionate and talented AI Engineer for developing cutting-edge Industrial AI solutions that will normalize the standard of… more
    job goal (01/14/26)
  • Aurora (San Francisco, CA)
    Senior Software Engineer, Rendering and Sensor Simulation Join to apply for the Senior Software Engineer, Rendering and Sensor Simulation role at Aurora Senior ... Software Engineer, Rendering and Sensor Simulation Join to apply for...third-party graphics libraries (e.g., Embree, OIIO). Knowledge of network distributed programming. Proficiency in Python. The base salary range… more
    job goal (01/14/26)
  • Doximity (San Francisco, CA)
    …days ago Senior Software Engineer, Backend - Fintech Senior Software Engineer, Distributed Systems San Francisco, CA $150,000.00-$207,000.00 4 months ago San ... Senior Software Engineer (Python), Data Platform Join to apply for...is determined by factors including relevant skills, experience, and education/training. More on Benefits & Perks Doximity is proud… more
    job goal (01/14/26)
  • Menlo Ventures (Berkeley, CA)
    …performance improvements to the open source project. Develop and optimize distributed training algorithms for large language models. Implement high‑performance ... of deep learning training and its applications Understanding of distributed training techniques (data parallelism, model parallelism, pipeline parallelism,… more
    job goal (01/14/26)
  • Skylight (Washington, DC)
    Staff/Principal Software Engineer, Lead (HHS) Washington, District of Columbia, United States About Skylight Skylight is a digital consultancy using design and ... can make smarter, safer decisions. As a lead software engineer on this project, you'll provide direction while staying...by sharing knowledge and practices that last - from training and enablement to reusable tools like templates, playbooks,… more
    job goal (01/14/26)
  • Canonical (San Francisco, CA)
    …notified about new Software Engineer jobs in San Francisco, CA. Software Engineer, SQL - AI Training (Freelance, Remote) Software Engineer, TypeScript - ... many sectors. The company is a pioneer of global distributed collaboration, with 1200+ colleagues in 75+ countries and...Francisco, CA $40,000 - $100,000 3 weeks ago Software Engineer, C# - AI Training (Freelance, Remote)… more
    job goal (01/14/26)