  • OpenAI (San Francisco, CA)
    …to distribute their benefits widely. About the Role: As a Distributed Systems/ML engineer, you will work on improving the training throughput for our internal ... of supercomputers. We're looking for people who love optimizing performance, understanding distributed systems, and who cannot stand having bugs in their code. This…
    job goal (01/13/26)
  • Periodic Labs (Menlo Park, CA)
    …experience with: training on clusters with ≥5,000 GPUs; 5D parallel LLM training; distributed training frameworks such as Megatron-LM, FSDP, DeepSpeed, ... About the role: You will optimize, operate, and develop large-scale distributed LLM training systems that power AI scientific research. You will work closely…
    job goal (01/12/26)
  • Periodic Labs (Menlo Park, CA)
    …to open-source frameworks. Ideal candidates will have expertise in GPU clusters, parallel training, and distributed training frameworks. Join a rapidly ... in California seeks an experienced professional to optimize and develop large-scale distributed LLM training systems. This role involves working with researchers…
    job goal (01/12/26)
  • Pantera Capital (Palo Alto, CA)
    A leading technology firm is seeking engineers to design and develop large-scale distributed training systems. The ideal candidates will have expertise in ... optimizing GPU utilization and building scalable training frameworks for AI models. Responsibilities include maintaining the codebase and innovating tools to enhance…
    job goal (01/13/26)
  • Harmonic (Palo Alto, CA)
    …evaluations for model capability; expertise in Python and PyTorch; experience with distributed training, parallel computing, and GPU acceleration; strong experience ... team. We are seeking a highly motivated and skilled Research Engineer with expertise in model training to focus on reasoning in formal and informal settings.…
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    …seeking a Senior Machine Learning Engineer to join the AWS Neuron Distributed Training team. This role involves developing and enabling software stacks for ... machine learning accelerators, contributing to innovative cloud solutions. Candidates should be passionate about tackling complex technical challenges and delivering impactful results in a fast-paced environment.
    job goal (01/13/26)
  • IMC (Chicago, IL)
    A global trading firm is seeking a Machine Learning Engineer to develop large-scale training pipelines and optimize real-time predictions. Ideal candidates have ... 5+ years in ML, strong programming skills in Python or C++, and experience with GPU programming. This role offers a competitive salary range of $175,000 - $250,000. Join a collaborative environment where your work will influence trading strategies and…
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    …is seeking a Senior Machine Learning Engineer focused on AWS Neuron distributed training. The role demands strong programming skills, experience in machine ... learning, and leadership capabilities. Candidates should possess a bachelor's degree in computer science and have over 5 years of software development experience. This position offers a competitive salary and various benefits.
    job goal (01/12/26)
  • Amazon (Seattle, WA)
    Engineer to develop solutions for AI/ML applications. This role involves building distributed training support and tuning ML models to maximize performance on ... AWS infrastructure. Candidates should have a strong software development background and experience in deep learning. An inclusive work culture that promotes work-life balance and career growth opportunities is offered.
    job goal (01/12/26)
  • NVIDIA Corporation (Santa Clara, CA)
    …The role involves designing, developing, and optimizing models and requires a Master's or PhD in a related field, along with 5+ years of industry experience. The ... salary ranges from $184,000 to $356,500 based on experience and level. This position includes equity and benefits, fostering a diverse work environment.
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Software Engineer - AI/ML, AWS Neuron Distributed Training. Annapurna Labs designs silicon and software that accelerate innovation. Customers choose us to ... learning accelerators. This role is for a Senior Machine Learning Engineer in the Distributed Training team for AWS Neuron, responsible for development,…
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Engineer for their Machine Learning Applications team. This role focuses on distributed training, performance tuning, and support for numerous large-scale ML ... models. Candidates should have strong software development skills and experience with Python. The position offers competitive compensation, support for career growth, and a commitment to work-life balance.
    job goal (01/13/26)
  • Boeing (Hazelwood, MO)
    …future with us. The Boeing Company is currently seeking a highly motivated Software Engineer (Experienced or Senior) to join the Training Systems - Battlespace ... control). BSM is responsible for the design, development, manufacture, and maintenance of training devices for a wide variety of commercial and military aircraft -…
    job goal (01/12/26)
  • Rubrik, Inc. (Palo Alto, CA)
    Software Engineer, Atlas Distributed Systems. Rubrik Atlas is the core data path for all Rubrik products, whether in the data center, at the edge, or in the ... our Atlas platform. We are looking for an experienced distributed systems engineer to guide us through ... factors, including job‑related skills, experience, and relevant education or training. US Pay Range: $158,000 - $237,000 USD. Join…
    job goal (01/13/26)
  • Google Inc. (Sunnyvale, CA)
    Senior Software Engineer, Google Distributed Cloud, Kubernetes. Bachelor's degree in Computer Science, ... year of experience with software design and architecture for distributed systems. Preferred qualifications: Master's degree or PhD in ... on and is growing every day. As a software engineer, you will work on a specific project critical…
    job goal (01/13/26)
  • krea.ai (San Francisco, CA)
    …AI research, real‑time user experiences, and large‑scale model deployments. As a Distributed Systems Engineer, you will design, build, and maintain large‑scale ... tools to harness this medium. This job: Robust, reliable, and scalable distributed systems form the backbone of Krea. These systems support the infrastructure…
    job goal (01/13/26)
  • Labelbox (San Francisco, CA)
    …, data quality, or evaluation systems; familiarity with AI/ML workflows, model training, or benchmarking pipelines; experience with distributed systems or ... workflows across data, tooling, and infrastructure. Position: Senior Rust Full-Stack Engineer - AI Data & Infrastructure. Type: Contract, Remote. Commitment: 20-40…
    job goal (01/13/26)
  • Google Inc. (Sunnyvale, CA)
    …C, C++, Go. 2 years of experience with developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, storage or ... bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, ... on and is growing every day. As a software engineer, you will work on a specific project critical…
    job goal (01/13/26)
  • Databricks Inc. (San Francisco, CA)
    …the roll-up and drill-down capabilities of traditional SQL query engines. As a software engineer on the Runtime team at Databricks, you will be building the next ... generation distributed data storage and processing systems that can outperform ... to job-related skills, depth of experience, relevant certifications and training, and specific work location. Based on the factors…
    job goal (01/13/26)
  • Institute of Foundation Models (Sunnyvale, CA)
    A leading research lab in Sunnyvale is seeking a distributed ML infrastructure engineer to extend and scale training systems. The ideal candidate must have ... over 5 years of experience in ML systems with strong expertise in distributed training frameworks like DeepSpeed and FSDP. This role offers a competitive salary…
    job goal (01/13/26)