• Databricks Inc. (San Francisco, CA)
    A leading AI-focused technology company in San Francisco is seeking a Software Engineer for GenAI inference. In this role, you'll design, develop, and ... optimize the inference engine powering the Foundation Model API. You will collaborate closely with researchers and engage in performance-critical system challenges,…
    job goal (01/13/26)
  • Databricks Inc. (San Francisco, CA)
    Staff Software Engineer - GenAI inference P-1285. About This Role: As a staff software engineer for GenAI inference, you will lead the ... low latency, and robust scaling. Your work will encompass the full GenAI inference stack: kernels, runtimes, orchestration, memory, and integration with…
    job goal (01/13/26)
  • Menlo Ventures (San Francisco, CA)
    About This Role: As a software engineer for GenAI inference, you will help design, develop, and optimize the inference engine that powers Databricks' ... our large language model (LLM) serving systems are fast, scalable, and efficient. Your work will touch the full GenAI inference stack - from kernels and…
    job goal (01/13/26)
  • Genentech (San Francisco, CA)
    …and optimise workflows. We also work on scaling up model training and inference, evaluating the quality of AI/ML models and output, and building impactful ... the scientific needs. The Opportunity: As a machine learning engineer in AI Enablement, you will be working closely ... everyone in between. You'll build, own, and constantly improve scalable AI/ML based systems that unlock the potential of…
    job goal (01/13/26)
  • Socotra, Inc. (San Francisco, CA)
    Build the Future of Scalable AI at TrueFoundry. At TrueFoundry, we're redefining how ML teams train, deploy, and scale their models. Our LLMOps and MLOps platform ... on Kubernetes - with the same muscle as Big Tech. We're looking for an Engineer who is passionate about scaling deep learning workloads, optimizing multi-GPU training,…
    job goal (01/12/26)
  • harvey.ai (San Francisco, CA)
    GenAI‑native applications - such as supporting high‑throughput model inference, managing streaming and long‑running API interactions, and designing abstractions ... today - and we're just getting started. Role Overview: As a Backend Platform Engineer at Harvey, you will help build and operate the cohesive backend platform that…
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    …multi‑lingual large language models (LLM). AGI's mission is to leverage our hyper‑scalable, general‑purpose large model training and inference systems to build ... cluster and node management to ensure smooth operation of GenAI infrastructure. Continuously improve and automate cluster/capacity/maintenance upgrades. Troubleshoot…
    job goal (01/13/26)
  • Principal Software Engineer

    DataRobot (San Francisco, CA)
    …& Libraries, LLM Onboarding, Tools, Multi-Agent Evaluations, Multimodality, etc.) and GenAI systems (e.g., Inference optimization, Distributed Training, Finetuning, ... today and in the future. As a Principal Software Engineer for Generative AI at DataRobot, you will be the technical anchor for our GenAI Tooling and Systems teams, shaping the architecture, ensuring…
    DataRobot (01/08/26)
  • Software Engineer, SystemML - Scaling…

    Meta (Menlo Park, CA)
    GenAI/LLM scaling reliability and performance. Required Skills: Software Engineer, SystemML - Scaling / Performance. Responsibilities: 1. Enabling reliable ... products and innovations to leverage our large-scale GPU training and inference fleet through an observable, reliable and high-performance distributed AI/GPU…
    Meta (12/20/25)
  • Research Scientist, AI Networking (PhD)

    Meta (Menlo Park, CA)
    …products and innovations to leverage our large-scale GPU training and inference fleet through an observable, reliable and high-performance distributed AI/GPU ... to improve the full-stack distributed ML reliability and performance (e.g., Large-Scale GenAI/LLM training) from the trainer down to the inter-GPU and network…
    Meta (12/20/25)