- Databricks Inc. (San Francisco, CA)
- A leading AI-focused technology company in San Francisco is seeking a Software Engineer for GenAI inference. In this role, you'll design, develop, and ... optimize the inference engine powering the Foundation Model API. You will collaborate closely with researchers and engage in performance-critical system challenges,…
- Databricks Inc. (San Francisco, CA)
- Staff Software Engineer - GenAI inference P-1285. About This Role: As a staff software engineer for GenAI inference, you will lead the ... low latency, and robust scaling. Your work will encompass the full GenAI inference stack: kernels, runtimes, orchestration, memory, and integration with…
- Menlo Ventures (San Francisco, CA)
- About This Role: As a software engineer for GenAI inference, you will help design, develop, and optimize the inference engine that powers Databricks' ... our large language model (LLM) serving systems are fast, scalable, and efficient. Your work will touch the full GenAI inference stack - from kernels and…
- Genentech (San Francisco, CA)
- …and optimise workflows. We also work on scaling up model training and inference, evaluating the quality of AI/ML models and output, and building impactful ... the scientific needs. The Opportunity: As a machine learning engineer in AI Enablement, you will be working closely ... everyone in between. You'll build, own, and constantly improve scalable AI/ML-based systems that unlock the potential of…
- Socotra, Inc. (San Francisco, CA)
- Build the Future of Scalable AI at TrueFoundry. At TrueFoundry, we're redefining how ML teams train, deploy, and scale their models. Our LLMOps and MLOps platform ... on Kubernetes - with the same muscle as Big Tech. We're looking for an Engineer who is passionate about scaling deep learning workloads, optimizing multi-GPU training,…
- harvey.ai (San Francisco, CA)
- …GenAI‑native applications - such as supporting high‑throughput model inference, managing streaming and long‑running API interactions, and designing abstractions ... today - and we're just getting started. Role Overview: As a Backend Platform Engineer at Harvey, you will help build and operate the cohesive backend platform that…
- Amazon (San Francisco, CA)
- …multi‑lingual large language models (LLM). AGI's mission is to leverage our hyper‑scalable, general‑purpose large model training and inference systems to build ... cluster and node management to ensure smooth operation of GenAI infrastructure. Continuously improve and automate cluster/capacity/maintenance upgrades. Troubleshoot…
- DataRobot (San Francisco, CA)
- …& Libraries, LLM Onboarding, Tools, Multi-Agent Evaluations, Multimodality, etc.) and GenAI systems (e.g., Inference optimization, Distributed Training, Finetuning, ... today and in the future. As a Principal Software Engineer for Generative AI at DataRobot, you will be the technical anchor for our GenAI Tooling and Systems teams, shaping the architecture, ensuring…
- Meta (Menlo Park, CA)
- …GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - Scaling / Performance Responsibilities: 1. Enabling reliable ... products and innovations to leverage our large-scale GPU training and inference fleet through an observable, reliable and high-performance distributed AI/GPU…
- Meta (Menlo Park, CA)
- …products and innovations to leverage our large-scale GPU training and inference fleet through an observable, reliable and high-performance distributed AI/GPU ... to improve the full-stack distributed ML reliability and performance (e.g., Large-Scale GenAI/LLM training) from the trainer down to the inter-GPU and network…