- DatologyAI (Redwood City, CA)
- …looking for an engineer with deep experience building and operating large-scale training and inference systems. You will design, implement, and maintain the… researchers to productionize new models and features quickly and safely. Optimize training and inference pipelines for performance, reliability, and cost. Ensure…
- Amazon (San Francisco, CA)
- Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference. Job ID: 3067759 | Amazon.com Services LLC. The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and… ML frameworks like PyTorch and JAX, enabling unparalleled ML inference and training performance. The Inference…
- Menlo Ventures (San Francisco, CA)
- About This Role: As a software engineer for GenAI inference, you will help design, develop, and optimize the inference engine that powers Databricks'… are fast, scalable, and efficient. Your work will touch the full GenAI inference stack, from kernels and runtimes to orchestration and memory management. What…
- OpenAI (San Francisco, CA)
- About the Team: Our Inference team brings OpenAI's most capable research and… About the Role: We're looking for a senior engineer to design and build the load balancer that… and low-latency connection management. Have 5+ years of experience as a software engineer and systems architect working on high-scale, high-reliability…
- OpenAI (San Francisco, CA)
- About the Team: OpenAI's Inference team powers the deployment of our most… tighter coordination with product and research. About the Role: We're looking for a software engineer to help us serve OpenAI's multimodal models at scale. You'll… work is inherently cross-functional: you'll collaborate directly with researchers training these models and with product teams defining new…
- Amazon (San Francisco, CA)
- Software Development Engineer, AI/ML, AWS Neuron, Model Inference. The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software… integrates with popular ML frameworks like PyTorch and JAX, enabling unparalleled ML inference and training performance. The Inference Enablement and…
- Hamilton Barnes Associates Limited (San Francisco, CA)
- Our client operates high-performance GPU clusters powering… the most advanced AI workloads worldwide. …of H100s, H200s, and B200s, ready to go for experimentation, full-scale model training, or inference. They're now building a serverless inference platform, beginning with cost-efficient batch inference and expanding into…
- Capital One (San Francisco, CA)
- …in engineering and mathematics, and your expertise in hardware, software, and AI enable you to see and exploit… developing AI and ML algorithms or technologies (e.g., LLM Inference, Similarity Search and VectorDBs, Guardrails, Memory) using Python, …Java, or Golang. Experience developing and applying state-of-the-art techniques for optimizing training and inference software to improve hardware…
- F. Hoffmann-La Roche AG (South San Francisco, CA)
- Senior/Principal Software Engineer, AI Enablement (Full stack). …optimise workflows. We also work on scaling up model training and inference, evaluating the quality of… to meet the scientific needs. The Opportunity: As a software engineer in AI Enablement with a…
- Crusoe Energy Systems LLC (San Francisco, CA)
- About This Role: The Crusoe Cloud Managed AI team seeks an ambitious and experienced Senior Software Engineer to join their team. You'll have a pivotal role in shaping… large-scale, production-level services. (Preferred) Familiarity with AI infrastructure, including training, inference, and ETL pipelines. (Preferred) Contributions…
- quadric.io, Inc (Burlingame, CA)
- …general purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware is targeted to run neural network… executes both NN graph code and conventional C++ DSP and control code. Role: The AI Inference Engineer at Quadric is the key bridge between the world of AI/LLM… models and Quadric unique platforms. The AI Inference Engineer at Quadric will [1] port…
- Crusoe Energy Systems LLC (San Francisco, CA)
- About This Role: As a Senior Staff Software Engineer on the Managed AI… shaping the architecture and scalability of our next-generation AI inference platform. You will lead the design and implementation… Generative AI (Large Language Models, Multimodal). Familiarity with AI infrastructure, including training, inference, and ETL pipelines. Software Engineering…
- Menlo Ventures (San Francisco, CA)
- About This Role: As a staff software engineer for GenAI Performance and Kernel, you will own the design, implementation, optimization, and correctness of the… high-performance GPU kernels powering our GenAI inference stack. You will lead development of highly-tuned, low-level compute paths, manage trade-offs between…
- Rockstar (San Francisco, CA)
- …promise is simple: they make your AI system better. They are hiring a Backend Software Engineer (ML Infrastructure) to help design, build, and scale the core… ML workloads, including fine-tuning and reinforcement learning. Build distributed training and inference pipelines that are efficient, fault-tolerant,…
- Amazon (San Francisco, CA)
- …rapid pace. We are looking for a Machine Learning Engineer (MLE) to join the team to drive key… model, such as 1) model data pipelines: building data pipelines to produce inputs for training and inference in both online and offline contexts; 2) Training and inference pipelines: orchestration of model training and inference jobs; 3) Post-inference…
- Voxel (San Francisco, CA)
- …training jobs, mixed-precision optimizations, or TensorRT/Triton inference. Familiarity with active-learning, continuous-training, or online distillation… by industry-leading VCs. Voxel is looking for a Staff Machine-Learning Infrastructure Engineer to drive the next wave of our computer-vision platform for workplace…
- Databricks Inc. (San Francisco, CA)
- …fine-tuned and proprietary large language models. It offers real-time, low-latency inference, governance, monitoring, and lineage. As AI adoption accelerates, Model… operationalize models at scale with strong SLAs and cost efficiency. As a Staff Engineer, you'll play a critical role in shaping both the product experience and the…
- Lodestar (San Francisco, CA)
- …end-to-end, autonomous in-space bodyguarding service. About the Job: At Lodestar, as a Software Engineer - Localization, State Estimation & Prediction, you'll be… Proficiency with GPU acceleration (CUDA, TensorRT) for neural network training and inference. Experience with Linux, Git,… more about the ITAR here. Pay Range: (E1) Junior Software Engineer: $120,000 - $140,000 / year…
- OpenAI (San Francisco, CA)
- …kernels, distributed system optimizations, and runtime improvements to make large-scale training and inference more efficient. Our work enables OpenAI… up new compute platforms that can support large-scale AI training and inference. Your work will range from prototyping system software on new accelerators to enabling performance optimizations across…