- TwelveLabs (San Francisco, CA)
- Senior Software Engineer, Backend at TwelveLabs. Base Pay Range: $145,000.00/yr - $182,000.00/yr. At TwelveLabs, we are pioneering the ... features like video search, generation, and embedding, integrated with model inference pipelines. Architect high‑throughput, service‑oriented backend systems…
- HP IQ (San Francisco, CA)
- …seamlessly integrating with cloud infrastructure. We are looking for a Senior Software Engineer to design and develop high‑performance, scalable services to ... edge devices. Optimize data pipelines and storage solutions for real‑time AI inference and processing. Implement security and privacy best practices for distributed…
- Capital One (San Francisco, CA)
- …develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, ... Senior Lead AI Engineer (AI Foundations, LLM Core and Agentic AI)...developing and applying state‑of‑the‑art techniques for optimizing training and inference software to improve hardware utilization, latency,…
- Applied Intuition (Washington, DC)
- …data generation techniques including simulation, diffusion, and Gaussian splats. Create inference software providing low-latency, real-time feedback to autonomy ... MLOps pipeline, improving ingestion and tooling, labeling and autolabeling, model architectures, training, evaluation and validation, inference-time…
- HP IQ (San Francisco, CA)
- …across a wide range of devices. We are looking for a Lead Machine Learning Engineer to focus on model development, optimisation, and deployment across edge and ... What You Might Do: Train, optimise, and deploy machine learning models for real‑time inference on edge devices. Develop and refine AI pipelines for efficient model…
- Expedia, Inc. (San Jose, CA)
- …including exciting travel perks, generous time-off, parental leave, a flexible work model (with some pretty cool offices), and career development resources, all to ... career journey. We're building a more open world. Join us. Machine Learning Engineer III. Introduction to the Team: Expedia Technology teams partner with our Product…
- WekaIO (San Francisco, CA)
- …into data pipelines that dramatically increase GPU utilization and make AI model training and inference, machine learning, and other compute‑intensive workloads ... agentic AI data infrastructure with a cloud and AI‑native software solution that can be deployed anywhere. It transforms...on this exciting journey. The Bay Area regional Sales Engineer will join our rapidly growing sales organization. Being…
- Assembled (San Francisco, CA)
- Machine Learning Engineer - Forecasting & Scheduling at Assembled. About Assembled: Great customer support requires human agents and AI in perfect balance, and ... Forecasting & Scheduling. Contact‑volume forecasting: data pipelines, statistical/ML models and inference services that predict ticket volumes, agent demand and time…
- Eluvio (Berkeley, CA)
- …C/C++/Go/Rust). * Demonstrated knowledge in API design and implementation for model inference services, ensuring scalable, reliable, and efficient integration ... focused and expert team of systems, networking, application, and video software engineers, AI scientists, ML engineers, and security specialists working together…
- Red Hat (Raleigh, NC)
- …models, and deliver innovative apps. The OpenShift AI team seeks a Software Engineer with Kubernetes and Model Inference Runtimes experience to join our ... packaging, such as PyPI libraries + Solid understanding of the fundamentals of model inference architectures + Experience with Jenkins, Git, shell scripting, and…
- Amazon (Seattle, WA)
- …The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon's ... with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and Acceleration team…
- Amazon (Cupertino, CA)
- …lifecycles along with work experience on some optimizations for improving the model execution. - Software development experience in C++, Python (experience in ... at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and...ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement…
- NVIDIA (Santa Clara, CA)
- …open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative AI models like LLMs, VLMs, multimodal and ... In this role, you will design, implement, and productionize model optimization algorithms for inference and deployment...you'll be doing: + Design and build modular, scalable model optimization software platforms that deliver exceptional…
- NVIDIA (CA)
- …can make a lasting impact on the world. We are now looking for a Senior System Software Engineer to work on user-facing tools for Dynamo Inference Server! ... NVIDIA is hiring software engineers for its GPU-accelerated deep learning software team, and we are a remote-friendly work environment. Academic and commercial…
- MongoDB (Palo Alto, CA)
- …Engineer, you'll focus on building core systems and services that power model inference at scale. You'll own key components of the infrastructure, work ... **About the Role** We're looking for a Senior Engineer to help build the next-generation inference...multi-tenant service design + Familiar with concepts in ML model serving and inference runtimes, even if…
- NVIDIA (Santa Clara, CA)
- NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for our growing team. As a key contributor, you will help design, build, and ... high-performance open-source frameworks, which are at the forefront of efficient large-scale model serving and inference. You will play a central role…
- NVIDIA (Santa Clara, CA)
- NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for our growing team. As a key contributor, you will help design, build, and ... vLLM, which are at the forefront of efficient large-scale model serving and inference. You will play...inference libraries, vLLM and SGLang, FlashInfer and LLM software solutions. + Work with cross-collaborative teams across frameworks,…
- Amazon (Cupertino, CA)
- …and efficiently on AWS silicon. We are seeking a Software Development Engineer to lead and architect our next-generation model serving infrastructure, with a ... Description: AWS Neuron is the software stack powering AWS Inferentia and Trainium machine...resilient AI infrastructure at AWS. We focus on developing model-agnostic inference innovations, including disaggregated serving, distributed…
- Amazon (Seattle, WA)
- …cloud-scale machine learning accelerators. This role is for a senior software engineer in the Machine Learning Inference Applications team. This role is ... Description: AWS Neuron is the complete software stack for the AWS Inferentia and Trainium...and performance optimization of core building blocks of LLM Inference - Attention, MLP, Quantization, Speculative Decoding, Mixture of…
- Red Hat (Boston, MA)
- …closely with our product and research teams to scale SOTA deep learning products and software. As an ML Ops engineer, you will work closely with our technical ... open-source LLMs and vLLM to every enterprise. The Red Hat Inference team accelerates AI for the enterprise and brings...the vLLM project, and inventors of state-of-the-art techniques for model compression, our team provides a stable platform for…