Senior Model Inference Engineer Jobs

163 jobs (page 1)

Senior Lead AI Engineer (AI…

Capital One (San Francisco, CA)

…Design, develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, ... Senior Lead AI Engineer (AI Foundations,...developing AI and ML algorithms or technologies (e.g., LLM Inference, Similarity Search and VectorDBs, Guardrails, Memory) using Python,… more

job goal (01/13/26)
Senior Software Engineer, Backend

TwelveLabs (San Francisco, CA)

…features like video search, generation, and embedding, integrated with model inference pipelines. Architect high‑throughput, service‑oriented backend systems ... Join to apply for the Senior Software Engineer, Backend role at TwelveLabs. Base Pay Range $145,000.00/yr - $182,000.00/yr At TwelveLabs, we are pioneering the… more

job goal (01/13/26)
Senior AI Software Engineer

Hp Iq (San Francisco, CA)

…while seamlessly integrating with cloud infrastructure. We are looking for a Senior Software Engineer to design and develop high‑performance, scalable services ... edge devices. Optimize data pipelines and storage solutions for real‑time AI inference and processing. Implement security and privacy best practices for distributed… more

job goal (01/13/26)
Senior Sales Engineer (Bay Area)…

WekaIO (San Francisco, CA)

…into data pipelines that dramatically increase GPU utilization and make AI model training and inference, machine learning, and other compute‑intensive workloads ... on this exciting journey. The Bay Area regional Sales Engineer will join our rapidly growing sales organization. Being...our rapidly growing sales organization. Being a WEKA Sales Engineer will require you to partner with customers, understanding… more

job goal (01/13/26)
Machine Learning Engineer

Bland (San Francisco, CA)

…YC, the founders of Twilio, Affirm, ElevenLabs, and many more. About The Role As a Senior ML Engineer at Bland, you'll own the intelligence behind our voice AI ... and reduce latency. Optimize for enterprise scale: Handle complex inference optimization challenges: model quantization, efficient serving architectures, and… more

job goal (01/13/26)
Machine Learning Engineer - Forecasting…

Assembled (San Francisco, CA)

Machine Learning Engineer - Forecasting & Scheduling at Assembled About Assembled Great customer support requires human agents and AI in perfect balance, and ... Forecasting & Scheduling Contact‑volume forecasting: data pipelines, statistical/ML models and inference services that predict ticket volumes, agent demand and time… more

job goal (01/13/26)
Machine Learning Infrastructure Engineer

Workshop Labs (San Francisco, CA)

…data. Our core ML systems challenge: how do we serve the world's best personal model, at low cost and high speed, with bulletproof privacy? What you'll do Build the ... finetuned models for our customers Monitor & optimize in-the-wild model serving performance to hit low latency & cost...us-and integrate the privacy architecture with the finetuning & inference code You have A deep understanding of the… more

job goal (01/13/26)
Staff Data Scientist / Machine Learning…

Faire (San Francisco, CA)

…and willingness to learn new tools and techniques Demonstrated ability to mentor Senior Data Scientists, develop team strategy, and independently lead model ... in supervised fine tuning of multi-modal LLMs Experience deploying and optimizing LLM inference systems at scale (10B+ tokens), with focus on cost efficiency and… more

job goal (01/13/26)
Senior GenAI Algorithms Engineer…

NVIDIA (Santa Clara, CA)

…streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative ... and diffusion models. In this role, you will design, implement, and productionize model optimization algorithms for inference and deployment on NVIDIA's latest… more

NVIDIA (01/10/26)
Senior Software Development Engineer…

Amazon (Seattle, WA)

…with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and Acceleration team ... culture. The team works closely with customers on their model enablement, providing direct support and optimization expertise to...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side… more

Amazon (01/06/26)
Senior Software Development Engineer…

Amazon (Cupertino, CA)

…with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and Acceleration team ... culture. The team works closely with customers on their model enablement, providing direct support and optimization expertise to...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side… more

Amazon (11/05/25)
Senior Software Engineer…

MongoDB (Palo Alto, CA)

**About the Role** We're looking for a Senior Engineer to help build the next-generation inference platform that supports embedding models used for semantic ... with Atlas and designed for developer-first experiences. As a Senior Engineer, you'll focus on building core...focus on building core systems and services that power model inference at scale. You'll own key… more

MongoDB (01/08/26)
Senior Software Engineer, AI…

NVIDIA (CA)

…how you can make a lasting impact on the world. We are now looking for a Senior System Software Engineer to work on user facing tools for Dynamo Inference ... data scientists. What you'll be doing: + Build and maintain distributed model management systems, including Rust-based runtime components, for large-scale AI … more

NVIDIA (11/29/25)
Senior Principal Machine Learning…

Red Hat (Boston, MA)

…bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI ... the vLLM and LLM-D projects, and inventors of state-of-the-art techniques for model quantization and sparsification, our team provides a stable platform for… more

Red Hat (01/08/26)
Senior DL Algorithms Engineer…

NVIDIA (Santa Clara, CA)

…leads the AI revolution. What you will be doing: + Implement language and multimodal model inference as part of NVIDIA Inference Microservices (NIMs). + ... We are now looking for a Senior DL Algorithms Engineer! NVIDIA is...bugs and deliver production code to TRT-LLM, NVIDIA's open-source inference serving library. + Profile and analyze bottlenecks across… more

NVIDIA (01/08/26)
Senior Deep Learning Software…

NVIDIA (Santa Clara, CA)

NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for our growing team. As a key contributor, you will help design, build, and ... frameworks, which are at the forefront of efficient large-scale model serving and inference. You will play...are growing fast. If you're a creative and autonomous engineer with a genuine passion for technology, we want… more

NVIDIA (12/07/25)
Senior Deep Learning Software…

NVIDIA (Santa Clara, CA)

NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for our growing team. As a key contributor, you will help design, build, and ... SGLang and vLLM, which are at the forefront of efficient large-scale model serving and inference. You will play a central role in improving these platforms,… more

NVIDIA (12/05/25)
Senior Engineer -AI Inference

Bank of America (Addison, TX)

Senior Engineer -AI Inference Addison, Texas; Plano, Texas; Newark, Delaware; Charlotte, North Carolina; Kennesaw, Georgia **To proceed with your application, ... must be at least 18 years of age.** Acknowledge (https://ghr.wd1.myworkdayjobs.com/Lateral-US/job/Addison/Senior-Engineer-AI-Inference_25029879) **Job Description:** At Bank… more

Bank of America (12/22/25)
Senior Software Engineer - vLLM…

Red Hat (Boston, MA)

…for enterprises to build, optimize, and scale LLM deployments. We are seeking an experienced Senior ML Ops engineer to work closely with our product and research ... open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and brings...deep learning products and software. As an ML Ops engineer, you will work closely with our technical and… more

Red Hat (12/06/25)
Senior Principal Machine Learning…

Red Hat (Boston, MA)

…bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI ... maintainers of the vLLM project, and inventors of state-of-the-art techniques for model quantization and sparsification, our team provides a stable platform for… more

Red Hat (01/08/26)