ML Engineer vLLM Inference Jobs | Juju

Senior Principal Machine Learning Engineer…

Red Hat (Boston, MA)

…is open and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and ... optimize, and scale LLM deployments. As a Machine Learning Engineer focused on distributed vLLM (https://github.com/ vllm...components in Go and/or Rust to integrate with the vLLM project and manage distributed inference workloads.… more

Red Hat (01/08/26)
Machine Learning Engineer, vLLM…

Red Hat (Raleigh, NC)

…open, and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise. The Red Hat Inference team accelerates AI for the enterprise ... optimize, and scale LLM deployments. As a Machine Learning Engineer focused on vLLM, you will be...you. Join us in shaping the future of AI Inference! **What You Will Do** + Write robust Python… more

Red Hat (12/31/25)
Senior Principal Machine Learning Engineer…

Red Hat (Boston, MA)

…is open and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and ... project (https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/#the-top-open-source-projects-by-contributors) on GitHub. As a Machine Learning Engineer focused on vLLM, you will… more

Red Hat (01/08/26)
Senior Software Engineer - vLLM…

Red Hat (Boston, MA)

…is open and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and ... to GenAI deployments. As leading developers, maintainers of the vLLM project, and inventors of state-of-the-art techniques for model...scale LLM deployments. We are seeking an experienced Senior ML Ops engineer to work closely with… more

Red Hat (12/06/25)
Staff ML Engineer, Inference…

General Motors (Sunnyvale, CA)

…Python, C++ or other relevant coding languages. + Expertise in ML inference, model serving frameworks (Triton, Ray Serve, vLLM, etc.). + Strong communication ... is eligible for relocation assistance.** **About the Team:** The ML Inference Platform is part of the...efficiency. **About the Role:** We are seeking a Staff ML Infrastructure engineer to help build and… more

General Motors (10/21/25)
Senior Software Development Engineer, AI/…

Amazon (Seattle, WA)

…integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... collaborate across teams to develop innovative optimization techniques * Build online/offline inference serving with vLLM, SGLang, TensorRT or similar platforms… more

Amazon (01/06/26)
Software Development Engineer - AI/…

Amazon (Seattle, WA)

…integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... with syntax and tile-level semantics similar to Triton. - Experience with online/offline inference serving with vLLM, SGLang, TensorRT or similar platforms in… more

Amazon (12/31/25)
Software Development Engineer AI/ML…

Amazon (Cupertino, CA)

…the boundaries of what's possible in large-scale ML serving. Recent shares: https://github.com/aws-neuron/upstreaming-to-vllm/releases/tag/2.25.0 ... - Master's degree in computer science or equivalent - Deep expertise in ML Frameworks/Libraries such as JAX, PyTorch, vLLM, SGLang, Dynamo, TorchXLA, TensorRT.… more

Amazon (12/21/25)
Senior Software Engineer, AI…

NVIDIA (Santa Clara, CA)

…and passionate about performance engineering in ML frameworks (e.g., PyTorch) and inference engines (e.g., vLLM and SGLang). + Familiarity with GPU programming ... latest NVIDIA GPU hardware features; profile and optimize the inference framework (vLLM) with methods like speculative...building and optimizing LLM inference engines (e.g., vLLM, SGLang). + Hands-on work with ML … more

NVIDIA (01/10/26)
Lead Engineer, Inference Platform

MongoDB (Palo Alto, CA)

…in multi-tenant environments + 1+ years of experience serving as TL for a large-scale ML inference or training platform SW project **Nice to Have** + Prior ... We're looking for a Lead Engineer, Inference Platform to join our...of experience in managing a technical team focused on ML inference or training infrastructure **Why Join… more

MongoDB (12/27/25)
Senior Software Engineer, Inference…

MongoDB (Palo Alto, CA)

**About the Role** We're looking for a Senior Engineer to help build the next-generation inference platform that supports embedding models used for semantic ... Atlas and designed for developer-first experiences. As a Senior Engineer, you'll focus on building core systems and services...a cloud-native environment + Work across product, infrastructure, and ML teams to ensure the inference platform… more

MongoDB (01/08/26)
Senior GenAI Algorithms Engineer - Model…

NVIDIA (Santa Clara, CA)

…open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative AI models like LLMs, VLMs, multimodal and ... PyTorch, JAX, vLLM, SGLang, or other machine learning training and inference frameworks. + Hands-on experience training or fine-tuning generative AI models on… more

NVIDIA (01/10/26)
Senior Software Engineer, AI…

NVIDIA (CA)

…of modern ML architectures with a keen intuition for optimizing inference performance. + Take full ownership of problems end-to-end, proactively acquiring any ... We are now looking for a Senior System Software Engineer to work on user facing tools for Dynamo... to work on user facing tools for Dynamo Inference Server! NVIDIA is hiring software engineers for its… more

NVIDIA (11/29/25)
Senior Technical Marketing Engineer - AI…

NVIDIA (Santa Clara, CA)

…(PyTorch, TensorFlow, JAX), and inference-specific frameworks & optimizations (Triton Inference Server, TensorRT-LLM, vLLM, SGLang). + Market Awareness - ... to power AI at scale. We are looking for a Senior Technical Marketing Engineer to join our growing accelerated computing product team. This role is pivotal in… more

NVIDIA (11/06/25)
Senior Engineer - AI Inference

Bank of America (Addison, TX)

Senior Engineer - AI Inference Addison, Texas; Plano, Texas; Newark, Delaware; Charlotte, North Carolina; Kennesaw, Georgia **To proceed with your application, you ... must be at least 18 years of age.** Acknowledge (https://ghr.wd1.myworkdayjobs.com/Lateral-US/job/Addison/Senior-Engineer-AI-Inference_25029879) **Job Description:** At Bank of America,… more

Bank of America (12/22/25)
AI / ML Engineer

Guidehouse (Huntsville, AL)

…to 10% **Clearance Required:** Active Top Secret (TS) Guidehouse is seeking a Lead AI/ML Engineer to join our Technology / AI and Data team, supporting ... You Will Do:** + Serves as the lead AI/ML engineer responsible for developing, optimizing, and...to ensure accuracy and stability. + Implement distributed GPU inference frameworks (vLLM, TGI, DeepSpeed, SageMaker) and… more

Guidehouse (01/01/26)
Sr. Software Engineer - AI/ML, AWS…

Amazon (Seattle, WA)

…AWS's next-generation AI accelerators Inferentia and Trainium. As a Senior Software Engineer in our Machine Learning Applications team, you'll be at the forefront ... AI models at unprecedented scale. What You'll Impact: * Pioneer distributed inference solutions for industry-leading LLMs such as GPT, Llama, Qwen * Optimize… more

Amazon (10/31/25)
Staff Software Engineer, ML Serving…

DoorDash (San Francisco, CA)

…or operating large-scale, high-QPS ML serving systems. + Bring deep familiarity with ML inference and serving ecosystems. + Know how to leverage and extend ... About the Role We're looking for a Staff Software Engineer with deep expertise in ML model...model serving to drive the next generation of our inference platform. This is a highly technical, hands-on role:… more

DoorDash (11/24/25)
Senior Deep Learning Software Engineer…

NVIDIA (Santa Clara, CA)

…or using deep learning frameworks (e.g., PyTorch, JAX, TensorFlow, ONNX, etc.) and ideally inference engines and runtimes such as vLLM, SGLang, and MLC. + Strong ... are now looking for a Senior Deep Learning Software Engineer, FlashInfer. NVIDIA has been transforming computer graphics, PC...and training (e.g., FlashInfer, Flash Attention) + Expertise in inference engines like vLLM and SGLang +… more

NVIDIA (11/01/25)
Senior Performance Engineer - AI Platforms

Red Hat (Boston, MA)

…Nsight Systems, PyTorch Profiler, among others + Hands-on experience with modern LLM inference server stacks (e.g., vLLM, TensorRT-LLM, TGI, Triton Inference ... and Scale Engineering team is seeking a Senior Performance Engineer to join our PSAP (Performance and Scale for...you will drive the performance and scalability of distributed inference for Large Language Models (LLMs) as part of… more

Red Hat (01/05/26)