Senior TensorRT-LLM Engineer Jobs

32 jobs (page 1)

Senior Software Development Engineer…

NVIDIA Corporation (Santa Clara, CA)

We are now looking for a TensorRT-LLM Software Development Engineer! NVIDIA is hiring software engineers for its TensorRT-LLM team. Academic and ... high-quality (C++/Python) code for our core backend software for LLM inference. Improve the usability of the TensorRT-LLM library and build systems (CMake). What…

job goal (01/13/26)
Principal Software Engineer - Large-Scale…

NVIDIA Corporation (Santa Clara, CA)

Principal Software Engineer - Large-Scale LLM Memory and Storage Systems ... of any single GPU, this platform enables efficient, resilient deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the…

job goal (01/13/26)
Senior AI Engineer, NeMo Retriever…

NVIDIA Corporation (Santa Clara, CA)

Senior AI Engineer, NeMo Retriever - Model Optimization and MLOps ... inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices ... The NeMo Retriever team is looking for an AI Engineer to join our team, focusing on the intersection…

job goal (01/13/26)
Senior Software Engineer - AI/ML…

GEICO (Palo Alto, CA)

…Great Rewards and Great Careers. GEICO AI ML Infrastructure team is seeking an exceptional Senior ML Platform Engineer to build and scale our machine learning ... maintain feature stores for ML model training and inference pipelines. Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and…

job goal (01/12/26)
Senior Inference Platform Engineer…

Hamilton Barnes Associates Limited (San Francisco, CA)

…Integrate, tune, and operate inference engines such as vLLM, SGLang, and TensorRT-LLM across multiple model types. Develop APIs, orchestration layers, and ... orchestration. Practical experience with model-serving frameworks such as vLLM, SGLang, TensorRT-LLM, or custom PyTorch deployments. Knowledge of performance…

job goal (01/13/26)
Senior Software Engineer, Model…

Apple Inc. (San Francisco, CA)

Senior Software Engineer, Model Inference. San Francisco Bay Area, California, United States. Software and Services. Join Apple Maps to help build the best map in ... measurable results at global scale. Description: As a Software Engineer on the Apple Maps team, you will lead ... and Speculative Decoding. Skilled in GPU optimization (e.g., CUDA, TensorRT-LLM, cuDNN) to accelerate inference tasks. Skilled…

job goal (01/13/26)
Senior Software Development Engineer…

Amazon (San Francisco, CA)

Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference. Job ID: 3067759 | Amazon.com Services LLC. The Annapurna Labs team at Amazon Web Services ... responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive-scale large language models like the Llama…

job goal (01/13/26)
Senior Technical Marketing Engineer…

NVIDIA Corporation (Santa Clara, CA)

Senior Technical Marketing Engineer - AI Inference at Scale. Locations: US, ... power AI at scale. We are looking for a Senior Technical Marketing Engineer to join our ... JAX), and inference-specific frameworks & optimizations (Triton Inference Server, TensorRT-LLM, vLLM, SGLang). Market Awareness - Experience…

job goal (01/12/26)
Software Development Engineer, AI/ML, AWS…

Amazon (San Francisco, CA)

Software Development Engineer, AI/ML, AWS Neuron, Model Inference. The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development ... responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive-scale large language models like the Llama…

job goal (01/12/26)
Senior Deep Learning Communication…

NVIDIA Corporation (Santa Clara, CA)

…more deep neural network (DNN) training and inference frameworks, such as PyTorch, TensorRT-LLM, vLLM, SGLang. Strong programming skills in C++ and Python. ... our exclusive engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.…

job goal (01/12/26)
Software Architect, NIM Factory

NVIDIA Corporation (Santa Clara, CA)

…MS) or equivalent experience. Ways to stand out from the crowd: Hands-on with LLM inference stacks (Triton Inference Server, TensorRT-LLM, vLLM). ... GPUs. You will shape our strategy for emerging challenges like disaggregated LLM inference and safeguard the long-term technical health of the platform. What you'll…

job goal (01/13/26)
Senior Software Development Engineer…

NVIDIA (Santa Clara, CA)

We are now looking for a TensorRT-LLM Software Development Engineer! NVIDIA is hiring software engineers for its TensorRT-LLM team. Academic and ... core backend software for LLM inference. Improve the usability of the TensorRT-LLM library and build systems (CMake). What we need to see: Master's or…

NVIDIA (01/10/26)
Senior Deep Learning Software…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate ... learning community to implement the latest algorithms for public release in TensorRT LLM, vLLM, SGLang and LLM benchmarks. Identify performance opportunities…

NVIDIA (11/25/25)
Principal Software Engineer - Large-Scale…

NVIDIA (Santa Clara, CA)

…deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the vision and roadmap for memory management of large-scale ... large-scale LLM inference. Architect and implement deep integrations with leading LLM serving engines (such as vLLM, SGLang, TensorRT-LLM), with a…

NVIDIA (01/10/26)
Senior GenAI Algorithms Engineer…

NVIDIA (Santa Clara, CA)

…and streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative LLMs, ... (TensorRT Model Optimizer, Megatron-LM, Megatron-Bridge, Nvidia-NeMo, NeMo-AutoModel, TensorRT-LLM) and open-source frameworks (PyTorch, Hugging Face, vLLM,…

NVIDIA (12/18/25)
Senior AI Engineer, NeMo Retriever…

NVIDIA (Santa Clara, CA)

…on pre-optimized inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices optimize response latency ... The NeMo Retriever team is looking for an AI Engineer to join our team, focusing on the intersection ... deployments, etc. Familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM. Excellent…

NVIDIA (01/10/26)
Senior DL Algorithms Engineer…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... deploy, and optimize models for efficient inference using frameworks such as TensorRT, TensorRT-LLM, vLLM, and SGLang. Understand, analyze, profile, and…

NVIDIA (11/06/25)
Senior Deep Learning Algorithm…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... inference. Convert and deploy models using frameworks such as TensorRT and TensorRT-LLM. Understand, analyze, profile, and optimize performance of…

NVIDIA (11/06/25)
Senior Staff Machine Learning…

NVIDIA (Santa Clara, CA)

…Today, we are increasingly known as "the AI computing company." We are seeking a Senior Staff Machine Learning Engineer to join our Enterprise AI team and build ... frameworks such as PyTorch or TensorFlow; familiarity with CUDA-accelerated libraries (e.g., TensorRT-LLM) is a plus. Proven track record to take a significant…

NVIDIA (01/12/26)
Senior Performance Engineer - AI…

Red Hat (Boston, MA)

About the Job: The Red Hat Performance and Scale Engineering team is seeking a Senior Performance Engineer to join our PSAP (Performance and Scale for AI ... for example. This is a dynamic role for a seasoned engineer with a growth mindset who handles and adapts ... PyTorch Profiler, among others. Hands-on experience with modern LLM inference server stacks (e.g., vLLM, TensorRT-…

Red Hat (01/05/26)
