Senior TensorRT LLM Engineer Jobs in Menlo Park, CA

28 jobs (page 1)

Senior TensorRT-LLM…

NVIDIA Corporation (Santa Clara, CA)

A leading technology company is seeking a TensorRT-LLM Software Development Engineer. This role involves developing inferencing software for deep learning ... applications using C++ and Python, requiring a master's degree and experience in software development. Applicants should be proactive and have solid technical skills, particularly in C/C++ programming and AI frameworks like TensorFlow and PyTorch.…

job goal (01/13/26)
Staff Software Engineer - AI/ML Infra

GEICO (Palo Alto, CA)

…and Great Careers. GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Platform Engineer to build and scale our machine learning ... GEICO. For more information, please ... Staff Software Engineer - AI/ML Infra ... ML model training and inference pipelines. Build and optimize LLM inference systems using frameworks like vLLM, TensorRT…

job goal (01/12/26)
Principal Software Engineer - Large-Scale…

NVIDIA Corporation (Santa Clara, CA)

Principal Software Engineer - Large-Scale LLM Memory and Storage Systems ... of any single GPU, this platform enables efficient, resilient deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the…

job goal (01/13/26)
Senior AI Engineer, NeMo Retriever…

NVIDIA Corporation (Santa Clara, CA)

Senior AI Engineer, NeMo Retriever - Model Optimization and MLOps ... inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices ... The NeMo Retriever team is looking for an AI Engineer to join our team, focusing on the intersection…

job goal (01/13/26)
Senior Technical Marketing Engineer…

NVIDIA Corporation (Santa Clara, CA)

Senior Technical Marketing Engineer - AI Inference at Scale. Locations: US, ... power AI at scale. We are looking for a Senior Technical Marketing Engineer to join our ... JAX), and inference-specific frameworks & optimizations (Triton Inference Server, TensorRT-LLM, vLLM, SGLang). Market Awareness - Experience…

job goal (01/12/26)
Senior Deep Learning Communication…

NVIDIA Corporation (Santa Clara, CA)

…more deep neural network (DNN) training and Inference frameworks, such as PyTorch, TensorRT-LLM, vLLM, SGLang. Strong programming skills in C++ and Python. ... our exclusive engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.…

job goal (01/12/26)
Senior Inference Platform Engineer…

Hamilton Barnes Associates Limited (San Francisco, CA)

…Integrate, tune, and operate inference engines such as vLLM, SGLang, and TensorRT-LLM across multiple model types. Develop APIs, orchestration layers, and ... orchestration. Practical experience with model-serving frameworks such as vLLM, SGLang, TensorRT-LLM, or custom PyTorch deployments. Knowledge of performance…

job goal (01/13/26)
Senior Software Engineer, Model…

Apple Inc. (San Francisco, CA)

Senior Software Engineer, Model Inference. San Francisco Bay Area, California, United States. Software and Services. Join Apple Maps to help build the best map in ... measurable results at global scale. Description: As a Software Engineer on the Apple Maps team, you will lead ... and Speculative Decoding. Skilled in GPU optimization (e.g., CUDA, TensorRT-LLM, cuDNN) to accelerate inference tasks. Skilled…

job goal (01/13/26)
Senior Software Development Engineer…

Amazon (San Francisco, CA)

Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference. Job ID: 3067759 | Amazon.com Services LLC. The Annapurna Labs team at Amazon Web Services ... responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive scale large language models like the Llama…

job goal (01/13/26)
Software Architect, NIM Factory

NVIDIA Corporation (Santa Clara, CA)

…MS) or equivalent experience. Ways to stand out from the crowd: Hands-on with LLM inference stacks (Triton Inference Server, TensorRT-LLM, vLLM). ... GPUs. You will shape our strategy for emerging challenges like disaggregated LLM inference and safeguard the long-term technical health of the platform. What you'll…

job goal (01/13/26)
Software Development Engineer, AI/ML, AWS…

Amazon (San Francisco, CA)

Software Development Engineer, AI/ML, AWS Neuron, Model Inference. The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development ... responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive scale large language models like the Llama…

job goal (01/12/26)
Senior Software Development Engineer…

NVIDIA (Santa Clara, CA)

We are now looking for a TensorRT-LLM Software Development Engineer! NVIDIA is hiring software engineers for its TensorRT-LLM team. Academic and ... core backend software for LLM inference. + Improve the usability of the TensorRT-LLM library and build systems (CMake). What we need to see: + Masters or…

NVIDIA (01/10/26)
Senior Deep Learning Software…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate ... learning community to implement the latest algorithms for public release in TensorRT LLM, VLLM, SGLang and LLM benchmarks. Identify performance opportunities…

NVIDIA (11/25/25)
Principal Software Engineer - Large-Scale…

NVIDIA (Santa Clara, CA)

…deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the vision and roadmap for memory management of large-scale ... large-scale LLM inference. + Architect and implement deep integrations with leading LLM serving engines (such as vLLM, SGLang, TensorRT-LLM), with a…

NVIDIA (01/10/26)
Senior GenAI Algorithms Engineer…

NVIDIA (Santa Clara, CA)

…and streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative LLMs, ... (TensorRT Model Optimizer, Megatron-LM, Megatron-Bridge, Nvidia-NeMo, NeMo-AutoModel, TensorRT-LLM) and open-source frameworks (PyTorch, Hugging Face, vLLM,…

NVIDIA (12/18/25)
Senior AI Engineer, NeMo Retriever…

NVIDIA (Santa Clara, CA)

…on pre-optimized inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices optimize response latency ... The NeMo Retriever team is looking for an AI Engineer to join our team, focusing on the intersection ... deployments, etc. + Familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM. + Excellent…

NVIDIA (01/10/26)
Senior DL Algorithms Engineer…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... deploy, and optimize models for efficient inference using frameworks such as TensorRT, TensorRT-LLM, vLLM, and SGLang. + Understand, analyze, profile, and…

NVIDIA (11/06/25)
Senior Deep Learning Algorithm…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... inference. + Convert and deploy models using frameworks such as TensorRT and TensorRT-LLM + Understand, analyze, profile, and optimize performance of…

NVIDIA (11/06/25)
Senior Staff Machine Learning…

NVIDIA (Santa Clara, CA)

…Today, we are increasingly known as "the AI computing company." We are seeking a Senior Staff Machine Learning Engineer to join our Enterprise AI team and build ... frameworks such as PyTorch or TensorFlow; familiarity with CUDA-accelerated libraries (e.g., TensorRT-LLM) is a plus. + Proven track record to take a significant…

NVIDIA (01/12/26)
AI Senior Staff Systems Engineer

Cadence Design Systems, Inc. (San Jose, CA)

…quantization, distillation, and using high-performance serving frameworks (e.g., vLLM, TGI, TensorRT-LLM) to maximize inference throughput and minimize latency. + ... implementing CI/CD pipelines for AI model development. + Advanced LLM Deployment & Optimization: Lead the deployment, serving, and ... AI infrastructure. Proven track record as a Principal or Senior Staff Engineer. + Expert-level knowledge of…

Cadence Design Systems, Inc. (12/29/25)