• NVIDIA Corporation (Santa Clara, CA)
    A leading technology company is seeking a TensorRT-LLM Software Development Engineer. This role involves developing inferencing software for deep learning ... applications using C++ and Python, requiring a master's degree and experience in software development. Applicants should be proactive and have solid technical skills, particularly in C/C++ programming and AI frameworks like TensorFlow and PyTorch.…
    job goal (01/13/26)
  • NVIDIA Corporation (Santa Clara, CA)
    We are now looking for a TensorRT-LLM Software Development Engineer! NVIDIA is hiring software engineers for its TensorRT-LLM team. Academic and ... high-quality (C++/Python) code for our core backend software for LLM inference. Improve the usability of the TensorRT-LLM library and build systems (CMake) What…
    job goal (01/13/26)
  • NVIDIA Corporation (Santa Clara, CA)
    Principal Software Engineer - Large-Scale LLM Memory and Storage Systems ... of any single GPU, this platform enables efficient, resilient deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the…
    job goal (01/13/26)
  • NVIDIA Corporation (Santa Clara, CA)
    Senior AI Engineer, NeMo Retriever - Model Optimization and MLOps ... inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices ... The NeMo Retriever team is looking for an AI Engineer to join our team, focusing on the intersection…
    job goal (01/13/26)
  • GEICO (Palo Alto, CA)
    …Great Rewards and Great Careers. GEICO AI ML Infrastructure team is seeking an exceptional Senior ML Platform Engineer to build and scale our machine learning ... maintain feature stores for ML model training and inference pipelines * Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and…
    job goal (01/12/26)
  • Hamilton Barnes Associates Limited (San Francisco, CA)
    …Integrate, tune, and operate inference engines such as vLLM, SGLang, and TensorRT-LLM across multiple model types. Develop APIs, orchestration layers, and ... orchestration. Practical experience with model-serving frameworks such as vLLM, SGLang, TensorRT-LLM, or custom PyTorch deployments. Knowledge of performance…
    job goal (01/13/26)
  • Apple Inc. (San Francisco, CA)
    Senior Software Engineer, Model Inference. San Francisco Bay Area, California, United States. Software and Services. Join Apple Maps to help build the best map in ... measurable results at global scale. Description: As a Software Engineer on the Apple Maps team, you will lead ... and Speculative Decoding. Skilled in GPU optimization (e.g., CUDA, TensorRT-LLM, cuDNN) to accelerate inference tasks. Skilled…
    job goal (01/13/26)
  • NVIDIA Corporation (Santa Clara, CA)
    Senior Technical Marketing Engineer - AI Inference at Scale. Locations: US, ... power AI at scale. We are looking for a Senior Technical Marketing Engineer to join our ... JAX), and inference-specific frameworks & optimizations (Triton Inference Server, TensorRT-LLM, vLLM, SGLang). * Market Awareness - Experience…
    job goal (01/12/26)
  • Amazon (San Francisco, CA)
    Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference. Job ID: 3067759 | Amazon.com Services LLC. The Annapurna Labs team at Amazon Web Services ... responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive-scale large language models like the Llama…
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Software Development Engineer, AI/ML, AWS Neuron, Model Inference. The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development ... responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive-scale large language models like the Llama…
    job goal (01/12/26)
  • NVIDIA Corporation (Santa Clara, CA)
    …more deep neural network (DNN) training and inference frameworks, such as PyTorch, TensorRT-LLM, vLLM, SGLang. * Strong programming skills in C++ and Python. * ... our exclusive engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.…
    job goal (01/12/26)
  • NVIDIA Corporation (Santa Clara, CA)
    …MS) or equivalent experience. Ways to stand out from the crowd: Hands-on with LLM inference stacks (Triton Inference Server, TensorRT-LLM, vLLM). * ... GPUs. You will shape our strategy for emerging challenges like disaggregated LLM inference and safeguard the long-term technical health of the platform. What you'll…
    job goal (01/13/26)
  • Senior Software Development Engineer

    NVIDIA (Santa Clara, CA)
    We are now looking for a TensorRT-LLM Software Development Engineer! NVIDIA is hiring software engineers for its TensorRT-LLM team. Academic and ... core backend software for LLM inference. + Improve the usability of the TensorRT-LLM library and build systems (CMake) What we need to see: + Masters or…
    NVIDIA (01/10/26)
  • Senior Deep Learning Software…

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate ... learning community to implement the latest algorithms for public release in TensorRT LLM, vLLM, SGLang and LLM benchmarks. Identify performance opportunities…
    NVIDIA (11/25/25)
  • Principal Software Engineer - Large-Scale…

    NVIDIA (Santa Clara, CA)
    …deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the vision and roadmap for memory management of large-scale ... large-scale LLM inference. + Architect and implement deep integrations with leading LLM serving engines (such as vLLM, SGLang, TensorRT-LLM), with a…
    NVIDIA (01/10/26)
  • Senior GenAI Algorithms Engineer

    NVIDIA (Santa Clara, CA)
    …and streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative LLMs, ... (TensorRT Model Optimizer, Megatron-LM, Megatron-Bridge, Nvidia-NeMo, NeMo-AutoModel, TensorRT-LLM) and open-source frameworks (PyTorch, Hugging Face, vLLM,…
    NVIDIA (12/18/25)
  • Senior AI Engineer , NeMo Retriever…

    NVIDIA (Santa Clara, CA)
    …on pre-optimized inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices optimize response latency ... The NeMo Retriever team is looking for an AI Engineer to join our team, focusing on the intersection ... deployments, etc. + Familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM. + Excellent…
    NVIDIA (01/10/26)
  • Senior DL Algorithms Engineer

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... deploy, and optimize models for efficient inference using frameworks such as TensorRT, TensorRT-LLM, vLLM, and SGLang. + Understand, analyze, profile, and…
    NVIDIA (11/06/25)
  • Senior Deep Learning Algorithm…

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... inference. + Convert and deploy models using frameworks such as TensorRT and TensorRT-LLM + Understand, analyze, profile, and optimize performance of…
    NVIDIA (11/06/25)
  • Senior Staff Machine Learning…

    NVIDIA (Santa Clara, CA)
    …Today, we are increasingly known as "the AI computing company." We are seeking a Senior Staff Machine Learning Engineer to join our Enterprise AI team and build ... frameworks such as PyTorch or TensorFlow; familiarity with CUDA-accelerated libraries (e.g., TensorRT-LLM) is a plus. + Proven track record to take a significant…
    NVIDIA (01/12/26)