  • NVIDIA Corporation (Santa Clara, CA)
    A leading technology company is seeking a TensorRT-LLM Software Development Engineer. This role involves developing inferencing software for deep learning ... applications using C++ and Python, requiring a master's degree and experience in software development. Applicants should be proactive and have solid technical skills, particularly in C/C++ programming and AI frameworks like TensorFlow and PyTorch.…
    job goal (01/13/26)
  • GEICO (Palo Alto, CA)
    …and Great Careers. GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Platform Engineer to build and scale our machine learning ... GEICO. For more information, please ... Staff Software Engineer - AI/ML Infra ... ML model training and inference pipelines. Build and optimize LLM inference systems using frameworks like vLLM, TensorRT…
    job goal (01/12/26)
  • NVIDIA Corporation (Santa Clara, CA)
    Principal Software Engineer - Large-Scale LLM Memory and Storage Systems ... of any single GPU, this platform enables efficient, resilient deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the…
    job goal (01/13/26)
  • NVIDIA Corporation (Santa Clara, CA)
    Senior AI Engineer, NeMo Retriever - Model Optimization and MLOps ... inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices ... The NeMo Retriever team is looking for an AI Engineer to join our team, focusing on the intersection…
    job goal (01/13/26)
  • NVIDIA Corporation (Santa Clara, CA)
    Senior Technical Marketing Engineer - AI Inference at Scale. Locations: US, ... power AI at scale. We are looking for a Senior Technical Marketing Engineer to join our ... JAX), and inference-specific frameworks & optimizations (Triton Inference Server, TensorRT-LLM, vLLM, SGLang). Market Awareness - Experience…
    job goal (01/12/26)
  • NVIDIA Corporation (Santa Clara, CA)
    …more deep neural network (DNN) training and inference frameworks, such as PyTorch, TensorRT-LLM, vLLM, SGLang. Strong programming skills in C++ and Python. ... our exclusive engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.…
    job goal (01/12/26)
  • Hamilton Barnes Associates Limited (San Francisco, CA)
    …Integrate, tune, and operate inference engines such as vLLM, SGLang, and TensorRT-LLM across multiple model types. Develop APIs, orchestration layers, and ... orchestration. Practical experience with model-serving frameworks such as vLLM, SGLang, TensorRT-LLM, or custom PyTorch deployments. Knowledge of performance…
    job goal (01/13/26)
  • Apple Inc. (San Francisco, CA)
    Senior Software Engineer, Model Inference. San Francisco Bay Area, California, United States. Software and Services. Join Apple Maps to help build the best map in ... measurable results at global scale. Description: As a Software Engineer on the Apple Maps team, you will lead ... and Speculative Decoding. Skilled in GPU optimization (e.g., CUDA, TensorRT-LLM, cuDNN) to accelerate inference tasks. Skilled…
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference. Job ID: 3067759 | Amazon.com Services LLC. The Annapurna Labs team at Amazon Web Services ... responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive scale large language models like the Llama…
    job goal (01/13/26)
  • NVIDIA Corporation (Santa Clara, CA)
    …MS) or equivalent experience. Ways to stand out from the crowd: Hands-on with LLM inference stacks (Triton Inference Server, TensorRT-LLM, vLLM). ... GPUs. You will shape our strategy for emerging challenges like disaggregated LLM inference and safeguard the long-term technical health of the platform. What you'll…
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Software Development Engineer, AI/ML, AWS Neuron, Model Inference. The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development ... responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive scale large language models like the Llama…
    job goal (01/12/26)
  • Senior Software Development Engineer

    NVIDIA (Santa Clara, CA)
    We are now looking for a TensorRT-LLM Software Development Engineer! NVIDIA is hiring software engineers for its TensorRT-LLM team. Academic and ... core backend software for LLM inference. + Improve the usability of the TensorRT-LLM library and build systems (CMake) What we need to see: + Masters or…
    NVIDIA (01/10/26)
  • Senior Deep Learning Software…

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate ... learning community to implement the latest algorithms for public release in TensorRT LLM, vLLM, SGLang and LLM benchmarks. Identify performance opportunities…
    NVIDIA (11/25/25)
  • Principal Software Engineer - Large-Scale…

    NVIDIA (Santa Clara, CA)
    …deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the vision and roadmap for memory management of large-scale ... large-scale LLM inference. + Architect and implement deep integrations with leading LLM serving engines (such as vLLM, SGLang, TensorRT-LLM), with a…
    NVIDIA (01/10/26)
  • Senior GenAI Algorithms Engineer

    NVIDIA (Santa Clara, CA)
    …and streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative LLMs, ... (TensorRT Model Optimizer, Megatron-LM, Megatron-Bridge, Nvidia-NeMo, NeMo-AutoModel, TensorRT-LLM) and open-source frameworks (PyTorch, Hugging Face, vLLM,…
    NVIDIA (12/18/25)
  • Senior AI Engineer , NeMo Retriever…

    NVIDIA (Santa Clara, CA)
    …on pre-optimized inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices optimize response latency ... The NeMo Retriever team is looking for an AI Engineer to join our team, focusing on the intersection ... deployments, etc. + Familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM. + Excellent…
    NVIDIA (01/10/26)
  • Senior DL Algorithms Engineer

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... deploy, and optimize models for efficient inference using frameworks such as TensorRT, TensorRT-LLM, vLLM, and SGLang. + Understand, analyze, profile, and…
    NVIDIA (11/06/25)
  • Senior Deep Learning Algorithm…

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... inference. + Convert and deploy models using frameworks such as TensorRT and TensorRT-LLM + Understand, analyze, profile, and optimize performance of…
    NVIDIA (11/06/25)
  • Senior Staff Machine Learning…

    NVIDIA (Santa Clara, CA)
    …Today, we are increasingly known as "the AI computing company." We are seeking a Senior Staff Machine Learning Engineer to join our Enterprise AI team and build ... frameworks such as PyTorch or TensorFlow; familiarity with CUDA-accelerated libraries (e.g., TensorRT-LLM) is a plus. + Proven track record to take a significant…
    NVIDIA (01/12/26)
  • AI Senior Staff Systems Engineer

    Cadence Design Systems, Inc. (San Jose, CA)
    …quantization, distillation, and using high-performance serving frameworks (e.g., vLLM, TGI, TensorRT-LLM) to maximize inference throughput and minimize latency. + ... implementing CI/CD pipelines for AI model development. + Advanced LLM Deployment & Optimization: Lead the deployment, serving, and ... AI infrastructure. Proven track record as a Principal or Senior Staff Engineer. + Expert-level knowledge of…
    Cadence Design Systems, Inc. (12/29/25)