• Apple Inc. (San Francisco, CA)
    Senior Software Engineer, Model Inference. San Francisco Bay Area, California, United States. Software and Services. Join Apple Maps to help build the ... take end-to-end ownership, and deliver measurable results at global scale. Description: As a Software Engineer on the Apple Maps team, you will lead the design… more
    job goal (01/13/26)
  • OpenAI (San Francisco, CA)
    …things that they've never been able to before. We focus on performant and efficient model inference, as well as accelerating research progression via model ... About the Role We are looking for an engineer who wants to take the world's largest and...improve the performance, latency, throughput, and efficiency of our model inference stack. Build tools to give… more
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference. Job ID: 3067759 | Amazon.com Services LLC. The Annapurna Labs team at Amazon Web ... ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement...with work experience on some optimizations for improving the model execution. - Software development experience in… more
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Software Development Engineer, AI/ML, AWS Neuron, Model Inference. The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software ... lifecycles along with work experience on some optimizations for improving the model execution. Software development experience in C++, Python (experience in at… more
    job goal (01/12/26)
  • Capital One (Fredericksburg, VA)
    …develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, ... Lead AI Engineer (FM Hosting, LLM Inference) Overview...developing and applying state-of-the-art techniques for optimizing training and inference software to improve hardware utilization, latency,… more
    job goal (01/12/26)
  • Databricks Inc. (San Francisco, CA)
    Staff Software Engineer - GenAI inference P-1285 About This Role As a staff software engineer for GenAI inference, you will lead the ... architecture, development, and optimization of the inference engine that powers Databricks Foundation Model API. You'll bridge research advances and production… more
    job goal (01/13/26)
  • Menlo Ventures (San Francisco, CA)
    About This Role As a software engineer for GenAI inference, you will help design, develop, and optimize the inference engine that powers Databricks' ... What You Will Do Contribute to the design and implementation of the inference engine, and collaborate on model‑serving stack optimized for large‑scale LLMs… more
    job goal (01/13/26)
  • Virtue AI (San Francisco, CA)
    …we're looking for passionate builders to join our core team. What You'll Do As an Inference Engineer, you will own how models are served in production. Your job ... You will: Serve and optimize LLM, embedding, and other ML models' inference across multiple model families Design and operate inference APIs with clear… more
    job goal (01/13/26)
  • Akamai Technologies GmbH (Cambridge, MA)
    Senior Principal Software Engineer - Akamai Inference Cloud (Remote) United States (Remote) Job Description Do you thrive on defining the future of AI ... deep understanding of business objectives. As a Senior Principal Software Engineer, you will be responsible for:...or its equivalent Be a recognized expert in AI inference optimization, model serving, and distributed AI… more
    job goal (01/13/26)
  • NVIDIA Corporation (Santa Clara, CA)
    Senior Deep Learning Software Engineer, Inference. Locations: US, CA, Santa Clara: ... Requisition ID: JR2002670. NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for...vLLM, which are at the forefront of efficient large-scale model serving and inference. You will play… more
    job goal (01/13/26)
  • DatologyAI (Redwood City, CA)
    …infrastructure that are reliable, scalable, and cost-efficient. Build robust model serving infrastructure for low-latency, high-throughput inference across ... a week. About the Role We're looking for an engineer with deep experience building and operating large-scale training...for training and/or inference Have familiarity with inference tooling like vLLM, SGLang, or custom model… more
    job goal (01/13/26)
  • OpenAI (San Francisco, CA)
    …things that they've never been able to before. We focus on performant and efficient model inference, as well as accelerating research progression via model ... About the Team Our Inference team brings OpenAI's most capable research and...connection management. Have 5+ years of experience as a software engineer and systems architect working on… more
    job goal (01/13/26)
  • Amazon (Seattle, WA)
    …cloud-scale machine-learning accelerators. This role is for a senior software engineer in the Machine Learning Inference Applications team. This role is ... improving model performance Preferred Qualifications 3+ years of full software development life cycle, including coding standards, code reviews, source control… more
    job goal (01/13/26)
  • OpenAI (San Francisco, CA)
    …tighter coordination with product and research. About the Role We're looking for a software engineer to help us serve OpenAI's multimodal models at scale. You'll ... networking, distributed compute, and high-throughput data handling. Have familiarity with inference tooling like vLLM, TensorRT-LLM, or custom model parallel… more
    job goal (01/13/26)
  • Menlo Ventures (San Francisco, CA)
    …and contribute to our innovative projects. Position Overview We are looking for a Software Engineer to work at the forefront of deploying our cutting-edge AI ... machine learning architectures. Experience with machine learning compilers. Experience optimizing model inference for robotic systems deployment. Base Salary… more
    job goal (01/13/26)
  • jobr.pro (Sunnyvale, CA)
    …UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google's needs with ... Large Language Models (LLM) and other Machine Learning (ML) models for inference. Experience building GPU-related software. Experience with compilers or ML… more
    job goal (01/13/26)
  • Google Inc. (Sunnyvale, CA)
    Software Engineer III, Infrastructure, Inference Control Plane. Google, Sunnyvale, CA, USA. Bachelor's degree or equivalent practical ... goes on and is growing every day. As a software engineer, you will work on a...push technology forward. The mission of Vertex AI Online Inference Infrastructure team is to build a model… more
    job goal (01/13/26)
  • General Motors (Sunnyvale, CA)
    …this role, you'll work closely with ML engineers and researchers to ensure efficient model serving and inference in production, for their workflows such as data ... and metrics to ensure reliability, performance, and resource optimization of inference services. Proactively research and integrate state-of-the‑art model… more
    job goal (01/13/26)
  • quadric.io, Inc (Burlingame, CA)
    …executes both NN graph code and conventional C++ DSP and control code. Role: The AI Inference Engineer in Quadric is the key bridge between the world of AI/LLM ... models and Quadric unique platforms. The AI Inference Engineer at Quadric will [1] port...and/or Electric Engineering. 5+ years of experience in AI/LLM model inference and deployment frameworks/tools experience with… more
    job goal (01/13/26)
  • Pulse (San Francisco, CA)
    …experience is a plus About the Role Specialize in low-latency, high-throughput inference for OCR and multimodal models. Own profiling, batching, and autoscaling ... across single-tenant and multi-tenant environments. Responsibilities Build inference services with smart batching and caching Optimize kernels, tokenization, and … more
    job goal (01/13/26)