• Virtue AI (San Francisco, CA)
    …we're looking for passionate builders to join our core team. What You'll Do As an Inference Engineer, you will own how models are served in production. Your job ... workloads. You will: Serve and optimize LLM, embedding, and other ML models' inference across multiple model families Design and operate inference APIs with… more
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference Job ID: 3067759 | Amazon.com Services LLC The Annapurna Labs team at Amazon Web ... principles. - Proficiency in debugging, profiling, and implementing best software engineering practices in large-scale systems. PREFERRED QUALIFICATIONS… more
    job goal (01/13/26)
  • Apple Inc. (San Francisco, CA)
    Senior Software Engineer, Model Inference ...or related field (or equivalent experience). 5+ years in software engineering focused on ML inference ... deliver measurable results at global scale. Description As a Software Engineer on the Apple Maps team,...will lead the design and implementation of large-scale, high-performance inference services that support a wide range of models… more
    job goal (01/13/26)
  • Databricks Inc. (San Francisco, CA)
    Staff Software Engineer - GenAI inference P-1285 About This Role As a staff software engineer for GenAI inference, you will lead the ... architecture, development, and optimization of the inference engine that powers Databricks Foundation Model API. You'll...BS/MS/PhD in Computer Science or a related field. Strong software engineering background (6+ years or equivalent)… more
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Software Development Engineer, AI/ML, AWS Neuron, Model Inference The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software ... computing principles. Proficiency in debugging, profiling, and implementing best software engineering practices in large‑scale systems. Preferred Qualifications… more
    job goal (01/12/26)
  • Menlo Ventures (San Francisco, CA)
    About This Role As a software engineer for GenAI inference, you will help design, develop, and optimize the inference engine that powers Databricks' ... and efficient. Your work will touch the full GenAI inference stack - from kernels and runtimes to orchestration...BS/MS/PhD in Computer Science, or a related field Strong software engineering background (3+ years or equivalent)… more
    job goal (01/13/26)
  • Akamai Technologies GmbH (Cambridge, MA)
    Senior Principal Software Engineer - Akamai Inference Cloud (Remote) United States (Remote) Job Description Do you thrive on defining the future of AI ... advisor shaping AI at the edge? Join the Akamai Inference Cloud Team! The Akamai Inference Cloud...deep understanding of business objectives. As a Senior Principal Software Engineer, you will be responsible for:… more
    job goal (01/13/26)
  • NVIDIA Corporation (Santa Clara, CA)
    Senior Deep Learning Software Engineer, Inference. Locations: US, CA, Santa Clara: ... Requisition ID: JR2002670. NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for...or PhD or equivalent experience in relevant field (Computer Engineering, Computer Science, EECS, AI). 5+ years of relevant… more
    job goal (01/13/26)
  • DatologyAI (Redwood City, CA)
    …security, and observability. About You Have at least 5 years of professional software engineering experience. Expertise in Python and experience with deep ... are in office 4 days a week. About the Role We're looking for an engineer with deep experience building and operating large-scale training and inference systems.… more
    job goal (01/13/26)
  • Amazon (Seattle, WA)
    …cloud-scale machine-learning accelerators. This role is for a senior software engineer in the Machine Learning Inference Applications team. This role is ... Overview AWS Neuron is the complete software stack for the AWS Inferentia and Trainium...and performance optimization of core building blocks of LLM Inference - Attention, MLP, Quantization, Speculative Decoding, Mixture of… more
    job goal (01/13/26)
  • OpenAI (San Francisco, CA)
    …missing to get the job done. Have at least 5 years of professional software engineering experience. Have or can quickly gain familiarity with PyTorch, NVidia ... About the Team Our Inference team brings OpenAI's most capable research and.... About the Role We are looking for an engineer who wants to take the world's largest and… more
    job goal (01/13/26)
  • Menlo Ventures (San Francisco, CA)
    …and contribute to our innovative projects. Position Overview We are looking for a Software Engineer to work at the forefront of deploying our cutting-edge AI ... of our embodied systems. You will be responsible for optimizing AI inference processes from lightweight to billion-parameter models, ensuring our robots operate with… more
    job goal (01/13/26)
  • Capital One (Fredericksburg, VA)
    …developing and applying state-of-the-art techniques for optimizing training and inference software to improve hardware utilization, latency, throughput, ... Lead AI Engineer (FM Hosting, LLM Inference) Overview...engineering and mathematics, and your expertise in hardware, software, and AI enable you to see and exploit… more
    job goal (01/12/26)
  • Menlo Ventures (San Francisco, CA)
    …cloud platforms. You may be a good fit if you: Have significant software engineering experience, particularly with distributed systems Are results-oriented, with ... to build beneficial AI systems. About the role Our Inference team is responsible for building and maintaining the...by serving our models via the industry's largest compute-agnostic inference deployments. We are responsible for the entire stack… more
    job goal (01/12/26)
  • Sanas (Palo Alto, CA)
    Staff Software Engineer: Microservice Infrastructure & Real-Time ML Inference Sanas.ai is pioneering the future of human communication. Founded by a team of ... communication. About the role We're looking for a Staff Software Engineer (Backend) to design and build...(auth, interceptors, tracing, schema rollout). Qualifications 7+ years of Software Engineering experience, with a focus on… more
    job goal (01/12/26)
  • NVIDIA Corporation (Santa Clara, CA)
    Senior Technical Marketing Engineer - AI Inference at Scale. Locations: US, ... solution briefs, presentations, explainer videos, and demos that highlight NVIDIA's AI inference capabilities. Engage with Engineering & Product Teams - Work… more
    job goal (01/12/26)
  • General Motors (Sunnyvale, CA)
    …This job is eligible for relocation assistance. About the Team The ML Inference Platform is part of the AI Compute Platforms organization within Infrastructure ... of state-of-the‑art (SOTA) machine learning models for experimental and bulk inference, with a focus on performance, availability, concurrency, and scalability.… more
    job goal (01/13/26)
  • The Association of Technology, Management and Applied… (Morgan Hill, CA)
    …Server, Hadoop etc. Experienced in using design patterns and following best software engineering practices. An understanding of fundamental algorithms and ... capabilities. This job is responsible for defining and leading the engineering approach for complex features to deliver significant business outcomes. Key… more
    job goal (01/12/26)
  • Pulse (San Francisco, CA)
    …experience is a plus About the Role Specialize in low-latency, high-throughput inference for OCR and multimodal models. Own profiling, batching, and autoscaling ... across single-tenant and multi-tenant environments. Responsibilities Build inference services with smart batching and caching Optimize kernels, tokenization, and… more
    job goal (01/13/26)
  • quadric.io, Inc (Burlingame, CA)
    …executes both NN graph code and conventional C++ DSP and control code. Role: The AI Inference Engineer in Quadric is the key bridge between the world of AI/LLM ... models and Quadric unique platforms. The AI Inference Engineer at Quadric will [1] port...Engineering. 5+ years of experience in AI/LLM model inference and deployment frameworks/tools experience with model quantization (PTQ,… more
    job goal (01/13/26)