• Voltai Inc. (Palo Alto, CA)
    …presidents. About the Role You will develop, integrate, and optimize state-of-the-art CUDA kernels to power AI models that accelerate semiconductor design and ... and engineers, you'll help make Voltai the world's leading AI + semiconductor research organization. You'll also release your kernels and tooling as contributions to… more
    job goal (01/13/26)
  • Amazon (Cupertino, CA)
    Sr. ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development ... or HPC such as GPUs, CPUs, FPGAs, or custom architectures - Experience with GPU kernel optimization and GPGPU computing such as CUDA, NKI, Triton, OpenCL, SYCL,… more
    job goal (01/13/26)
  • Advanced Micro Devices (Santa Clara, CA)
    …KEY QUALIFICATIONS: Strong programming skills in C/C++ and Python Experience with GPU kernel programming using CUDA, HIP or OpenCL. Proficient on common ML ... is just the beginning. We see the benefits of AI every day, enabling medical research, curbing credit card fraud, reducing congestion in cities, or simply making life… more
    job goal (01/13/26)
  • NVIDIA Corporation (Santa Clara, CA)
    We're now looking for a Senior Deep Learning Software Engineer for our cuDNN team! What you'll be doing: Develop production-quality software that ships as part ... across the codebase, including API design, software architecture, testing, and GPU kernel development. Mentoring junior engineers on the team. What we need to… more
    job goal (01/12/26)
  • Databricks Inc. (San Francisco, CA)
    A leading data and AI platform company in San Francisco is looking for a research engineer to enhance deep learning techniques and optimize performance on NVIDIA ... Ideal candidates will have a PhD in Computer Science and experience with CUDA and distributed training frameworks. Join our diverse team and take part in… more
    job goal (01/12/26)
  • Menlo Ventures (San Francisco, CA)
    A leading data and AI company based in San Francisco is seeking a Research Engineer to optimize GPU training models and frameworks. The ideal candidate will have ... a strong background in CUDA and experience with distributed training. This role offers a competitive salary range of $166,000 - $225,000 USD, with additional… more
    job goal (01/13/26)
  • Institute of Foundation Models (Sunnyvale, CA)
    …optimization Open‑source contributions or published research (MLSys, ICML, NeurIPS) CUDA or Triton kernel experience Experience with large‑scale pre‑training ... About the Institute of Foundation Models We are a dedicated research lab for building, understanding, using, and risk-managing foundation models. Our mandate is to… more
    job goal (01/13/26)
  • Menlo Ventures (San Francisco, CA)
    …and that high‑quality AI models should be available to all. Job Description As a research engineer on the Scaling team, you will be responsible for keeping up ... into our products to make that possible. The Impact you will have As a research engineer on the Scaling Team at Databricks, you will: Drive performance… more
    job goal (01/13/26)
  • Smallest Inc. (San Francisco, CA)
    …GPU architecture - SMs, warps, memory hierarchy, occupancy tuning Hands‑on experience with CUDA, kernel writing, and kernel‑level debugging Experience with ... Role We're hiring a GPU Optimization Engineer who understands GPUs at a deep, architectural level - someone who knows exactly how to squeeze every last millisecond… more
    job goal (01/13/26)
  • Genmo Inc. (San Francisco, CA)
    …Expert proficiency with GPU profiling tools (Nsight Systems, nvprof) Strong CUDA programming skills with production kernel development Deep understanding ... We are Genmo, a research lab dedicated to building open, state-of-the-art models...possible in video generation. We're seeking a GPU Performance Engineer to squeeze every last FLOP from our H100… more
    job goal (01/13/26)
  • Apple Inc. (San Francisco, CA)
    Senior Software Engineer, Model Inference San Francisco Bay Area, California, United States Software and Services Join Apple Maps to help build the best map in the ... powering experiences across Maps. You will partner closely with research and product teams, take end-to-end ownership, and deliver...measurable results at global scale. Description As a Software Engineer on the Apple Maps team, you will lead… more
    job goal (01/13/26)
  • Gimlet Labs, Inc (San Francisco, CA)
    …is redefining AI inference from the ground up, combining cutting-edge research with an integrated hardware-software stack that delivers breakthrough performance, ... in seconds. Gimlet is spun out of a Stanford research project under Professors Zain Asgar and Sachin Katti....previous successful exits. Gimlet Labs is seeking a Software Engineer focused on AI Performance. You will be researching… more
    job goal (01/13/26)
  • Pantera Capital (San Francisco, CA)
    …batching, quantization, etc.) Understanding of GPU architectures or experience with GPU kernel programming using CUDA The cash compensation range for this ... Department AI We are looking for an AI Inference engineer to join our growing team. Our current stack...Our current stack is Python, Rust, C++, PyTorch, Triton, CUDA, Kubernetes. You will have the opportunity to work… more
    job goal (01/13/26)
  • Virtue AI (San Francisco, CA)
    Kernel launch efficiency Reducing fragmentation and allocator overhead Experience with kernel- or runtime‑level optimization CUDA kernels, Triton kernels, or ... security platforms. Built on decades of foundational and award-winning research in AI security, its AI-native architecture unifies automated...our core team. What You'll Do As an Inference Engineer, you will own how models are served in… more
    job goal (01/13/26)
  • Baseten (San Francisco, CA)
    …tool/function calling and multi-modal serving Profile and optimize TensorRT-LLM kernels, analyze CUDA kernel performance, implement custom CUDA operators, ... Sourcegraph, Writer, Abridge, Bland, and Zed. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating… more
    job goal (01/13/26)
  • Databricks Inc. (San Francisco, CA)
    Staff Software Engineer - GenAI inference P-1285 About This Role As a staff software engineer for GenAI inference, you will lead the architecture, development, ... the inference engine that powers Databricks Foundation Model API. You'll bridge research advances and production demands, ensuring high throughput, low latency, and… more
    job goal (01/13/26)
  • Menlo Ventures (San Francisco, CA)
    About This Role As a software engineer for GenAI inference, you will help design, develop, and optimize the inference engine that powers Databricks' Foundation Model ... API. You'll work at the intersection of research and production, ensuring our large language model (LLM) serving systems are fast, scalable, and efficient. Your work… more
    job goal (01/13/26)
  • Senior Research Engineer

    NVIDIA (Santa Clara, CA)
    NVIDIA's AI Developer Tools organization is seeking a Senior Research Engineer to join our Quality team, where we're building the definitive benchmarks and ... important parallel computing platform. Our growing team operates at the intersection of CUDA domain expertise and cutting-edge AI research. While evaluation is… more
    NVIDIA (01/10/26)
  • Senior GenAI Algorithms Engineer - Model…

    NVIDIA (Santa Clara, CA)
    …Face, vLLM, SGLang). You may also dive deeper into GPU-level optimization, including custom kernel development with CUDA and Triton. This role offers a unique ... with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative AI models like LLMs, VLMs, multimodal… more
    NVIDIA (01/10/26)
  • Software Engineer, Systems ML - Frameworks…

    Meta (Menlo Park, CA)
    …hardware architectures. The compiler stack, DL graph optimizations, and kernel authoring for specific hardware, directly impacts performance and deployment ... software codesign for AI domain specific problems. **Required Skills:** Software Engineer, Systems ML - Frameworks / Compilers / Kernels Responsibilities: 1.… more
    Meta (12/20/25)