- Voltai Inc. (Palo Alto, CA)
- …presidents. About the Role You will develop, integrate, and optimize state-of-the-art CUDA kernels to power AI models that accelerate semiconductor design and ... and engineers, you'll help make Voltai the world's leading AI + semiconductor research organization. You'll also release your kernels and tooling as contributions to… more
- Amazon (Cupertino, CA)
- Sr. ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development ... or HPC such as GPUs, CPUs, FPGAs, or custom architectures - Experience with GPU kernel optimization and GPGPU computing such as CUDA, NKI, Triton, OpenCL, SYCL,… more
- Databricks Inc. (San Francisco, CA)
- A leading data and AI platform company in San Francisco is looking for a research engineer to enhance deep learning techniques and optimize performance on NVIDIA ... Ideal candidates will have a PhD in Computer Science and experience with CUDA and distributed training frameworks. Join our diverse team and take part in… more
- Menlo Ventures (San Francisco, CA)
- A leading data and AI company based in San Francisco is seeking a Research Engineer to optimize GPU training models and frameworks. The ideal candidate will have ... a strong background in CUDA and experience with distributed training. This role offers a competitive salary range of $166,000 - $225,000 USD, with additional… more
- Institute of Foundation Models (Sunnyvale, CA)
- …optimization Open‑source contributions or published research (MLSys, ICML, NeurIPS) CUDA or Triton kernel experience Experience with large‑scale pre‑training ... About the Institute of Foundation Models We are a dedicated research lab for building, understanding, using, and risk-managing foundation models. Our mandate is to… more
- Advanced Micro Devices (Santa Clara, CA)
- …KEY QUALIFICATIONS: Strong programming skills in C/C++ and Python Experience with GPU kernel programming using CUDA, HIP or OpenCL. Proficient in common ML ... is just the beginning. We see the benefits of AI every day, enabling medical research, curbing credit card fraud, reducing congestion in cities, or simply making life… more
- NVIDIA Corporation (Santa Clara, CA)
- We're now looking for a Senior Deep Learning Software Engineer for our cuDNN team! What you'll be doing: Develop production-quality software that ships as part ... across the codebase, including API design, software architecture, testing, and GPU kernel development. Mentoring junior engineers on the team. What we need to… more
- Menlo Ventures (San Francisco, CA)
- …and that high‑quality AI models should be available to all. Job Description As a research engineer on the Scaling team, you will be responsible for keeping up ... into our products to make that possible. The Impact you will have As a research engineer on the Scaling Team at Databricks, you will: Drive performance… more
- Smallest Inc. (San Francisco, CA)
- …GPU architecture - SMs, warps, memory hierarchy, occupancy tuning Hands‑on experience with CUDA, kernel writing, and kernel‑level debugging Experience with ... Role We're hiring a GPU Optimization Engineer who understands GPUs at a deep, architectural level - someone who knows exactly how to squeeze every last millisecond… more
- Genmo Inc. (San Francisco, CA)
- …Expert proficiency with GPU profiling tools (Nsight Systems, nvprof) Strong CUDA programming skills with production kernel development Deep understanding ... We are Genmo, a research lab dedicated to building open, state-of-the-art models ... possible in video generation. We're seeking a GPU Performance Engineer to squeeze every last FLOP from our H100… more
- Apple Inc. (San Francisco, CA)
- Senior Software Engineer, Model Inference San Francisco Bay Area, California, United States Software and Services Join Apple Maps to help build the best map in the ... powering experiences across Maps. You will partner closely with research and product teams, take end-to-end ownership, and deliver ... measurable results at global scale. Description As a Software Engineer on the Apple Maps team, you will lead… more
- Gimlet Labs, Inc (San Francisco, CA)
- …is redefining AI inference from the ground up, combining cutting-edge research with an integrated hardware-software stack that delivers breakthrough performance, ... in seconds. Gimlet is spun out of a Stanford research project under Professors Zain Asgar and Sachin Katti. ... previous successful exits. Gimlet Labs is seeking a Software Engineer focused on AI Performance. You will be researching… more
- Pantera Capital (San Francisco, CA)
- …batching, quantization, etc.) Understanding of GPU architectures or experience with GPU kernel programming using CUDA The cash compensation range for this ... Department AI We are looking for an AI Inference engineer to join our growing team. Our current stack is Python, Rust, C++, PyTorch, Triton, CUDA, Kubernetes. You will have the opportunity to work… more
- Virtue AI (San Francisco, CA)
- …Kernel launch efficiency Reducing fragmentation and allocator overhead Experience with kernel- or runtime‑level optimization CUDA kernels, Triton kernels, or ... security platforms. Built on decades of foundational and award-winning research in AI security, its AI-native architecture unifies automated ... our core team. What You'll Do As an Inference Engineer, you will own how models are served in… more
- Baseten (San Francisco, CA)
- …tool/function calling and multi-modal serving Profile and optimize TensorRT-LLM kernels, analyze CUDA kernel performance, implement custom CUDA operators, ... Sourcegraph, Writer, Abridge, Bland, and Zed. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating… more
- Databricks Inc. (San Francisco, CA)
- Staff Software Engineer - GenAI inference P-1285 About This Role As a staff software engineer for GenAI inference, you will lead the architecture, development, ... the inference engine that powers Databricks Foundation Model API. You'll bridge research advances and production demands, ensuring high throughput, low latency, and… more
- Menlo Ventures (San Francisco, CA)
- About This Role As a software engineer for GenAI inference, you will help design, develop, and optimize the inference engine that powers Databricks' Foundation Model ... API. You'll work at the intersection of research and production, ensuring our large language model (LLM) serving systems are fast, scalable, and efficient. Your work… more
- NVIDIA (Santa Clara, CA)
- NVIDIA's AI Developer Tools organization is seeking a Senior Research Engineer to join our Quality team, where we're building the definitive benchmarks and ... important parallel computing platform. Our growing team operates at the intersection of CUDA domain expertise and cutting-edge AI research. While evaluation is… more
- NVIDIA (Santa Clara, CA)
- …Face, vLLM, SGLang). You may also dive deeper into GPU-level optimization, including custom kernel development with CUDA and Triton. This role offers a unique ... with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative AI models like LLMs, VLMs, multimodal… more
- Meta (Menlo Park, CA)
- …hardware architectures. The compiler stack, DL graph optimizations, and kernel authoring for specific hardware, directly impacts performance and deployment ... software codesign for AI domain specific problems. **Required Skills:** Software Engineer , Systems ML - Frameworks / Compilers / Kernels Responsibilities: 1.… more