- Luma AI (Palo Alto, CA)
- …models and attention implementations. Good to have: experience with high-performance Triton/CUDA and writing custom PyTorch kernels and ops. Top ... engineers with significant experience solving hard problems in PyTorch, CUDA and distributed systems. You will work alongside the ... candidates will be able to write fused kernels for common hot paths, understand when to make…
- Cisco (San Jose, CA)
- …like PyTorch. + Familiarity with **assembly or PTX/SASS** for debugging or optimizing CUDA kernels. + Familiarity with **NVMe storage offloads**, **IOAT/DPDK** ... the next generation of enterprise-grade AI infrastructure. As a principal engineer within our GPU and CUDA Runtime team, you will play a critical role in shaping…
- Google (Mountain View, CA)
- …GPU/TPUs via ML frameworks (e.g., JAX, PyTorch) and low-level programming models (e.g., CUDA, OpenCL) + Experience in leveraging custom kernels and compiler ... Snapshot We are seeking a software engineer to define, drive, and critically contribute to ... implement critical components across Model architecture, ML frameworks, custom kernels and platform, to deliver frontier models with maximum…
- NVIDIA (Santa Clara, CA)
- …right down to the GPU HW. What you'll be doing: + Writing highly tuned compute kernels, mostly in C++ CUDA, to perform core deep learning operations (e.g., matrix ... We are now looking for a Senior Performance Software Engineer for Deep Learning Libraries! Do you enjoy tuning parallel algorithms and analyzing their performance?…
- NVIDIA (Santa Clara, CA)
- …a high-performance execution environment, low-level GPU optimizations and developing custom GPU kernels in CUDA and/or Triton. This is an exceptional opportunity ... We are looking for a Senior Deep Learning Software Engineer to design and build our automated inference and ... as TensorRT. + Prior experience in writing high-performance GPU kernels for machine learning workloads in frameworks such as…
- NVIDIA (Santa Clara, CA)
- …best-in-class AI models. We are now looking for a Senior Deep Learning Software Engineer to develop and scale up our automated inference and deployment solution. As ... and HuggingFace to developing and improving high-performance kernel implementations in CUDA, TRT-LLM, and Triton. This is an exceptional opportunity for passionate…
- Amazon (Sunnyvale, CA)
- …techniques - Optimize low-level details of the training stack, including CUDA kernels, communication collectives, network I/O. - Utilize, build ... in training Foundational Models/LLMs, and/or low-level optimization of ML training workflows, CUDA kernels, network I/O. Amazon is an equal opportunity employer…
- NVIDIA (Santa Clara, CA)
- …to join our development efforts in the area of dense linear algebra kernels for high-performance libraries such as cuSOLVER. Around the world, leading commercial and ... together with other developers on designing, developing, and optimizing kernels for various algorithms including triangular factorizations, eigenvalue decompositions…
- NVIDIA (Santa Clara, CA)
- …serving and deployment algorithms and optimizations using TensorRT LLM, vLLM, SGLang, Triton and CUDA kernels. Work and collaborate with a diverse set of teams ... are now looking for a Senior Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep ... knowledge of CPU and GPU + GPU programming experience (CUDA or OpenCL) GPU deep learning has provided the…
- NVIDIA (Santa Clara, CA)
- …bring to bear open-source tools and plugins, including CUTLASS, OAI Triton, NCCL, and CUDA kernels, to implement and optimize model serving pipelines. What you'll ... NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for our ... and GPU is a plus. + GPU programming experience (CUDA, OAI Triton or CUTLASS) is a plus. Ways…
- NVIDIA (Santa Clara, CA)
- …extrinsic. + Experience with development in CUDA language. The ability to implement CUDA kernels as part of training or inference pipelines. The base salary ... understand the world. We are now looking for an extraordinary Senior Perception Engineer to develop and productize NVIDIA's autonomous driving solutions. As a member…
- Amazon (Cupertino, CA)
- …(Inf1/Inf2) our cloud-scale Machine Learning accelerators. This role is for a Machine Learning Engineer on one of our AWS Neuron teams: - The ML Distributed Training ... of Annapurna's AI chips for both training and lightning‑fast inference. Beyond kernels, we shape next‑generation serving by upstreaming new features and driving…
- Microsoft Corporation (Mountain View, CA)
- …the Firmware Center of Excellence, we're looking for a customer-focused, hands-on SW engineer to help us develop a suite of system validation and diagnostic tests ... geographic locations. We are looking for a Principal Firmware Engineer to join the team. Microsoft's mission is to ... PVT characterization. + Knowledge of or experience with AI models/kernels such as GPT, GEMM, etc. + Experience with…
- Microsoft Corporation (Mountain View, CA)
- …in optimizing machine learning models for GPUs, including development of custom CUDA kernels for performance-critical workloads. Software Engineering IC6 - The ... We are looking for an experienced **Principal Software Engineer** to join the Ads Engineering team and help advance the core capabilities of our Ads serving stack.…
- Meta (Menlo Park, CA)
- …frameworks like PyTorch, Caffe2, TensorFlow, ONNX, TensorRT 10. OR AI high performance kernels: Experience with CUDA programming, OpenMP / OpenCL programming or ... directly towards PyTorch code and device optimization. **Required Skills:** Software Engineer - Systems ML - PyTorch Responsibilities: 1. Improve PyTorch's state…
- NVIDIA (Santa Clara, CA)
- …analysis tools. As a member of the software development team, you will engineer and improve the core infrastructure for execution, automation, and debugging the ... from the crowd: + Know-how working on operating system kernels or writing device drivers with strong systems-level debugging ... + A knowledge of GPU APIs such as DirectX, CUDA, Vulkan or OpenGL + Experience with chip and/or…
- NVIDIA (Santa Clara, CA)
- …fields, or equivalent experience. + 8+ years of experience as an AI/Software Engineer with proven track record coding in Python and/or C++ with popular AI ... optimizing model training/inference performance on GPUs. + Experience developing and optimizing GPU kernels for deep learning, with a focus on GEMM and attention…