- Advanced Micro Devices, Inc. (San Jose, CA)
- …Cluster Bring‑up & Optimization: Oversee the technical onboarding of massive GPU clusters. Ensure your team can troubleshoot collective communication errors, debug ... framework issues, and optimize training/inference strategies. Utilization Engineering (The North Star Metric): Drive and maintain industry‑leading Customer GPU …
- NVIDIA Corporation (Santa Clara, CA)
- Senior Solutions Architect, Cluster Design and Architecture - Networking ... research to the world's fastest supercomputers. We are seeking a highly motivated Senior Solutions Architect to join the Cluster Design and Architecture team with a…
- NVIDIA (Santa Clara, CA)
- …cluster design, balancing design principles with situational constraints to deliver high‑performance, supportable GPU clusters. Support customers through first ... Senior Solutions Architect, Cluster Design and Architecture - ... and HPC infrastructure. Responsibilities: Partner with internal engineering on GPU cluster building and networking; convey architecture guidelines to…
- Signify Technology (San Jose, CA)
- …$300K - $500K + Equity. Location: Bay Area, CA. An AI startup is seeking a Senior Systems Engineer to optimize deep learning performance at scale. In this role, ... bottlenecks across kernels, frameworks, and clusters. If you thrive on accelerating training and inference performance at scale, this is a chance to…
- Adobe (San Jose, CA)
- …and optimize GPU-accelerated pipelines for both (customized) model training and inference, prioritizing performance, scalability, and reliability. Provide ... including production-scale deployments. 3+ years of experience leading large-scale, GPU-intensive GenAI systems (training, inference, and optimization). Deep…
- Adobe Inc. (San Jose, CA)
- …and optimize GPU-accelerated pipelines for both (customized) model training and inference, prioritizing performance, scalability, and reliability. Foster a ... including production-scale deployments. 3+ years of experience leading large-scale, GPU-intensive GenAI systems (training, inference, and optimization).…
- NVIDIA (Santa Clara, CA)
- …GPU architecture, server-level platforms, and rack-scale innovations that maximize performance and efficiency for AI inference & training. What you'll ... of artificial intelligence. Our data center platforms integrate high-performance compute, networking, and a full-stack software ecosystem to ... power AI at scale. We are looking for a Senior Technical Marketing Engineer focused on GPUs and scale-up…
- NVIDIA (Santa Clara, CA)
- We are now looking for a Senior GPU Power Architect! The NVIDIA GPU Architecture group is looking for world-class architects and software developers to join ... key GPU units to evaluate architectural tradeoffs in DL/ML (training/inference) and graphics workloads + Drive improvements to architecture development processes…
- NVIDIA (Santa Clara, CA)
- We are now looking for a Senior GPU & Deep Learning Architect! The NVIDIA GPU Architecture group is looking for world-class architects and software ... our GPU architecture, especially for deep learning workloads, both training and inference, and maintain our leadership by developing new parallel programming…
- NVIDIA (Santa Clara, CA)
- …and excellent communication and planning abilities. Experience working with High Performance Computing (HPC), GPUs, and high-performance networking (RDMA, ... science of computer graphics. With the invention of the GPU - the engine of modern visual computing - ... the hardware all the way up to the AI training applications. + You'll be constantly innovating, discovering new…
- NVIDIA (Santa Clara, CA)
- …for groundbreaking hardware, system, and software innovations aimed at significantly enhancing AI training performance and efficiency. + Lead and scale a high- ... We are now looking for a Senior Manager, for GPU and AI ... and AI, particularly in the context of large-scale AI training workloads. + Solid track record in performance…
- NVIDIA (Santa Clara, CA)
- …stand out from the crowd: + Possess comprehensive knowledge of AI model training/inference and secure workflows. + Experience defining GPU, servers, data center ... next era of computing. An era in which our GPU acts as the brains of computers, robots, and ... make a lasting impact on the world! As a Senior Product Manager - Data Center at NVIDIA, you'll…
- NVIDIA (Santa Clara, CA)
- …potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the ... behind the explosion of Machine Learning, Artificial Intelligence and High-Performance Computing. We are looking for a highly capable ... consistent track record in technology and the skills for GPU product definition for Data Center. We are a…
- Amazon (Santa Clara, CA)
- …and its potential to overcome some of the biggest challenges in High Performance Computing (HPC)? Do you have a unique combination of deep technical knowledge, ... analytical problems at massive scale? Amazon Web Services (AWS) is seeking a Senior Worldwide Specialist Solutions Architect focused on HPC to work with our…
- NVIDIA (Santa Clara, CA)
- We are now looking for a Senior High-Performance LLM Training Engineer! NVIDIA is seeking experienced engineers specializing in performance analysis and ... software stack in frameworks like PyTorch and JAX for high-performance training on thousands of GPUs, while ... to work across the full hardware & software stack, from GPU architecture to application code, to achieve optimal performance…
- NVIDIA (Santa Clara, CA)
- …performance models, including on resource-constrained platforms. + Deep expertise in GPU performance optimizations, evidenced by benchmark wins or published ... impact on the world. We are looking for outstanding Senior High Performance AI Engineer to build ... programming skills; solid software engineering fundamentals. + Experience with GPU programming and performance optimization (CUDA or…
- NVIDIA (Santa Clara, CA)
- …+ Good understanding of Deep Learning frameworks like PyTorch and TensorFlow, distributed training and inference. + Knowledge of GPU cluster job scheduling ... AI researchers and SW/HW teams running AI workload in GPU cluster. As a member of the software development ... analysis tools/platforms + Solid experience in large AI job performance analysis for training/inference workload + Knowledge…
- Cadence Design Systems, Inc. (San Jose, CA)
- …for the entire lifecycle of our AI systems, from architecting and building high-performance GPU clusters to deploying and optimizing our most advanced AI ... reporting. Implement and manage monitoring solutions for system health, job statuses, GPU utilization, and container performance to proactively identify and…
- NVIDIA (Santa Clara, CA)
- We are now looking for a Senior Performance Software Engineer for Deep Learning Libraries! Do you enjoy tuning parallel algorithms and analyzing their ... revolution in artificial intelligence! We're always striving for peak GPU efficiency on current and future-generation GPUs. To get ... team on generating optimal assembly code + Deep learning training and inference performance teams on which…