- NVIDIA (Santa Clara, CA)
- …in any of the leading cloud environments [AWS, Azure or GCP] + Experience with AI/HPC cluster job schedulers such as SLURM, LSF + In-depth understanding ... InfiniBand with IBOP and RDMA + Background with Software Defined Networking and AI/HPC cluster networking + Familiarity with deep learning frameworks like… more
- General Dynamics Information Technology (Fairfax, VA)
- …NACI (T1) **Job Family:** Systems Engineering **Skills:** Computing, Data Science, GPU Computing, HPC Cluster, IT System Architecture **Experience:** 10+ years ... of related experience **Job Description:** GDIT is seeking a Senior HPC Architect to join our Scientific ... for two HPC clusters; a 4000+ core HPC cluster that is GPU-focused and a… more
- Insight Global (Rockville, MD)
- Job Description HOW A SENIOR HPC ARCHITECT WILL MAKE AN IMPACT: ... for two HPC clusters; a 4000+ core HPC cluster that is GPU-focused and a 1,500+ ... core HPC cluster, including monitoring performance and health of both clusters ... diverse research community with needs in genomics, cryo-electron microscopy, AI/ML * Architect and design HPC clusters… more
- NVIDIA (Santa Clara, CA)
- …professionals. This entails ensuring the timely delivery of a varied spectrum of AI HPC data center projects. Furthermore, this role offers an opportunity ... and related by applying groundbreaking technical and operational knowledge to configure and maintain HPC AI network and server platforms. + Drives HPC team… more
- Microsoft Corporation (Redmond, WA)
- …of the Azure platform working collaboratively with many industry partners. As a Senior High-Performance Computing (HPC) Software Engineer, you will be critical ... in designing and delivering the next generations of AI training, AI inferencing, virtual desktop, video ... - from fiber networking, switches, GPU differentiation, rack design, cluster design and more. This position offers a unique… more
- NVIDIA (Santa Clara, CA)
- …+ Background in running and instrumenting distributed LLM training on a multi-GPU HPC cluster + Knowledge of LLM training features and libraries - Checkpointing, ... with distributed system software architecture + Basic understanding of HPC GPU cluster, Slurm + Basic understanding ... challenges no one else can solve. Our work in AI and the metaverse is transforming the world's largest… more
- NVIDIA (Santa Clara, CA)
- NVIDIA's Deep Learning Optimized Frameworks Group is looking for a deeply technical HPC cluster administrator to lead a diverse cluster of GPU-accelerated ... leadership in the design and implementation of groundbreaking GPU compute cluster that runs demanding deep learning, high performance computing, and computationally… more
- Capital One (San Francisco, CA)
- …to love the products and services we build. We are looking for an experienced Senior Distinguished Engineer, AI Systems, to help us build the foundations of our ... Francisco, United States of America, San Francisco, California Distinguished Engineer, Generative AI Systems (Remote Eligible) Our mission at Capital One is to… more
- NVIDIA (Santa Clara, CA)
- …Zabbix, etc. Familiarity with newer and emerging monitoring products. + Prior experience with HPC cluster management tools such as Slurm, PBS, LSF, etc. + ... artificial intelligence. Join our team at NVIDIA as a Senior Site Reliability Engineer focused on HPC ... challenges no one else can solve. Our work in AI and the metaverse is transforming the world's largest… more
- NVIDIA (CA)
- …and/or CUDA. + Proficient in the Linux/GNU toolchain and operating as a user in HPC cluster environments. + Background in molecular modeling and life sciences. + ... To advance these efforts we are looking for a Senior Solutions Architect to work with life sciences customers...industry leader with vision on integrating NVIDIA technology into AI and HPC architectures for advanced applications,… more
- Microsoft Corporation (Redmond, WA)
- …of the Azure platform working collaboratively with many industry partners. As a Senior Software Engineer, you will be critical in designing and delivering the next ... generations of AI training, AI inferencing, virtual desktop, video ... - from fiber networking, switches, GPU differentiation, rack design, cluster design and more. This position offers a unique… more
- NVIDIA (Santa Clara, CA)
- …Solution Architect! NVIDIA is the engine of modern artificial intelligence and Generative AI, the biggest technology breakthrough of our time. We're on a mission to ... make AI accessible to all, and we're seeking a passionate ... solutions who are keen to build data centers and HPC infrastructure using NVIDIA's compute, networking, and software stacks.… more
- NVIDIA (Santa Clara, CA)
- …Ways to stand out from the crowd: + Have built, deployed and operated AI platforms on HPC clusters. Have built, deployed and operated cloud native system ... scientific computing cloud platform enables Physics based Numerical Simulation Solvers, AI based Training, Inference and Visualization workflow for physical science… more
- NVIDIA (Santa Clara, CA)
- …end software and firmware stack for these systems. We are looking for a Senior Software Architect who has deep expertise in designing server platforms and has added ... systems, particularly at the SW/HW interface. + Understanding of HPC or deep learning workloads and use of accelerated ... out from the crowd: + Knowledge of cloud and cluster level deployment and management systems. + Strong background… more
- SLAC National Accelerator Laboratory (Menlo Park, CA)
- …GPUs (28 A100, 280 RTX 2080Ti), current allocation at the NERSC Perlmutter (A100 cluster), and other potential HPC centers where we apply for future allocations. ... experimental neutrino physics, and Artificial Intelligence and Machine Learning (AI/ML) research. **About us:** **The Machine Learning Initiatives (MLI)** is… more