- Meta (Menlo Park, CA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Network Engineer Responsibilities: 1. Design, develop, ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially ... daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like…
- Meta (Menlo Park, CA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Lead ... 5. Work with cross-functional teams and provide guidance on the AI network architecture including topologies, transport, congestion control techniques. **Minimum…
- Meta (Menlo Park, CA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially ... daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like…
- NVIDIA (Santa Clara, CA)
- …UCX for Deep Learning and HPC. We are looking for a motivated Performance engineer to influence the roadmap of our communication libraries. The DL and HPC ... are even higher at huge scales! This is an outstanding opportunity for someone with HPC and performance background to advance the state of the art in this space. Are…
- NVIDIA (Santa Clara, CA)
- …to support their future chip design needs, understand their workflow characteristics, and engineer an efficient HPC environment. Work with IT and engineering ... intelligence to autonomous cars. We are now looking for a highly motivated HPC Operations Manager to join this multifaceted and innovative infrastructure team to…
- Amazon (Cupertino, CA)
- …delivering and operating AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. You are intrigued by the continuous release ... Do you want to build the backbone of Generative AI cloud at AWS? Do you want to build ... You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers,…
- NVIDIA (Santa Clara, CA)
- …intelligence. Make the choice to join us today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design and implementation ... usage of all datacenter resources including compute, storage, network and power. You will help build methodologies, tools ... Experience analyzing and tuning performance for a variety of AI/HPC workloads. + Working knowledge of cluster…
- Lenovo (San Jose, CA)
- …strong technical background in datacenter infrastructure and proven history of success selling HPC, AI, and GPU solutions. This candidate should also have ... Field Application Engineer - CSP **General Information** Req # WD00063015 ... Strong problem-solving and analytical abilities + Strong knowledge of AI/HPC, including both software frameworks and hardware.…
- Meta (Menlo Park, CA)
- **Summary:** In this role, you will be a member of the Network.AI Software team and part of the bigger DC networking organization. The team develops and owns the ... Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on…
- NVIDIA (Santa Clara, CA)
- …wave of artificial intelligence. We are looking for a highly motivated senior software engineer for an exciting role in our communication libraries and network ... for Deep Learning frameworks (e.g., NCCL for TensorFlow/PyTorch) and HPC programming interfaces (e.g., UCX for MPI/OpenSHMEM) on GPU ... are growing fast. If you're a creative and autonomous engineer with real passion for technology, we want to…
- NVIDIA (Santa Clara, CA)
- Joining NVIDIA's AI Efficiency Team means contributing to the infrastructure that powers our leading-edge AI research. This team focuses on optimizing efficiency ... and resiliency of ML workloads, as well as developing scalable AI infrastructure tools and services. Our objective is to deliver a stable, scalable environment for…
- NVIDIA (Santa Clara, CA)
- We are looking for an experienced software engineer who excels at solving customer problems with DGX clusters (GPU-based supercomputers interconnected with InfiniBand ... network fabric). The ideal candidate also has experience developing ... MOFED, RDMA, RoCE and GPU Technology + Clustering or HPC Data-Center technologies including Upper Layer Protocols (i.e., NCCL,…
- NVIDIA (Santa Clara, CA)
- We are looking for a software engineer for our Sparse Linear Algebra team which develops key technologies and libraries such as cuSOLVER, cuSPARSE, cuDSS, and AmgX, ... come and join our team! What you will be doing: + developing scalable HPC math library software for various numerical methods including but not limited to sparse…
- NVIDIA (Santa Clara, CA)
- …the development of CPU technology for architectures used for artificial intelligence (AI) / deep learning (DL), high-performance computing (HPC), cloud service ... for the CPU/Fabric teams to improve efficiency + Develop Network-on-Chip (NoC)/SoC tooling, working closely with architects, chip leads ... challenges no one else can solve. Our work in AI and the metaverse is transforming the world's largest…