- Unknown (San Jose, CA)
- …demands a visionary with a proven track record in large-scale networking architecture, HPC systems engineering, or related fields, and a deep understanding of ... infrastructure market. Key responsibilities for the SVP include holistic network design, architecture governance, technology evaluation, AI cluster…
- Unknown (San Jose, CA)
- …and compute infrastructure, with a preference for those with a background in AI/ML, HPC, or hyperscale environments. Deep technical knowledge of GPU clusters, ... Head of Infrastructure About the Company Innovative provider of AI infrastructure solutions Industry Information Technology and Services Type Privately Held About…
- Meta (Menlo Park, CA)
- …fabric and host networking, communications lib and scheduling infrastructure. **Required Skills:** AI/HPC Network Engineering Manager Responsibilities: ... daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like...responsible for design, model, develop, test, deploy and operate AI/HPC Networks at scale 2. Provide continual…
- Meta (Menlo Park, CA)
- …5. Work with cross functional teams and provide guidance on the AI network architecture including topologies, transport, congestion control techniques **Minimum ... host networking, communications lib and scheduling infrastructure. **Required Skills:** AI/HPC System Performance Engineer Responsibilities: 1. Lead…
- Meta (Menlo Park, CA)
- …operate in a multi-organization landscape. **Required Skills:** Technical Program Manager, AI Network Infra Responsibilities: 1. Lead technical program ... and AI operations initiatives supporting Meta's growing AI/HPC infrastructure for our Family of Apps...matrix organization covering a range of areas (Data Center, Network, Hardware Systems, Infrastructure Engineering, Software…
- Meta (Menlo Park, CA)
- …many aspects of the system from models and runtime all the way to the AI hardware, optimizing across compute, network and storage. The team invests significantly ... develop and help productionize high performance software & hardware technologies for AI at datacenter scale. We achieve this via concurrent design and optimization…
- Meta (Menlo Park, CA)
**Summary:** In this role, you will be a member of the AI Networking Software team and part of the bigger DC networking organization. The team develops and owns the ... Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on…
- Microsoft Corporation (Mountain View, CA)
- …maintenance. + Contributions to large-scale infrastructure operations, supercomputing centers, or AI hardware design. Software Engineering IC5 - The typical ... **Overview** Microsoft AI operates one of the world's most advanced...We work closely with research, hardware, datacenter, and platform engineering teams to develop predictive health models, failure detection…
- Meta (Menlo Park, CA)
**Summary:** In this role, you will be a member of the Network.AI Software team and part of the bigger DC networking organization. The team develops and owns the ... Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on…
- Meta (Menlo Park, CA)
- …solutions to our challenges, and ship them into production. As part of our network engineering teams, you'll have the opportunity to work on cutting-edge ... and supporting cutting-edge technologies like AI, Generative AI, Recommendation engines, and Metaverse. Our network ...6. Design, develop, and deploy services to manage datacenter network switches and forwarding functions 7. Enhance HPC…
- Cisco (San Jose, CA)
- …may contact you directly if a relevant position opens. **Meet the Team** **Engineering:** Open-minded, driven, diverse and deeply creative people at Cisco design the ... mobile/wireless, video, VoIP, collaboration, web, routing, switching, IPv6, data center, HPC, Telepresence and many more. Your work will impact billions globally.…
- Cisco (San Jose, CA)
- …contact you directly if a relevant position opens. **Meet the Team** Engineering: Open-minded, driven, diverse and deeply creative people at Cisco design the ... web, Internet of Things, routing, switching, IPv6, data center, HPC, Telepresence and many more. Your work will impact...product specifications. **Your Impact** Join our Creative Hardware Engineering team and make a tangible impact across the…
- Broadcom (San Jose, CA)
- …L2/L3 protocols especially RoCE (RDMA over Converged Ethernet) protocol & use cases in AI/ML, HPC cluster is a plus + Having Knowledge of deep learning models ... PCI-E-based designs, and hands-on experience in Python programming. Good understanding of AI/ML clusters, Deep learning models, and GPU Micro benchmarks is a plus.…
- Cisco (San Jose, CA)
- …may contact you directly if a relevant position opens. **Meet the Team** Engineering: Open-minded, driven, diverse and deeply creative people at Cisco design the ... data, collaboration, web, Internet of Things, routing, switching, IPv6, data center, HPC, Telepresence and many more. Your work will affect billions globally. Supply…
- Cisco (San Jose, CA)
- …mobile/wireless, video, VoIP, collaboration, web, routing, switching, IPv6, data center, HPC, Telepresence and many more. Your work will impact billions globally. ... the way through to high volume manufacturing. Creative Hardware Engineering positions are available in: + ASIC Design and...data and infrastructure connect and protect organizations in the AI era - and beyond. We've been innovating fearlessly…