- Meta (Menlo Park, CA)
- …and efficiency in our global network. **Required Skills:** Network Engineer, HPC Systems Network Strategy Responsibilities: 1. Design, ... meeting our demands; you will be responsible for conceiving, developing, and deploying software, hardware and network systems and tools that improve reliability…
- NVIDIA (Santa Clara, CA)
- …UCX for Deep Learning and HPC. We are looking for a motivated Performance engineer to influence the roadmap of our communication libraries. The DL and HPC ... are even higher at huge scales! This is an outstanding opportunity for someone with HPC and performance background to advance the state of the art in this space. Are…
- NVIDIA (Santa Clara, CA)
- …join our mission in integrating genomic solutions into mainstream healthcare. As a healthcare HPC engineer, you will join a dynamic development team focused on ... to understand their current and future challenges and provide outstanding HPC solutions. + Collaborate closely with hardware engineering, CUDA engineering, and…
- Meta (Menlo Park, CA)
- …these systems we constantly look for opportunities across stack: network fabric and host networking, communications lib and scheduling infrastructure. **Required ... daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like ... Skills:** AI/HPC System Performance Engineer Responsibilities: 1. Lead…
- NVIDIA (Santa Clara, CA)
- …familiarity with software testing and deployment, familiarity with distributed systems, and excellent communication and planning abilities. Experience working with ... High Performance Computing (HPC), GPUs, and high-performance networking (RDMA, InfiniBand, RoCE) are strongly preferred. We also welcome out-of-the-box thinkers who…
- NVIDIA (Santa Clara, CA)
- …high-performance environments. + Published work, patents, or advanced certifications in networking or HPC systems. NVIDIA is widely considered to be one of the ... engine of modern Artificial Intelligence, Advanced Networking, and High Performance Computing (HPC) - the biggest technology breakthroughs of our time. We're on a…
- NVIDIA (Santa Clara, CA)
- …or distributed training systems. + Familiarity with datacenter automation, advanced network protocols, and supporting large HPC or AI clusters in production ... NVIDIA Deep Learning Frameworks Infrastructure team as a Senior Systems Engineer focusing on High-Performance AI & ... environments. + Understanding of fast, distributed storage systems like Lustre and GPFS for AI/HPC…
- NVIDIA (Santa Clara, CA)
- …We deliver communication runtimes like NCCL and NVSHMEM for Deep Learning and HPC applications. We are looking for a motivated Partner Enablement Engineer ... guide our key partners and customers with NCCL. Most DL/HPC applications run on large clusters with high-speed networking ... Develop tools and automation to isolate issues on new systems and platforms, including cloud platforms (Azure, AWS, GCP,…
- NVIDIA (Santa Clara, CA)
- …HPC/AI clusters at scale, with hands-on expertise with network topologies and large-scale switch/router deployments. + Familiarity with network ... making the impossible achievable, particularly within AI, ML, and HPC. Joining our team as a Storage & Networking Product Engineer involves being part of a group that fosters…
- Stanford University (Stanford, CA)
- …researchers from a variety of Stanford and SLAC organizations. The majority of the HPC systems are hosted in the Stanford Research Computing Facility (SRCF), ... Research Data Center Facility Engineer **Business Affairs: University IT (UIT), Stanford, California, ... Stanford Research Computing. Research Computing offers High Performance Computing (HPC) hosting services, computational and data systems,…
- NVIDIA (Santa Clara, CA)
- …wave of artificial intelligence. We are looking for a highly motivated senior software engineer for an exciting role in our communication libraries and network ... crew that develops and maintains software for complex heterogeneous computing systems that power disruptive products in High Performance Computing and Deep…
- NVIDIA (Santa Clara, CA)
- …in AI/HPC data center cooling, including immersion and two-phase systems. + Experience building predictive digital twin frameworks combining physical modeling ... NVIDIA's AI Factories are built to accelerate AI and HPC workloads. At their core the Digital Twin (physics-based ... tokens per watt across GPUs, cooling, power, and control systems. We are seeking a Senior AI Factory Digital…
- NVIDIA (Santa Clara, CA)
- …Docker containers & Jenkins pipelines + Certifications in storage (e.g., SNIA) or HPC systems or Storage Performance experience with mdtest or FIO tool. ... be. We are looking for a Senior Software Validation Engineer to lead software validation activities in the Datacenter ... streamlining our testing processes. + Validation of distributed Storage systems (e.g., Lustre) on AI/HPC Datacenter scale…
- Meta (Menlo Park, CA)
- **Summary:** In this role, you will be a member of the Network.AI Software team and part of the bigger DC networking organization. The team develops and owns the ... Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on…
- Meta (Menlo Park, CA)
- …with talented engineers, and contribute to the development of Meta's hyper-scale network infrastructure. **Required Skills:** Software Engineer - Host Networking ... to join our teams and help build scalable distributed systems, develop innovative solutions to our challenges, and ship ... 6. Design, develop, and deploy services to manage datacenter network switches and forwarding functions 7. Enhance HPC…
- Google (Sunnyvale, CA)
- Senior Software Engineer, AI/ML, Runtime Engines. Google, Sunnyvale, CA, USA. **Mid** Experience driving progress, solving problems, and ... + Experience in Machine Learning and High Performance Computing (HPC). + Ability to debug and program concurrent/parallel computations. ... on and is growing every day. As a software engineer, you will work on a specific project critical…
- NVIDIA (Santa Clara, CA)
- NVIDIA's Enterprise Product Engineering involves crafting, constructing, and maintaining vital systems efficiently and reliably. As a Senior Storage Product ... Engineer, you will take ownership of NVIDIA's Product Team's ... environments. We focus on delivering high-performance, highly available storage systems that scale while enabling developers to innovate rapidly…
- Super Micro Computer (San Jose, CA)
- …for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, Hyperscale, HPC and IoT/Embedded customers worldwide. We are the #5 fastest growing company ... and business leaders to join us. Job Summary: Sr. Manufacturing Engineer is responsible for facilitating improvement projects and Kaizen (Rapid Continuous…
- Broadcom (San Jose, CA)
- …experience. 2. Significant experience in RDMA protocol, QoS, Packet Classifications, Linux Systems programming, Linux kernel, Linux Network Drivers, Linux Kernel ... join the NIC product development team. As a Software Engineer, you will be responsible for designing and development ... Experience analyzing and tuning performance for a variety of HPC workloads. 7. Excellent programming skills in C, C++…