- Meta (Menlo Park, CA)
- …and efficiency in our global network. **Required Skills:** Network Engineer, HPC Systems Network Strategy Responsibilities: 1. Design, ... meeting our demands; you will be responsible for conceiving, developing, and deploying software, hardware and network systems and tools that improve reliability…
- NVIDIA (Santa Clara, CA)
- …familiarity with software testing and deployment, familiarity with distributed systems, and excellent communication and planning abilities. Experience working with ... High Performance Computing (HPC), GPUs, and high-performance networking (RDMA, Infiniband, RoCE) are strongly preferred. We also welcome out-of-the-box thinkers who…
- NVIDIA (Santa Clara, CA)
- …high-performance environments. + Published work, patents, or advanced certifications in networking or HPC systems. NVIDIA is widely considered to be one of the ... engine of modern Artificial Intelligence, Advanced Networking, and High Performance Computing (HPC) - the biggest technology breakthroughs of our time. We're on a…
- LTD Global (Berkeley, CA)
- …computing (HPC) and data analysis for the organization. Our center provides essential HPC and data systems to more than 10,000 researchers working in areas ... Position overview: We are seeking a Site Reliability Engineer to join our Operations Group. This role...part of a 24/7 operations team that ensures our systems are accessible, reliable, secure, and available to the…
- NVIDIA (Santa Clara, CA)
- …We deliver communication runtimes like NCCL and NVSHMEM for Deep Learning and HPC applications. We are looking for a motivated Partner Enablement Engineer ... guide our key partners and customers with NCCL. Most DL/HPC applications run on large clusters with high-speed networking...Develop tools and automation to isolate issues on new systems and platforms, including cloud platforms (Azure, AWS, GCP,…
- NVIDIA (Santa Clara, CA)
- …HPC/AI clusters at scale, with hands-on expertise with network topologies and large-scale switch/router deployments. + Familiarity with network ... making the impossible achievable, particularly within AI, ML, and HPC. Joining our team as a Storage & Networking Product Engineer involves being part of a group that fosters…
- Stanford University (Stanford, CA)
- …researchers from a variety of Stanford and SLAC organizations. The majority of the HPC systems are hosted in the Stanford Research Computing Facility (SRCF), ... Research Data Center Facility Engineer. Business Affairs: University IT (UIT), Stanford, California,...Stanford Research Computing. Research Computing offers High Performance Computing (HPC) hosting services, computational and data systems,…
- NVIDIA (Santa Clara, CA)
- …wave of artificial intelligence. We are looking for a highly motivated senior software engineer for an exciting role in our communication libraries and network ... crew that develops and maintains software for complex heterogeneous computing systems that power disruptive products in High Performance Computing and Deep…
- NVIDIA (Santa Clara, CA)
- …in AI/HPC data center cooling, including immersion and two-phase systems. + Experience building predictive digital twin frameworks combining physical modeling ... NVIDIA's AI Factories are built to accelerate AI and HPC workloads. At their core the Digital Twin (physics-based...tokens per watt across GPUs, cooling, power, and control systems. We are seeking a Senior AI Factory Digital…
- NVIDIA (Santa Clara, CA)
- …Docker containers & Jenkins pipelines + Certifications in storage (e.g., SNIA) or HPC systems or Storage Performance experience with mdtest or FIO tool. ... be. We are looking for a Senior Software Validation Engineer to lead software validation activities in the Datacenter...streamlining our testing processes. + Validation of distributed Storage systems (e.g., Lustre) on AI/HPC Datacenter scale…
- Meta (Menlo Park, CA)
- …Meta's global data center networks. Our work covers the entire network lifecycle, including hardware development, capacity planning, distributed and centralized ... control systems, modeling/provisioning/automation, monitoring/troubleshooting/analytics, and simulation/design/failure analysis. We are actively seeking Software…
- quadric.io, Inc (Burlingame, CA)
- …battery operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the ... co-optimized software and hardware is targeted to run neural network (NN) inference workloads in a wide variety of...C++ DSP and control code. Role: The Corporate Applications Engineer is the key bridge between development engineering and…
- NVIDIA (Santa Clara, CA)
- NVIDIA's Enterprise Product Engineering involves crafting, constructing, and maintaining vital systems efficiently and reliably. As a Senior Storage Product ... Engineer, you will take ownership of NVIDIA's Product Team's...environments. We focus on delivering high-performance, highly available storage systems that scale while enabling developers to innovate rapidly…
- Google (Sunnyvale, CA)
- Senior Staff Software Engineer, Networking. Google. Kirkland, WA, USA; Sunnyvale, CA, USA. **Advanced** Experience owning outcomes and decision ... of experience building and developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, storage, or hardware…
- Super Micro Computer (San Jose, CA)
- …for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, Hyperscale, HPC and IoT/Embedded customers worldwide. We are the #5 fastest growing company ... and business leaders to join us. Job Summary: Sr. Manufacturing Engineer is responsible for facilitating improvement projects and Kaizen (Rapid Continuous…
- Broadcom (San Jose, CA)
- …experience. 2. Significant experience in RDMA protocol, QoS, Packet Classifications, Linux Systems programming, Linux kernel, Linux Network Drivers, Linux Kernel ... join the NIC product development team. As a Software Engineer, you will be responsible for designing and development...Experience analyzing and tuning performance for a variety of HPC workloads. 7. Excellent programming skills in C, C++…
- NVIDIA (Santa Clara, CA)
- …GPU Computing. We are passionate about markets include gaming, automotive, vision, HPC, datacenters and networking in addition to our traditional OEM business. ... OS, FW and CUDA SW stack from design doc. + Installing and testing various systems OS, server firmware and SW stack. + Drive support for root cause analysis on…
- NVIDIA (Santa Clara, CA)
- …a discipline that involves designing, building, and maintaining large-scale production systems with high efficiency and availability. It encompasses various areas, ... including software and systems engineering practices, storage, data management, and services. Production..., and ensuring low-latency data access for high-performance computing (HPC) and AI/ML workloads. Storage Production Engineers at NVIDIA…
- Amazon (Cupertino, CA)
- …AWS cloud infrastructure that enables high performance and scalability in AI/ML and HPC workloads. You'll join a diverse AWS Hardware Engineering team of software, ... hardware, and network engineers, supply chain specialists, security experts, operations managers,...have tremendous interest in cloud scale and curious how systems and software decisions impact the user. You insist…