- Meta (New York, NY)
- …and efficiency in our global network. **Required Skills:** Network Engineer, HPC Systems Network Strategy Responsibilities: 1. Design, ... meeting our demands; you will be responsible for conceiving, developing, and deploying software, hardware and network systems and tools that improve reliability…
- Meta (New York, NY)
- …these systems we constantly look for opportunities across stack: network fabric and host networking, communications lib and scheduling infrastructure. **Required ... daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like...Skills:** AI/HPC System Performance Engineer Responsibilities: 1. Lead…
- Oracle (Albany, NY)
- …global Oracle Cloud Infrastructure (OCI). Primarily focused on the development and support of network fabric and systems through a combination of a deep level ... cluster networking domain and enable seamless, accelerated High-Performance Compute (HPC), Artificial Intelligence and Machine Learning advancements. We envision a…
- Oracle (Albany, NY)
- …at OCI, we're running the RDMA network underneath your workload. A Principal Network Engineer on our team supports the design, deployment, and operations of ... team at OCI. We support and operate the RDMA/RoCE network fabrics for OCI's largest AI and HPC...OCI). Primarily focused on operation and support of RDMA/RoCE network fabrics and systems, through a combination…
- Oracle (Albany, NY)
- …Cloud) AI Infrastructure Innovation team is pioneering the creation of next-generation AI/HPC networking for GPU superclusters at massive scale. Our mission is to ... storage access. If you thrive at the intersection of large-scale distributed systems, high-speed networking, and AI workloads, this role offers the opportunity to…
- Meta (New York, NY)
- **Summary:** In this role, you will be a member of the Network.AI Software team and part of the bigger DC networking organization. The team develops and owns the ... Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on…
- Meta (New York, NY)
- …with talented engineers, and contribute to the development of Meta's hyper-scale network infrastructure. **Required Skills:** Software Engineer - Host Networking ... to join our teams and help build scalable distributed systems, develop innovative solutions to our challenges, and ship...6. Design, develop, and deploy services to manage datacenter network switches and forwarding functions 7. Enhance HPC…
- Oracle (Albany, NY)
- …IS-IS, MPLS, RSVP-TE, VxLAN, EVPN, DNS, and DHCP + Experience in GPU/RDMA network environments, High Performance Compute (HPC), or InfiniBand technologies + ... Experience with network monitoring and telemetry solutions, network configuration management, Linux systems administration + Experience leading…
- Mount Sinai Health System (New York, NY)
- …clinical data warehouse team and a research data services team. The **_Senior Systems Administrator/Engineer,_** as a member of the Scientific Computing and Data ... is the principal technology expert for Windows and Linux systems, and help support high-performance computing (HPC)...experience in designing, administering and troubleshooting Linux and Windows systems, storage systems, network and…
- Oracle (Albany, NY)
- …Support the high-level thermal design direction and data center strategy for complex systems ranging from advanced computing (HPC, GPU, FPGA Accelerators, etc.) ... **Job Description** As a Senior Principal Thermal Engineer, you will focus on the alignment of...hoses, and QD's for high end, high reliability compute systems + Establish a good working relationship with outside…