• AI / HPC Network

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Network Engineer Responsibilities: 1. Design, develop, ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...daily basis. We need to build and evolve our network infrastructure that connects myriads of GPUs together. In…
    Meta (05/08/25)
  • AI / HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. Lead ... 5. Work with cross functional teams and provide guidance on the AI network architecture including topologies, transport, congestion control techniques. **Minimum…
    Meta (04/20/25)
  • AI / HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. Active ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like…
    Meta (06/03/25)
  • Hardware Systems Engineer, NPI AI

    Meta (Menlo Park, CA)
    …Meta Silicon hyperscalar bring up and validation. **Required Skills:** Hardware Systems Engineer, NPI AI Responsibilities: 1. Lead the bring-up, validation, and ... ASIC productization in datacenter applications. 3. Utilize experience in accelerator and network ASIC architecture, AI workloads/ML models to design and…
    Meta (04/24/25)
  • Software Engineer, SystemML - AI

    Meta (Menlo Park, CA)
    …space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - AI Networking Responsibilities: 1. Tech-leading the ... this role, you will be a member of the AI Networking Software team and part of the bigger...Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and…
    Meta (04/22/25)
  • Software Engineer, SystemML - AI

    Meta (Menlo Park, CA)
    …space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - AI Networking Responsibilities: 1. Enabling reliable ... this role, you will be a member of the AI Networking Software team and part of the bigger...Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and…
    Meta (03/21/25)
  • Hardware Systems Engineer, AI

    Meta (Menlo Park, CA)
    …in exploring, developing and productizing high-performance software and hardware technologies for AI at datacenter scale. Hardware Systems Engineer in RTP work ... and optimize these systems in production. **Required Skills:** Hardware Systems Engineer, AI Systems Responsibilities: 1. Interface with external vendors…
    Meta (05/24/25)
  • AI Applications Engineer

    quadric.io, Inc (Burlingame, CA)
    …(GPNPU) architecture. Quadric's co-optimized software and hardware is targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint ... or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can only...C++ DSP and control code. Role: The Corporate Applications Engineer is the key bridge between development engineering and…
    quadric.io, Inc (03/14/25)
  • Production Systems Engineer, Sustaining

    Meta (Menlo Park, CA)
    …hardware requirements and specifications (e.g., configuring hardware components, GPU, memory, network for AI / HPC workloads) **Public Compensation:** ... **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP)...Responsibilities: 1. Develop robust, industry leading practices for supporting AI / HPC infrastructure at scale 2. Interface with…
    Meta (05/20/25)
  • R&D Applications Engineer

    Broadcom (San Jose, CA)
    …the latest Broadcom switch platforms and emerging network technologies optimized for AI and HPC workloads. + Contribute to hardware and low-level software ... Broadcom high-speed Ethernet switch solutions, specifically designed to accelerate AI/ML and High-Performance Computing (HPC) workloads. Our products…
    Broadcom (05/21/25)
  • Software Engineer, Accelerator Systems…

    Meta (Menlo Park, CA)
    HPC hardware requirements and specifications (e.g., configuring hardware components, GPU, memory, network for AI / HPC workloads). 14. Understanding of the ... Qualifications:** Preferred Qualifications: 11. Full-stack experience and understanding of AI / HPC systems, from HW/infrastructure through the application layer,…
    Meta (05/01/25)
  • Sr Staff Engineer, ML Infrastructure…

    LinkedIn (Mountain View, CA)
    LinkedIn is the world's largest professional network, built to create economic opportunity for every member of the global workforce. Our products help people make ... About the Role We are seeking a Senior Staff Engineer to design, build, and maintain our large-scale GPU...our large-scale GPU infrastructure for machine learning (ML) and AI workloads. In this role, you will be the…
    LinkedIn (04/18/25)
  • Technical Marketing Engineering - Artificial…

    Cisco (San Jose, CA)
    …advanced data center networking technologies. We seek an experienced Technical Marketing Engineer (TME) specializing in NX-OS architectures, including AI networking ... Nexus 9000, NXOS, Nexus Dashboard Fabric Controller (NDFC), and Network fabric for Artificial Intelligence (AI) applications....Your Impact As a senior NX-OS Networking Technical Marketing Engineer, you will bridge the gap between our…
    Cisco (05/07/25)
  • Sr Staff FAE

    Broadcom (San Jose, CA)
    …End-2-End congestion techniques, working experience on debugging Embedded Software, knowledge of HPC and AI/ML data center operational models, deep knowledge of ... please Sign-In before you apply.** **Job Description:** Software Field Applications Engineer (FAE) is software technical lead for Broadcom Ethernet controllers/…
    Broadcom (05/13/25)