  • AI / HPC Network

    Meta (Menlo Park, CA)
    … fabric, host networking, communication libraries, and scheduling infrastructure. **Required Skills:** AI / HPC Network Engineer Responsibilities: 1. ... software, leveraging software defined networking principles. 14. Understanding of AI technologies and associated network technologies (IB/RDMA/RoCE) **Preferred… more
    Meta (03/04/25)
  • AI / HPC Network

    Meta (Menlo Park, CA)
    … fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Network Engineer Responsibilities: 1. Design, develop, ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...daily basis. We need to build and evolve our network infrastructure that connects myriads of GPUs together. In… more
    Meta (02/06/25)
  • AI / HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. Lead ... 5. Work with cross functional teams and provide guidance on the AI network architecture including topologies, transport, congestion control techniques. **Minimum… more
    Meta (04/20/25)
  • AI / HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. Active ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like… more
    Meta (03/05/25)
  • Solutions Architect, HPC Systems…

    NVIDIA (Santa Clara, CA)
    NVIDIA is looking for an experienced GPU and network systems Solutions Architect & Engineer. Do you want to be part of a team that brings new Artificial ... Intelligence (AI) hardware and software technologies to production in customer...GPU server and networking system deployments as Solution Architect Engineer. Guide customer discussions on network design,… more
    NVIDIA (04/17/25)
  • Senior Technical Marketing Engineer

    NVIDIA (Santa Clara, CA)
    …how you can make a lasting impact on the world. As a Senior Technical Marketing Engineer for AI Infrastructure, you will join a dedicated team that is passionate ... equivalent experience. + 5+ years of experience. + Proficiency in Python and C++ for AI and HPC applications. + Experience using large scale multi node GPU… more
    NVIDIA (04/30/25)
  • Hardware Systems Engineer, NPI AI

    Meta (Menlo Park, CA)
    …approach to hyperscalar bring up and validation. **Required Skills:** Hardware Systems Engineer , NPI AI Lead Responsibilities: 1. Lead the bring-up, validation, ... with a focus on datacenter applications. 3. Utilize experience in accelerator and network architecture, AI workloads/ML models to design and implement robust… more
    Meta (04/26/25)
  • Hardware Systems Engineer, NPI AI

    Meta (Menlo Park, CA)
    …Meta Silicon hyperscalar bring up and validation. **Required Skills:** Hardware Systems Engineer , NPI AI Responsibilities: 1. Lead the bring-up, validation, and ... ASIC productization in datacenter applications. 3. Utilize experience in accelerator and network ASIC architecture, AI workloads/ML models to design and… more
    Meta (04/24/25)
  • Software Engineer, SystemML - AI

    Meta (Menlo Park, CA)
    …space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer , SystemML - AI Networking Responsibilities: 1. Tech-leading the ... this role, you will be a member of the AI Networking Software team and part of the bigger...Library), which enables multi-GPU and multi-node data communication through HPC -style collectives. NCCL has been integrated into PyTorch and… more
    Meta (04/22/25)
  • Software Engineer, SystemML - AI

    Meta (Menlo Park, CA)
    …space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer , SystemML - AI Networking Responsibilities: 1. Enabling reliable ... this role, you will be a member of the AI Networking Software team and part of the bigger...Library), which enables multi-GPU and multi-node data communication through HPC -style collectives. NCCL has been integrated into PyTorch and… more
    Meta (03/21/25)
  • Sr. Hardware Dev Engineer (AWS Generative…

    Amazon (Cupertino, CA)
    …and operating AWS cloud offerings that enable high performance and scalability in AI /ML and HPC workloads. AWS Infrastructure Services owns the design, planning, ... Do you want to build the backbone of Generative AI cloud at AWS? Do you want to build...You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers,… more
    Amazon (05/01/25)
  • AI Applications Engineer

    quadric.io, Inc (Burlingame, CA)
    …(GPNPU) architecture. Quadric's co-optimized software and hardware is targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint ... or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can only...C++ DSP and control code. Role: The Corporate Applications Engineer is the key bridge between development engineering and… more
    quadric.io, Inc (03/14/25)
  • Production Systems Engineer, Sustaining

    Meta (Menlo Park, CA)
    …hardware requirements and specifications (eg, configuring hardware components, GPU, memory, network for AI / HPC workloads) **Public Compensation:** ... **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP)...Responsibilities: 1. Develop robust, industry leading practices for supporting AI / HPC infrastructure at scale 2. Interface with… more
    Meta (04/23/25)
  • System Engineer - Interconnect

    Meta (Menlo Park, CA)
    …custom AI accelerators. Meta is developing one of the world's highest performant AI / HPC clusters using custom-designed AI accelerators. In this role, you ... will have a unique opportunity to shape the future AI / HPC of Meta by specifying technical requirements...build some of the world's most open and efficient AI platforms. **Required Skills:** System Engineer -… more
    Meta (04/19/25)
  • Software Engineer, Accelerator Systems…

    Meta (Menlo Park, CA)
    HPC hardware requirements and specifications (eg, configuring hardware components, GPU, memory, network for AI / HPC workloads). 14. Understanding of the ... Qualifications:** Preferred Qualifications: 11. Full-stack experience and understanding of AI / HPC systems, from HW/infrastructure through the application layer,… more
    Meta (05/01/25)
  • Software Engineer, Accelerator Solutions…

    Meta (Menlo Park, CA)
    HPC hardware requirements and specifications (eg, configuring hardware components, GPU, memory, network for AI / HPC workloads). 18. Understanding of the ... Qualifications:** Preferred Qualifications: 15. Full-stack experience and understanding of AI / HPC systems, from hardware and infrastructure through the… more
    Meta (05/01/25)
  • Senior System Software Engineer, NCCL…

    NVIDIA (Santa Clara, CA)
    …test design + Experience working with engineering or academic research community supporting HPC or AI + Practical experience with high performance networking: ... runtimes like NCCL and NVSHMEM for Deep Learning and HPC applications. We are looking for a motivated Partner...applications. We are looking for a motivated Partner Enablement Engineer to guide our key partners and customers with… more
    NVIDIA (04/22/25)
  • Sr Staff Engineer, ML Infrastructure…

    LinkedIn (Mountain View, CA)
    LinkedIn is the world's largest professional network, built to create economic opportunity for every member of the global workforce. Our products help people make ... About the Role We are seeking a Senior Staff Engineer to design, build, and maintain our large-scale GPU...our large-scale GPU infrastructure for machine learning (ML) and AI workloads. In this role, you will be the… more
    LinkedIn (04/18/25)
  • Senior Software QA Test Development…

    NVIDIA (Santa Clara, CA)
    …GPU Computing. We are passionate about markets include gaming, automotive, vision, HPC , datacenters and networking in addition to our traditional OEM business. ... NVIDIA is also well positioned as the ' AI Computing Company', and NVIDIA GPUs are the brains...test cases automation + Strong experience in FW, BMC/OpenBMC, Network protocol, internal/external enterprise storage devices, PCIe buses and… more
    NVIDIA (04/16/25)
  • Signal and Power Integrity Engineer

    Google (Sunnyvale, CA)
    …unparalleled performance, efficiency, and integration. As a Signal Integrity/Power Integrity Engineer, you will lead chip and package design, ensuring optimal SI/PI ... chip and advanced packages. The ML, Systems, & Cloud AI (MSCA) organization at Google designs, implements, and manages...from developing our latest TPUs to running a global network, while driving towards shaping the future of hyperscale… more
    Google (04/18/25)