• Network Engineer, HPC

    Meta (Menlo Park, CA)
    …and efficiency in our global network. **Required Skills:** Network Engineer, HPC Systems Network Strategy Responsibilities: 1. Design, ... meeting our demands; you will be responsible for conceiving, developing, and deploying software, hardware and network systems and tools that improve reliability…
    Meta (10/16/25)
  • AI/HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active ... daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like...a loss-less fabric interconnect. To improve performance of these systems we constantly look for opportunities across stack: …
    Meta (10/16/25)
  • Site Reliability Engineer

    LTD Global (Berkeley, CA)
    …computing (HPC) and data analysis for the organization. Our center provides essential HPC and data systems to more than 10,000 researchers working in areas ... Position overview: We are seeking a Site Reliability Engineer to join our Operations Group. This role...part of a 24/7 operations team that ensures our systems are accessible, reliable, secure, and available to the…
    LTD Global (09/23/25)
  • Research Data Center Facility Engineer

    Stanford University (Stanford, CA)
    …researchers from a variety of Stanford and SLAC organizations. The majority of the HPC systems are hosted in the Stanford Research Computing Facility (SRCF), ... Research Data Center Facility Engineer Business Affairs: University IT (UIT), Stanford, California,...Stanford Research Computing. Research Computing offers High Performance Computing (HPC) hosting services, computational and data systems,…
    Stanford University (10/01/25)
  • Software Engineer, SystemML - AI…

    Meta (Menlo Park, CA)
    …Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on ... (e.g., Large-Scale GenAI/LLM training) from the trainer down to the inter-GPU and network communication layer. And we are seeking engineers to work on the…
    Meta (10/16/25)
  • Software Engineer - Datacenter networking

    Meta (Menlo Park, CA)
    …Meta's global data center networks. Our work covers the entire network lifecycle, including hardware development, capacity planning, distributed and centralized ... control systems, modeling/provisioning/automation, monitoring/troubleshooting/analytics, and simulation/design/failure analysis. We are actively seeking Software…
    Meta (09/10/25)
  • Software Engineer - Datacenter networking

    Meta (Menlo Park, CA)
    …Meta's global data center networks. Our work covers the entire network lifecycle, including hardware development, capacity planning, distributed and centralized ... control systems, modeling/provisioning/automation, monitoring/troubleshooting/analytics, and simulation/design/failure analysis. We are actively seeking Software…
    Meta (08/01/25)
  • AI Applications Engineer

    quadric.io, Inc (Burlingame, CA)
    …battery operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the ... co-optimized software and hardware is targeted to run neural network (NN) inference workloads in a wide variety of...C++ DSP and control code. Role: The Corporate Applications Engineer is the key bridge between development engineering and…
    quadric.io, Inc (08/26/25)
  • Software Engineer III, AI/ML, GPU…

    Google (Sunnyvale, CA)
    …latency and throughput. You will have experience with accelerators (TPUs or GPUs), or HPC. The ML, Systems, & Cloud AI (MSCA) organization at Google designs, ... Software Engineer III, AI/ML, GPU Inference, Optimization Google...implements, and manages the hardware, software, machine learning, and systems infrastructure for all Google services (Search, YouTube, etc.)…
    Google (10/15/25)
  • System Software Engineer

    Broadcom (San Jose, CA)
    …experience. 2. Significant experience in RDMA protocol, QoS, Packet Classifications, Linux Systems programming, Linux kernel, Linux Network Drivers, Linux Kernel ... join the NIC product development team. As a Software Engineer, you will be responsible for designing and development...Experience analyzing and tuning performance for a variety of HPC workloads. 7. Excellent programming skills in C, C++…
    Broadcom (08/30/25)
  • Sr. Hardware Dev Engineer (AWS Generative…

    Amazon (Cupertino, CA)
    …cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. AWS Infrastructure Services owns the design, planning, delivery, and ... to help. You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital…
    Amazon (10/08/25)
  • Senior Software Engineering Manager, TPU…

    Google (Sunnyvale, CA)
    …analysis and experience in performance modeling of High-Performance Computing (HPC) interconnect topologies. + Knowledge of computer architecture (Tensor Processing ... Like Google's own ambitions, the work of a Software Engineer goes beyond just Search. Software Engineering Managers have...enable cost effective performance and power of future ML systems such as fast iteration and innovation for ML…
    Google (09/30/25)