• AI / HPC Systems

    Meta (Concord, NH)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...workloads that expect a lossless fabric interconnect. To improve performance of these systems we constantly look…
    Meta (03/22/25)
  • Senior DevOps Engineer - Accelerated Computing

    NVIDIA (Westford, MA)
    …Experience with HPC hardware systems such as compute clusters and HPC software performance benchmarking on such systems. + System administrator level ... Our team builds software that finds its way into AI applications, self-driving cars, and some of the world's...builds and tests on a lot of architectures, operating systems, and devices. + Collecting a lot of data…
    NVIDIA (04/25/25)
  • Applied Science Research Lab Manager

    NVIDIA (Westford, MA)
    …power, cooling, space, and user environments. + Deep understanding of operating systems, computer networks, and high-performance hardware and software in ... scientific computing. + Deep knowledge of distributed resource scheduling systems and orchestration tools such as Slurm, K8s. +...to stand out from the crowd: + Knowledge of HPC and AI solution technologies from CPU's…
    NVIDIA (04/11/25)