• Senior HPC Cluster

    NVIDIA (Santa Clara, CA)
    …make a lasting impact on the world. We are seeking a highly skilled and experienced HPC Cluster Engineer to design, deploy, and operate GPU Compute Clusters ... + Provide leadership and strategic mentorship on the management of large-scale HPC systems including the deployment of compute, networking, and storage. + Develop… more
    NVIDIA (09/17/25)
  • Senior AI-HPC Cluster

    NVIDIA (Santa Clara, CA)
    …+ Provide leadership and strategic mentorship on the management of large-scale HPC systems including the deployment of compute, networking, and storage. + Develop ... and operating large scale compute infrastructure. + Experience with AI/HPC job schedulers and orchestrators, such as Slurm, K8s or LSF. Applied experience with AI/HPC workflows that use MPI and NCCL. + Proficient… more
    NVIDIA (07/31/25)
  • Senior Platform Engineer

    Travelers Insurance Company (Hartford, CT)
    Senior Platform Engineer to support and manage our High-Performance Computing (HPC) Bright cluster environment, which is essential for our Large Language ... will provide backup support, management, and modernization of the High-Performance Compute Cluster (Nvidia Bright Cluster) and GPU workloads, enabling Travelers'… more
    Travelers Insurance Company (08/15/25)
  • Senior HPC Engineer

    Texas A&M University System (College Station, TX)
    Job Title Senior HPC Engineer Agency Texas A&M University Department Technology Services - IT Enterprise Operations Proposed Minimum Salary Commensurate Job ... members' faculty and staff providing cutting-edge research and super computing needs. As a Senior High Performance Computing Engineer (HPC), you will provide… more
    Texas A&M University System (10/03/25)
  • Senior GPU and HPC Infrastructure…

    NVIDIA (Santa Clara, CA)
    …and planning abilities. Experience working with High Performance Computing (HPC), GPUs, and high-performance networking (RDMA, Infiniband, RoCE) are strongly ... will be harnessing multiple data streams, ranging from GPU hardware diagnostics to cluster and network telemetry. + Work on software that manages NVLINK topography… more
    NVIDIA (07/10/25)
  • Senior HPC Support Engineer

    NVIDIA (Seattle, WA)
    We are seeking a motivated Senior HPC Technical Support Engineer - AI Infrastructure focusing on InfiniBand, NVLink and AI GPU Cluster technology, ... + InfiniBand, RDMA, NVLink and NVIDIA GPU Technology + Clustering or HPC Data-Center technologies including Upper Layer Protocols (ie, MPI, NCCL) + Additional… more
    NVIDIA (09/09/25)
  • Senior HPC Linux System…

    Leidos (Atlanta, GA)
    Description: The Public Health and Human Services Operation of Leidos is seeking a Senior HPC Linux System Administrator to lead a team of system ... planning, coordinating infrastructure support activities, leading and mentoring system administrators + HPC and cluster management: Proven experience with HPC… more
    Leidos (09/30/25)
  • Senior Solutions Architect, HPC

    NVIDIA (TX)
    NVIDIA is looking for an experienced GPU and network systems Solutions Architect & Engineer. Do you want to be part of a team that brings new Artificial Intelligence ... center GPU server and networking system deployments as Solution Architect Engineer. Guide customer discussions on network design, compute/storage and support bring… more
    NVIDIA (09/03/25)
  • Senior ML Platform Engineer, AI…

    NVIDIA (Santa Clara, CA)
    …Make the choice to join us today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design and implementation of ground ... + Provide leadership and strategic guidance on the management of large-scale HPC systems including the deployment of compute, networking, and storage. + Develop… more
    NVIDIA (08/21/25)
  • Senior Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …artificial intelligence. Join our team at NVIDIA as a Senior Site reliability engineer focused on HPC storage and play a crucial role in designing, ... software + Experience with RDMA (InfiniBand or RoCE) fabrics + Background with HPC cluster management tools such as Slurm, PBS, LSF, etc. + Passionate and… more
    NVIDIA (08/21/25)
  • Senior Solutions Architect - AI…

    NVIDIA (MA)
    …with an interest in advancing artificial intelligence (AI) and high-performance computing (HPC) in academic and research environments? We are looking for a Solutions ... background in building and deploying research computing clusters, deploying AI and HPC workloads, and optimizing system performance at scale. What you'll be doing:… more
    NVIDIA (09/17/25)
  • Associate Director, Sr Principal Systems…

    Bristol Myers Squibb (Princeton, NJ)
    …. Summary: Bristol Myers Squibb is looking for an experienced Sr Principal Systems Engineer in HPC/AI infrastructure to work with our technology teams and ... various stakeholders to design, manage, and support cutting-edge HPC/AI infrastructure platforms to serve our community of researchers and scientists, who are using… more
    Bristol Myers Squibb (09/30/25)
  • Senior Research Engineer

    NVIDIA (Santa Clara, CA)
    NVIDIA is searching for a senior or principal engineer who specializes in building cutting-edge infrastructure for large-scale foundation model training in the ... to support multi-modal foundation models for robotics. + Optimize GPU and cluster utilization for efficient model training and fine-tuning on massive datasets. +… more
    NVIDIA (09/05/25)
  • Senior System Software Engineer

    NVIDIA (Santa Clara, CA)
    …The data center platforms like GB200 NVL72 by NVIDIA are redefining AI, HPC, and cloud computing. To accommodate leading workloads globally, our diagnostic systems ... hardware technologies. We're in search of a visionary technical leader to engineer and propel innovation in diagnostics for NVIDIA's partner ecosystem. This role… more
    NVIDIA (09/10/25)
  • Senior Cloud Services Software…

    NVIDIA (Santa Clara, CA)
    …seeking a distributed software engineer to join our team! As a Senior engineer, you'll be instrumental in developing and optimizing AI infrastructure ... engineers. What You Will Be Doing: As a software engineer specializing in backend development, you'll work in a...well as container technologies like Docker and Kubernetes, and HPC/AI platforms such as Slurm. Ways to stand out… more
    NVIDIA (08/08/25)
  • Senior Platform Management Engineer

    General Motors (Austin, TX)
    …TX, or Warren, MI three times per week, at minimum. The Role: As a Senior Platform Management Engineer, you will provide technical and business support across ... You will leverage your functional understanding of High Performance Compute (HPC) architecture, infrastructure, and peripheral products to identify, diagnose, and… more
    General Motors (08/26/25)
  • Senior Firmware Engineer - CSP…

    NVIDIA (Santa Clara, CA)
    NVIDIA is seeking a Senior Firmware Engineer to join our CSP Engagements team, focusing on system software for Datacenter products such as GB200. This role ... see: + Deep expertise in data center server architectures, HPC systems, and hardware-software co-design. + Deep expertise in...out from the crowd: + Knowledge of cloud and cluster level deployment and management systems. + Experience with… more
    NVIDIA (10/01/25)
  • Senior Systems Software Security…

    NVIDIA (CO)
    …server architecture. In-depth understanding of the different deployment models for GPUs (eg, HPC, AI cluster, single- or multi-GPU servers). + Experience in Data ... NVIDIA is searching for a highly motivated, creative engineer with experience in system software security to join the Data Center Systems Software team. In this… more
    NVIDIA (09/26/25)
  • Senior Software QA Test Development…

    NVIDIA (Santa Clara, CA)
    …GPU Computing. We are passionate about markets include gaming, automotive, vision, HPC, datacenters and networking in addition to our traditional OEM business. ... Linux experience, reliability testing with various telemetries, scale out cluster, test plan development, track record in developing AI...are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want… more
    NVIDIA (09/24/25)
  • Senior Systems Administrator…

    Mount Sinai Health System (New York, NY)
    …and implements backup policies. + Assist in the management and maintenance of HPC cluster and data center work, including troubleshooting for resolving system ... data warehouse team and a research data services team. The Senior Systems Administrator/Engineer, as a member of the Scientific Computing and Data group, is… more
    Mount Sinai Health System (09/22/25)