  • Performance Benchmarking

    Oracle (Seattle, WA)
    …Design and code solutions for performance benchmarking. + Troubleshoot performance problems on RDMA clusters and perform cluster performance ... team strives to be the go-to experts on RDMA cluster architecture and its relationship to AI/ML/HPC performance ... with 5+ years of relevant experience + Experience with benchmarking and troubleshooting or optimizing performance of…
    Oracle (11/25/25)
  • Senior HPC Cluster Engineer - EDA

    NVIDIA (Santa Clara, CA)
    …lasting impact on the world. We are seeking a highly skilled and experienced HPC Cluster Engineer to design, deploy, and operate GPU Compute Clusters for EDA and ... high-performance computing workloads used across multiple teams and projects. ... experience crafting and operating large scale compute infrastructure, including cluster configuration management tools such as BCM or Ansible.…
    NVIDIA (09/17/25)
  • Senior AI and ML HPC Cluster

    NVIDIA (Santa Clara, CA)
    …breaking GPU compute clusters that run demanding deep learning, high performance computing, and computationally intensive workloads. We seek a technical leader ... compute, networking, and storage design for large scale, high performance workloads, effective resource utilization in a heterogeneous compute environment,…
    NVIDIA (10/19/25)
  • Senior AI-HPC Cluster Engineer

    NVIDIA (Santa Clara, CA)
    …graphics. Design and implement GPU compute clusters for deep learning and high-performance computing. What you'll be doing: + Provide leadership and strategic ... user needs. + Support our researchers to run their workloads including performance analysis and optimizations. + Conduct root cause analysis and suggest corrective…
    NVIDIA (10/30/25)
  • Senior System Software Engineer - AI…

    NVIDIA (Santa Clara, CA)
    …developing tools for AI researchers and SW/HW teams running AI workload in GPU cluster. As a member of the software development team, we will work with users ... debugging tricky failures and issues to help improve the performance and efficiency of the system. What you'll be ... common encountered problems like memory or networking + Create benchmarking and simulation technologies for AI system or GPU…
    NVIDIA (09/19/25)
  • Senior DGX Cloud Performance

    NVIDIA (Santa Clara, CA)
    performance and AI workloads on large scale systems + Experience with performance modeling and benchmarking at scale + Strong background in Computer ... seeking highly skilled Parallel and Distributed Systems engineers to drive the performance analysis, optimization, and modeling to define the architecture and design… more
    NVIDIA (11/21/25)
  • (USA) Principal, Software Engineer

    Walmart (Sunnyvale, CA)
    …atop for advanced debugging. + Perform deep analysis of OSD, MON, MDS, RGW performance and optimize cluster parameters. + Debug network congestion, packet loss, ... hardware (NVMe SSDs, RDMA NICs, high-density HDDs) and their impact on storage performance. + Evaluate next-gen server SKUs, perform benchmarking, and compare…
    Walmart (11/20/25)
  • HPC Systems Engineer

    University of Pennsylvania (Philadelphia, PA)
    …seeking a highly qualified and motivated High Performance Computing (HPC) Systems Engineer to join the team. PARCC's main cluster (Betty) delivers HPC, ... research community. + Optimize, monitor, and troubleshoot HPC file systems for performance and reliability. + Conduct system benchmarking and develop automated…
    University of Pennsylvania (10/11/25)
  • Senior MLOps Engineer, GenAI Framework

    NVIDIA (Santa Clara, CA)
    …and cloud compute technologies, e.g., SLURM, Lustre, k8s + Software and hardware benchmarking on high-performance computing systems. Your base salary will be ... dedicated and motivated senior build and continuous integration (CI/CD) engineer for its GenAI Frameworks (Megatron-LM (https://github.com/NVIDIA/Megatron-LM) and NeMo…
    NVIDIA (11/14/25)
  • Senior DGX Cloud Performance

    NVIDIA (Santa Clara, CA)
    performance and AI workloads on large scale systems + Experience with performance modeling and benchmarking at scale + Strong background in Computer ... seeking highly skilled Parallel and Distributed Systems engineers to drive the performance analysis, optimization, and modeling to define the architecture and design… more
    NVIDIA (10/22/25)
  • Principal / Sr. Principal HPC Network…

    Northrop Grumman (Annapolis Junction, MD)
    …making history. We are looking for you to join our team as a High-Performance Computing (HPC) Network Engineer based out of Annapolis Junction ... Responsibilities + Monitor and maintain performance of network within a high-performance compute cluster + Contribute to design of new high-…
    Northrop Grumman (11/07/25)
  • Senior Software Engineer - NIM Factory…

    NVIDIA (Santa Clara, CA)
    …Kubernetes deployment patterns for NIMs, including GPU scheduling, autoscaling, and multi-cluster rollouts. + Optimize container performance: layer layout, ... understanding different inference backends (vLLM, SGLang, TRT-LLM) + Background in benchmarking and optimizing inference container performance and startup…
    NVIDIA (09/19/25)
  • Staff Software Engineer, Level 6

    Snap Inc. (Seattle, WA)
    …coalescing, and slot-aware load balancing. + Implement robust failover, replication, and cluster topology management and optimize CPU performance, memory usage, ... privacy at the forefront. We're looking for a Software Engineer to join Snap Inc on our Core Infrastructure ... layers or custom client lib). + Develop and maintain high-performance caching proxies or client side libraries for request…
    Snap Inc. (09/12/25)
  • Human Factors Engineer

    Ford Motor Company (Dearborn, MI)
    …and more! Vehicle Engineering Attribute Engineers deliver customer centric vehicle performance. The Product Excellence and Human Factors (PEHF) team ensures that ... experience fully meet customer expectations. The role of Human Factors Program Engineer involves following programs from early inception until launch and ensuring…
    Ford Motor Company (11/25/25)