  • Senior GPU Cluster

    NVIDIA (Santa Clara, CA)
    …working with distributed system software architecture + Basic understanding of HPC GPU cluster, Slurm + Basic understanding of machine learning concepts and ... experience for customer as well as engineers supporting the cluster. Much of our software development focuses...running and instrumenting distributed LLM training on a multi-GPU HPC cluster + Knowledge of LLM… more
    NVIDIA (08/13/24)
  • Senior High Performance Computing…

    NVIDIA (Santa Clara, CA)
    …for a deeply technical HPC cluster administrator to lead a diverse cluster of GPU-accelerated systems and provide architectural mentorship to product teams ... team, you will provide leadership in the design and implementation of groundbreaking GPU compute cluster that runs demanding deep learning, high performance… more
    NVIDIA (06/26/24)
  • Senior Software Test Development…

    NVIDIA (Santa Clara, CA)
    We are looking for a highly experienced AI Senior Software Test development engineer in NVIDIA's Deep Learning SWQA team. The position is in NVIDIA Deep Learning ... to validate robustness and measure the performance of NVIDIA's Deep Learning software and GPU Infrastructure for autonomous driving, healthcare, speech… more
    NVIDIA (09/06/24)
  • Senior Software Test Development…

    NVIDIA (Santa Clara, CA)
    …to validate robustness and measure the performance of NVIDIA's Deep Learning software and GPU Infrastructure for autonomous driving, healthcare, speech ... We are looking for a Software Test development engineer in NVIDIA's Deep Learning...improve test automation. + Experience in validating Data Center GPU based infrastructure (multi-GPUs, multi-nodes, cluster). +… more
    NVIDIA (09/05/24)
  • Senior Software Engineer, Server…

    NVIDIA (Santa Clara, CA)
    NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More ... recently, GPU deep learning ignited modern deep learning - the...by doing failure analysis for whole system and architecting software and firmware to be fault resilient. You will… more
    NVIDIA (08/22/24)
  • Senior Software Architect - Data…

    NVIDIA (Santa Clara, CA)
    software and firmware stack for these systems. We are looking for a Senior Software Architect who has deep expertise in designing server platforms and has ... We are building innovative server systems for GPU accelerated applications, such as Deep Learning. Data...customers. What you'll be doing: + You will lead software activities for NVIDIA's deep learning server platforms, from… more
    NVIDIA (07/16/24)
  • Senior DevOps Engineer - DGX Cloud

    NVIDIA (Santa Clara, CA)
    …be used for a variety of AI workloads. This includes working on custom software related to GPU asset provisioning, configuration, and lifecycle management across ... deployments and toil elimination. We view DevOps as a software engineering discipline and expect significant contributions to our...You will be harnessing multiple data streams, ranging from GPU hardware diagnostics to cluster and network… more
    NVIDIA (08/29/24)
  • Senior Software QA Test Development…

    NVIDIA (Santa Clara, CA)
    NVIDIA is the world leader in GPU Computing. We are passionate about markets include gaming, automotive, vision, HPC, datacenters and networking in addition to our ... Computing Company', and NVIDIA GPUs are the brains powering Deep Learning software frameworks, analytics, data centers, and driving autonomous vehicles. We have some… more
    NVIDIA (09/05/24)
  • Senior Site Reliability Engineer - Internal…

    NVIDIA (Santa Clara, CA)
    …DPUs NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern ... computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI - the next...fixing problems before they occur + Building automation for Cluster bring up and scaled up operation. + Improving… more
    NVIDIA (09/12/24)
  • Senior Cloud Services Software

    NVIDIA (Santa Clara, CA)
    …seeking a distributed software engineer to join our team! As a Senior engineer, you'll be instrumental in developing and optimizing AI infrastructure services to ... resiliency for DGX Cloud. Your expertise in cloud services software architecture that drives the full resilience stack that...that allows the framework to be integrated with the cluster scheduler visibly to the users + Strong understanding… more
    NVIDIA (09/18/24)
  • Senior Software Development Engineer…

    NVIDIA (Santa Clara, CA)
    …a crucial role in testing, test content development and validating our software releases, ensuring that our products meet exceptional quality standards. With our ... Develop detailed test plans and perform testing for Compute software releases on different platforms, such as Tesla GPUs,...team. + Be responsible for testing cloud services, new GPU/system bring-up, Security Products and CUDA releases. + Enhance… more
    NVIDIA (09/04/24)
  • Senior AI-HPC Storage Engineer

    NVIDIA (Santa Clara, CA)
    …and models + Familiarity with InfiniBand with IBOP and RDMA + Background with Software Defined Networking and AI/HPC cluster networking + Familiarity with deep ... reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC...[AWS, Azure or GCP] + Experience with AI/HPC cluster job schedulers such as SLURM, LSF + In… more
    NVIDIA (09/18/24)
  • Senior Technical Program Manager, NPI…

    NVIDIA (Santa Clara, CA)
    …ranges from customer integrable HGX GPU accelerators, through turnkey modular DGX GPU/CPU products, to complete rack and L11 cluster solutions. You will be ... by great technology-and amazing people. We are looking to hire a Senior Technical Program Manager to lead the manufacturing operationalization efforts of… more
    NVIDIA (08/15/24)
  • Senior Solutions Architect, NPN

    NVIDIA (Santa Clara, CA)
    …end-to-end Machine Learning and Deep Learning solutions, using NVIDIA's compute, networking, and software stacks. Don't think this is a high-level slideshow job - we ... on-premises and cloud based. + 12+ years of proven experience with cluster management and related tools, including Docker Containers, Slurm, Kubernetes, and Ansible.… more
    NVIDIA (09/18/24)
  • Senior Linux Systems Engineer

    NVIDIA (Santa Clara, CA)
    …container runtimes, drivers+containers, and containerization of various high performance computing cluster software elements within a variety of environments. + ... next era of computing. An era in which our GPU acts as the brains of computers, robots, and...impact on the world. We are looking for a Senior Linux Software Engineer to join the… more
    NVIDIA (08/24/24)
  • Research Associate - Neutrino

    SLAC National Accelerator Laboratory (Menlo Park, CA)
    …computer science departments. Computing resources available for this work include local GPU clusters with NVIDIA GPUs (28 A100, 280 RTX 2080Ti), current allocation ... at the NERSC Perlmutter (A100 cluster), and other potential HPC centers where we apply...considered. + Knowledge in statistics, data analysis, algorithms and software development will be required. Strong background in AI/ML,… more
    SLAC National Accelerator Laboratory (08/26/24)