• Cluster Deployment Operations…

    NVIDIA (Santa Clara, CA)
    …people to make them operational in production? We are seeking a dedicated Cluster Deployment Operations Engineer to support product deployments and issues by ... years of experience in at least two of the following: HPC/large-scale cluster administration, Linux systems engineering, infrastructure automation (eg, Ansible,… more
    NVIDIA (12/18/25)
  • Senior Platform and EngOps Engineer

    NVIDIA (Santa Clara, CA)
    …DevOps tools to automate software updates, perform maintenance tasks, and monitor cluster availability, ensuring seamless operations. + Take ownership of daily ... cluster failures and issues, troubleshooting them promptly to maintain...in deploying and administrating clusters, servers, switches, and related infrastructure. + Automation expert with hands on skills in… more
    NVIDIA (11/01/25)
  • AI and ML HPC Cluster Engineer

    NVIDIA (Santa Clara, CA)
    …the world's most advanced computing workloads. NVIDIA is looking for an AI/ML HPC Cluster Engineer to join our MARS team. You will provide technical engagement ... mission, our team, Managed AI Superclusters (MARS) builds and scales the infrastructure, platforms, and tools that enable researchers and engineers to develop the… more
    NVIDIA (01/10/26)
  • Senior HPC Cluster Engineer - EDA

    NVIDIA (Santa Clara, CA)
    …lasting impact on the world. We are seeking a highly skilled and experienced HPC Cluster Engineer to design, deploy, and operate GPU Compute Clusters for EDA and ... of 5 years of proven experience crafting and operating large scale compute infrastructure, including cluster configuration management tools such as BCM or… more
    NVIDIA (12/10/25)
  • R&D Engineer, VCF Cluster

    Broadcom (Bellevue, WA)
    …you apply.** **Job Description:** **About Broadcom** Broadcom Inc. is a global infrastructure technology leader built on 50 years of innovation, collaboration, and ... We design, develop, and supply a broad range of semiconductor and infrastructure software solutions. Our category-leading product portfolios serve the world's most… more
    Broadcom (12/24/25)
  • Performance Benchmarking Engineer

    Oracle (Seattle, WA)
    **Job Description** OCI AI Infrastructure is at the forefront of building cutting-edge GPU supercomputers that scale to tens of thousands of GPUs without ... team strives to be the go-to experts on RDMA cluster architecture and its relationship to AI/ML/HPC performance. We...+ Troubleshoot performance problems on RDMA clusters and perform cluster performance validation, including on very novel and not… more
    Oracle (11/25/25)
  • Senior Software Engineer - Market Data…

    Bloomberg (New York, NY)
    Senior Software Engineer - Market Data Platform, Cluster Management Location New York Business Area Engineering and CTO Ref # 10046371 **Description & ... in it for you:** As a Market Data Platform engineer, you will: + Get hands-on experience working on...and diagnosing unexpected issues in production. The market data infrastructure you'll help build and improve is mission-critical for… more
    Bloomberg (11/15/25)
  • Senior AI and ML HPC Cluster

    NVIDIA (Santa Clara, CA)
    …Make the choice to join us today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design and implementation of ground breaking ... + Minimum 5+ years of experience designing and operating large scale compute infrastructure + Experience with AI/HPC advanced job schedulers, such as Slurm, K8s,… more
    NVIDIA (01/10/26)
  • Senior AI-HPC Cluster Engineer

    NVIDIA (Santa Clara, CA)
    …Minimum of 6 years of experience crafting and operating large scale compute infrastructure. + Experience with AI/HPC job schedulers and orchestrators, such as Slurm, ... staying ahead of new technologies and effective approaches in the HPC and AI/ML infrastructure fields. Ways to stand out from the crowd: + Experience with NVIDIA… more
    NVIDIA (10/30/25)
  • Senior GPU and HPC Infrastructure

    NVIDIA (Santa Clara, CA)
    NVIDIA is hiring engineers to scale up its AI Infrastructure. We expect you to have a strong programming background, knowledge of datacenter hardware, operations, ... help advance NVIDIA's capacity to build and deploy leading infrastructure solutions for a broad range of AI-based applications...multiple data streams, ranging from GPU hardware diagnostics to cluster and network telemetry. + Work on software that… more
    NVIDIA (01/08/26)
  • Sr. Devops Infrastructure Engineer

    TEKsystems (Hillsboro, OR)
    Description We're looking for a Senior DevOps and Infrastructure Engineer to work in IPP's (Infrastructure, Planning and Process) Cloud Infrastructure ... Deep Learning, Artificial Intelligence and Driverless Cars to cater to their infrastructure needs. These cloud services provide almost half a million automated jobs… more
    TEKsystems (01/13/26)
  • Staff Software Engineer, AI…

    Google (Sunnyvale, CA)
    Staff Software Engineer, AI Infrastructure. Google (Kirkland, WA, USA; Sunnyvale, CA, USA). **Advanced** Experience owning outcomes and ... on and is growing every day. As a software engineer, you will work on a specific project critical...cluster interconnects and networking. We're building the AI infrastructure for the future, so if you are interested… more
    Google (01/10/26)
  • Senior Infrastructure Software…

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior Infrastructure Software Engineer for Deep Learning Libraries! NVIDIA's Deep Learning Libraries Group is seeking excellent ... kernel libraries. The mission is to design and develop scalable, modular infrastructure that streamlines development, builds, and tests across NVIDIA's diverse set… more
    NVIDIA (12/13/25)
  • Senior Research Engineer, Foundation Model…

    NVIDIA (Santa Clara, CA)
    NVIDIA is searching for a senior or principal engineer who specializes in building cutting-edge infrastructure for large-scale foundation model training in the ... to support multi-modal foundation models for robotics. + Optimize GPU and cluster utilization for efficient model training and fine-tuning on massive datasets. +… more
    NVIDIA (12/05/25)
  • Staff Software Engineer, AI/ML…

    Google (Sunnyvale, CA)
    Staff Software Engineer, AI/ML Infrastructure. Google (Kirkland, WA, USA; Sunnyvale, CA, USA). **Advanced** Experience owning outcomes and ... on and is growing every day. As a software engineer, you will work on a specific project critical...clusters using the latest technologies for AI acceleration and cluster interconnects and networking. The AI and Infrastructure… more
    Google (01/10/26)
  • Principal Software Engineer - AI…

    Oracle (Montgomery, AL)
    …Oracle Cloud Infrastructure (OCI) is looking for a Principal Software Engineer to lead the development of scalable, resilient, and secure infrastructure ... efficiency at cloud hyperscale + Contribute to the evolution of OCI's infrastructure into next-gen cluster and automation frameworks Disclaimer: **Certain US… more
    Oracle (11/25/25)
  • (Senior) Software Engineer

    pony.ai (Fremont, CA)
    …public at NASDAQ in November 2024. Responsibilities As a (Senior) Kubernetes Engineer, you will: + Design, operate, and optimize Kubernetes clusters across hybrid ... CRDs, APIs) to automate and productize internal use cases. + Own cluster lifecycle management including upgrades, patching, configuration, and governance. + Define… more
    pony.ai (12/16/25)
  • Staff Software Engineer - Compute…

    LinkedIn (Mountain View, CA)
    …needs of the team. Job Description As a staff member of the Compute Infrastructure team at LinkedIn, you will be charged with building the next-generation ... infrastructure and platforms for LinkedIn. This is a unique...multi-clusters solutions, automate upgrades, and intelligently detect and remediate cluster health, etc. In this role, you will be… more
    LinkedIn (11/21/25)
  • Staff Software Engineer, Emerging On-prem…

    Google (Sunnyvale, CA)
    Staff Software Engineer, Emerging On-prem AI Infrastructure. Google (Kirkland, WA, USA; Sunnyvale, CA, USA). **Advanced** Experience owning ... + 5 years of experience building and developing large-scale infrastructure, distributed systems or networks, or experience with compute...on and is growing every day. As a software engineer, you will work on a specific project critical… more
    Google (12/18/25)
  • Senior IS Systems Engineer

    Ochsner Health (New Orleans, LA)
    …provides advanced engineering, administration, and operational support for core infrastructure platforms with a focus on Nutanix, virtualization technologies, Active ... services, and enterprise DNS DHCP IPAM NTP through InfoBlox. The engineer ensures platform stability, optimizes performance, strengthens identity and network… more
    Ochsner Health (11/27/25)