• AI / HPC Systems

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...workloads that expect a lossless fabric interconnect. To improve performance of these systems we constantly look… more
    Meta (08/01/25)
  • Senior AI - HPC Cluster Engineer…

    NVIDIA (Santa Clara, CA)
    …analyzing and tuning performance for a variety of AI / HPC workloads. Excellent problem-solving to analyze complex systems, identify bottlenecks, and ... and implement GPU compute clusters for deep learning and high-performance computing. What you'll be doing: + Provide leadership...storage systems like Lustre and GPFS for AI / HPC workloads. Experience working with deep learning… more
    NVIDIA (07/31/25)
  • Principal Systems Development Engineer…

    Dell Technologies (Round Rock, TX)
    **Principal Systems Development Engineer for AI and HPC solutions team** Our customers' system requirements are usually highly complex. Bringing together ... hardware and software systems design, Systems Development Engineering operates at...or 6+ years with a master's degree * High Performance Computer skill sets with experience working and managing… more
    Dell Technologies (07/20/25)
  • Senior Observability Architect, AI

    NVIDIA (Santa Clara, CA)
    …looking for a technical leader to define a vision and roadmap for distributed observability systems for large-scale AI and HPC clusters and workloads and ... and visualization to spectacularly improve efficiency, performance, and productivity of AI and HPC workloads. You will lead technical teams to develop,… more
    NVIDIA (05/15/25)
  • Senior HPC and AI Networking…

    NVIDIA (Santa Clara, CA)
    …fit for you, we'd love to hear from you! NVIDIA is seeking a Senior High Performance Computing (HPC) and AI Networking Performance Research and Analysis ... In this exciting role, you will profile and analyze AI workloads on large-scale GPU and CPU clusters...and platforms, such as HCAs, Switches, CPUs, GPUs, and Systems. You will develop performance analysis tools… more
    NVIDIA (07/11/25)
  • AI Infrastructure Engineer - HPC

    Cisco (San Jose, CA)
    AI Infrastructure Engineer - HPC Apply (https://jobs.cisco.com/jobs/Login?projectId=1443781) + Location: San Jose, California, US + Alternate Location: Anywhere is ... and managing the internal NVIDIA DGX and Cisco-UCS based AI platforms at Cisco. You will provide leadership in...SaltStack, Puppet and/or Chef + Deep understanding of operating systems, computer networks, and high-performance applications. +… more
    Cisco (07/15/25)
  • Software Systems Engineer for AI

    Dell Technologies (Round Rock, TX)
    …with a bachelor's degree or 6+ years with a master's degree * High Performance Computer systems, setup management and use * Advanced understanding of appropriate ... **Principal Systems Development Engineer** Our customers' system requirements are...across extended teams * Experience managing and using High Performance Clusters, including knowledge in Slurm, Linux and Kubernetes… more
    Dell Technologies (07/20/25)
  • Senior Software Architect, AI

    NVIDIA (Santa Clara, CA)
    …group at NVIDIA has openings for software architects in the field of AI and high-performance networking and system software. We research, develop, and ... and usable. + Creating proofs-of-concept to evaluate and motivate extensions in AI Frameworks (PyTorch/NEMO), HPC programming models (MPI, OpenSHMEM, PGAS), new… more
    NVIDIA (07/31/25)
  • Senior Solution Architect, HPC

    NVIDIA (Santa Clara, CA)
    …Be Doing: + Primary responsibilities will include building and enabling robust AI / HPC infrastructure for customers + Support operational and reliability aspects ... of large-scale AI clusters, focusing on performance at scale,...in working with customers + Expertise with parallel file systems (e.g., Lustre, GPFS, BeeGFS, WekaIO) and high-speed interconnects… more
    NVIDIA (06/18/25)
  • Senior Storage Engineer, HPC & GPU

    Samsung SDS America (Ridgefield Park, NJ)
    …highly skilled and experienced Data Center Storage Engineer with exposure to High Performance Computing (HPC) and GPU Infrastructure. The ideal candidate will ... for HPC and GPU-intensive workloads. + Evaluate and implement high-performance storage technologies, including NVMe, SSD, parallel file systems (e.g.,… more
    Samsung SDS America (06/21/25)
  • Sr. Worldwide Specialist Solutions Architect,…

    Amazon (Herndon, VA)
    …computing and its potential to overcome some of the biggest challenges in High Performance Computing (HPC)? Do you have a unique combination of deep technical ... C++, Python, CUDA, Bash - Deep GPU knowledge in HPC and/or AI/ML frameworks. Preferred Qualifications -...life sciences or related discipline. - Working knowledge of HPC schedulers and distributed/parallel file systems, underlying… more
    Amazon (06/12/25)
  • HPC System Administrator (Must Live…

    Lenovo (Morrisville, NC)
    …be expected to work effectively in providing remote technical services in the areas of HPC & AI platforms and solutions. Also, you will be responsible for ... HPC System Administrator (MUST LIVE IN METRO BOSTON...world's largest PC company with a full-stack portfolio of AI-enabled, AI-ready, and AI-optimized devices… more
    Lenovo (08/01/25)
  • Principal HPC Software Engineer

    GliaCell Technologies (MD)
    …develops, tests, deploys, documents, maintains, and enhances complex and diverse software for HPC (high performance computing) systems based upon documented ... requirements. + The HPC systems might include, but are not limited to, processing-intensive analytics, novel algorithm development, manipulation of extremely… more
    GliaCell Technologies (05/13/25)
  • Senior Software Architect - Deep Learning…

    NVIDIA (Santa Clara, CA)
    …vision? What you will be doing: + Investigate opportunities to improve communication performance by identifying bottlenecks in today's systems. + Design and ... implement new communication technologies to accelerate AI and HPC workloads. + Explore innovative solutions in HW and SW for our next generation platforms as… more
    NVIDIA (07/29/25)
  • Senior HPC Engineer, Infrastructure…

    NVIDIA (Santa Clara, CA)
    …and to power data centers. Join the team building many of the largest and fastest AI / HPC systems in the world! NVIDIA is looking for someone with the ... and internal teams to analyze, define, and implement large-scale AI / HPC projects. These efforts include a combination...they begin rolling out some of the most sophisticated systems in the world! + Provide feedback to internal… more
    NVIDIA (06/12/25)
  • Sr. Software Development Engineer, HPC/ML…

    Amazon (Cupertino, CA)
    Description We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations - the fundamental ... operations that enable AI to scale across multiple accelerators & servers. Most...building networking solutions for Machine Learning (ML) and High-Performance Computing (HPC) workloads on AWS. We… more
    Amazon (07/29/25)
  • Senior Software Engineer - HPC

    NVIDIA (Santa Clara, CA)
    …long term maintenance strategy. What you'll be doing: + Design highly available and scalable systems to meet the demands of our HPC clusters + Evaluate new and ... graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI and enabled the next era of computing. NVIDIA is a "learning… more
    NVIDIA (05/28/25)
  • Senior HPC Architect, Networking

    NVIDIA (Santa Clara, CA)
    …improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop ... parallel computing. More recently, GPU deep learning ignited modern AI - the next era of computing. NVIDIA is...looking for an outstanding hands-on architect/engineer for a Senior HPC architect role to support deployment and bring-up of… more
    NVIDIA (07/09/25)
  • Senior Site Reliability Engineer, HPC

    NVIDIA (Santa Clara, CA)
    …experience. Ways to stand out from the crowd: + Experience analyzing and tuning performance for a variety of HPC or EDA workloads. + Solid understanding ... NVIDIA is the leader in AI, machine learning and datacenter acceleration. NVIDIA is...and operate these clusters at high reliability, efficiency, and performance and drive foundational improvements and automation to improve… more
    NVIDIA (07/03/25)
  • HPC Middleware Developer

    NVIDIA (Santa Clara, CA)
    …Networking Protocols InfiniBand, Ethernet + Knowledge in computer architecture and operating systems + Experience in performance optimizations + MSc or ... We are now looking for a senior HPC software engineer. As a member of our High Performance Computing Software development team, you will be responsible for… more
    NVIDIA (06/30/25)