- Bristol-Myers Squibb Company (Princeton, NJ)
- …Summary: Bristol Myers Squibb is looking for an experienced Sr Principal Systems Engineer in HPC / AI infrastructure to work with ... technology teams and various stakeholders to design, manage, and support cutting-edge HPC / AI infrastructure platforms to serve our community of researchers and… more
- VetJobs (Princeton, NJ)
- …personal lives. Summary: Bristol Myers Squibb is looking for an experienced Sr Principal Systems Engineer in HPC / AI infrastructure to work with our ... technology teams and various stakeholders to design, manage, and support cutting-edge HPC / AI infrastructure platforms to serve our community of researchers and… more
- Armada (Bellevue, WA)
- …Partner with engineering to design GPU allocation, scheduling, and orchestration systems Define multi-tenant GPU sharing capabilities with performance isolation ... to understand GPU workload patterns and requirements Define customer personas for edge AI, ML training, inference, and HPC workloads Collaborate with sales teams… more
- Meta (Austin, TX)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...workloads that expects a loss-less fabric interconnect. To improve performance of these systems we constantly look… more
- Dell Technologies (Round Rock, TX)
- **Principal Systems Development Engineer for AI and HPC solutions team** Our customers' system requirements are usually highly complex. Bringing together ... hardware and software systems design, Systems Development Engineering operates at...or 6+ years with a master's degree * High Performance Computer skills sets with experience working and managing… more
- NVIDIA (Santa Clara, CA)
- …looking for a technical leader to define a vision and roadmap for distributed observability systems for large-scale AI and HPC clusters and workloads and ... and visualization to spectacularly improve efficiency, performance, and productivity of AI and HPC workloads. You will lead technical teams to develop,… more
- NVIDIA (Santa Clara, CA)
- …fit for you, we'd love to hear from you! NVIDIA is seeking a Senior High Performance Computing (HPC) and AI Networking Performance Research and Analysis ... In this exciting role, you will profile and analyze AI workloads on large GPUs and CPUs scale clusters...and platforms, such as HCAs, Switches, CPUs, GPUs, and Systems. You will develop performance analysis tools… more
- Cisco (San Jose, CA)
- AI Infrastructure Engineer - HPC Apply (https://jobs.cisco.com/jobs/Login?projectId=1443781) + Location: San Jose, California, US + Alternate Location: Anywhere is ... and managing the internal NVIDIA DGX and Cisco-UCS based AI platforms at Cisco. You will provide leadership in...SaltStack, Puppet and/or Chef + Deep understanding of operating systems, computer networks, and high-performance applications. +… more
- Dell Technologies (Round Rock, TX)
- …with a bachelor's degree or 6+ years with a master's degree * High Performance Computer systems, setup management and use * Advanced understanding of appropriate ... **Principal Systems Development Engineer** Our customers' system requirements are...across extended teams * Experience managing and using High Performance Clusters, including knowledge in Slurm, Linux and Kubernetes… more
- NVIDIA (Santa Clara, CA)
- …Be Doing: + Primary responsibilities will include building and enabling robust AI / HPC infrastructure for customers + Support operational and reliability aspects ... of large-scale AI clusters, focusing on performance at scale,...in working with customers + Expertise with parallel file systems (e.g., Lustre, GPFS, BeeGFS, WekaIO) and high-speed interconnects… more
- Samsung SDS America (Ridgefield Park, NJ)
- …highly skilled and experienced Data Center Storage Engineer with exposure to High Performance Computing (HPC) and GPU Infrastructure. The ideal candidate will ... for HPC and GPU-intensive workloads. + Evaluate and implement high-performance storage technologies, including NVMe, SSD, parallel file systems (e.g.,… more
- Caris Life Sciences (Irving, TX)
- …A Senior HPC Architect is responsible for designing and optimizing high-performance computing (HPC) systems, leveraging their expertise in parallel ... analysis tools and techniques to identify and address performance bottlenecks. + Knowledge of HPC hardware...scientific software and other 3rd party software applications on HPC systems + Experience with HPC… more
- Amazon (Herndon, VA)
- …computing and its potential to overcome some of the biggest challenges in High Performance Computing (HPC)? Do you have a unique combination of deep technical ... C++, Python, CUDA, Bash - Deep GPU knowledge in HPC and/or AI/ML frameworks. Preferred Qualifications -...life sciences or related discipline. - Working knowledge of HPC schedulers and distributed/parallel file systems, underlying… more
- GliaCell Technologies (MD)
- …develops, tests, deploys, documents, maintains, and enhances complex and diverse software for HPC (high performance computing) systems based upon documented ... requirements. + The HPC systems might include, but are not limited to, processing-intensive analytics, novel algorithm development, manipulation of extremely… more
- NVIDIA (Santa Clara, CA)
- …vision? What you will be doing: + Investigate opportunities to improve communication performance by identifying bottlenecks in today's systems. + Design and ... implement new communication technologies to accelerate AI and HPC workloads. + Explore innovative solutions in HW and SW for our next generation platforms as… more
- NVIDIA (Santa Clara, CA)
- …and to power data centers. Join the team building many of the largest and fastest AI / HPC systems in the world! NVIDIA is looking for someone with the ... and internal teams to analyze, define, and implement large-scale AI / HPC projects. These efforts include a combination...they begin rolling out some of the most sophisticated systems in the world! + Provide feedback to internal… more
- Amazon (Cupertino, CA)
- Description We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations - the fundamental ... operations that enable AI to scale across multiple accelerators & servers. Most...building networking solutions that for Machine Learning (ML) and High-Performance Computing (HPC) workloads on AWS. We… more
- NVIDIA (Santa Clara, CA)
- …long term maintenance strategy. What you'll be doing: + Design highly available and scalable systems to meet the demands of our HPC clusters + Evaluate new and ... graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI and enabled the next era of computing. NVIDIA is a "learning… more
- Amazon (Herndon, VA)
- …to identify, develop, and execute growth opportunities leveraging GenAI, Agentic AI, and integrated, complex systems architectures to deliver transformational ... * Collaborate with solution architects and subject matter expertise to design scalable AI/ML and HPC solutions that meet complex mission requirements across… more
- NVIDIA (Santa Clara, CA)
- …improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialist to architect, develop ... parallel computing. More recently, GPU deep learning ignited modern AI - the next era of computing. NVIDIA is...looking for an outstanding hands-on architect/engineer for a Senior HPC architect role to support deployment and bringup of… more