- Meta (Menlo Park, CA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially ... a loss-less fabric interconnect with minimal latency. To improve performance of these systems we constantly look…
- Meta (Columbus, OH)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially ... workloads that expects a loss-less fabric interconnect. To improve performance of these systems we constantly look…
- Meta (Austin, TX)
- … testing with focus on automation. 22. Experience in developing or debugging AI/HPC systems, performance optimizations, including familiarity with ... and/or similar languages. **Preferred Qualifications:** 16. Proficiency in High-Performance Computing (HPC) or AI system…
- Meta (Menlo Park, CA)
- … system architecture at rack level and at scale, as well as debugging AI/HPC systems, performance optimizations, including familiarity with relevant ... of issues. RTP team also helps in exploring, developing and productizing high-performance software and hardware technologies for AI at datacenter scale. RTP…
- Meta (Menlo Park, CA)
- …hardware and software components, co-design 15. Experience in developing or debugging AI/HPC systems, performance optimizations, including familiarity ... or supporting production hardware at scale 9. Experience in deploying and productionizing AI/HPC systems and/or related components at scale 10. Experience in…
- Deloitte (Columbus, OH)
- …day-to-day operations of the High-Performance Computing (HPC) and AI infrastructure, ensuring all systems meet or exceed requirements for scalability, ... Responsibilities: + System support and management of infrastructure for HPC and AI systems, this ... system performance, ensuring the efficient execution of AI models and HPC applications. Implement techniques…
- Amazon (Annapolis Junction, MD)
- …and storage - 5+ years building or optimizing computational applications for large scale HPC systems (e.g., physics based simulations) to take advantage of high ... Description Amazon Web Services is seeking a High Performance Computing (HPC) Solutions Architect to ... the world's technology? Come join us! Key job responsibilities HPC is growing in importance as these systems…
- NVIDIA (Santa Clara, CA)
- …to work effectively with diverse teams and individuals. + Experience analyzing and tuning performance for a variety of AI/HPC workloads. + Passion for ... GPU compute clusters that run demanding deep learning, high performance computing, and computationally intensive workloads. We seek a ... storage systems like Lustre and GPFS for AI/HPC workloads + Familiarity with deep learning…
- NVIDIA (Santa Clara, CA)
- …designing and operating large scale storage infrastructure. + Experience analyzing and tuning performance for a variety of AI/HPC workloads. + Experience ... join us today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership ... solutions to enable runs of demanding deep learning, high performance computing, and computationally intensive workloads. We seek an…
- NVIDIA (Santa Clara, CA)
- …looking for a technical leader to define a vision and roadmap for distributed observability systems for large-scale AI and HPC clusters and workloads and ... and visualization to spectacularly improve efficiency, performance, and productivity of AI and HPC workloads. You will lead technical teams to develop,…
- Meta (Bellevue, WA)
- …of RDMA workloads that expects a loss-less fabric interconnect. To enhance the performance of these systems, we continuously seek opportunities for improvement ... host networking, communication libraries, and scheduling infrastructure. **Required Skills:** AI/HPC Network Engineer Responsibilities: 1. Design, develop, test…
- Meta (Menlo Park, CA)
- …requirements of RDMA workloads that expects a loss-less fabric interconnect. To improve performance of these systems we constantly look for opportunities across ... fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Network Engineer Responsibilities: 1. Design, develop, test and…
- Cisco (Research Triangle Park, NC)
- …and technologies. Preferred Qualifications * Deep understanding of operating systems, computer networks, and high-performance applications. * Established ... Showcase the power of Cisco: our people, products, processes, systems, and data. Please join us and make this ... and managing the internal NVIDIA DGX and Cisco-UCS based AI platforms at Cisco. You will provide leadership in…
- Samsung SDS America (Ridgefield Park, NJ)
- …highly skilled and experienced Data Center Storage Engineer with exposure to High Performance Computing (HPC) and GPU Infrastructure. The ideal candidate will ... for HPC and GPU-intensive workloads. + Evaluate and implement high-performance storage technologies, including NVMe, SSD, parallel file systems (e.g.,…
- Caris Life Sciences (Irving, TX)
- …A Senior HPC Architect is responsible for designing and optimizing high-performance computing (HPC) systems, leveraging their expertise in parallel ... analysis tools and techniques to identify and address performance bottlenecks. + Knowledge of HPC hardware ... scientific software and other 3rd party software applications on HPC systems + Experience with HPC…
- General Dynamics Information Technology (Fairfax, VA)
- …with commonly used HPC applications and services (i.e., schedulers, high performance file systems, modules for installing applications, compilers, MPI, OpenMP, ... **Public Trust/Other Required:** None **Job Family:** Scientists **Skills:** High Performance Computing (HPC), Researching, Supercomputing **Experience:** 10+ years…
- Caris Life Sciences (Irving, TX)
- …Performance Computing) Engineer is responsible for implementing and maintaining High Performance Computing (HPC) systems primarily running on Linux ... network settings, storage systems, and parallel file systems like GPFS. + Monitoring system performance, ... of computing resources. + Implementing security measures to protect HPC systems and data from unauthorized access.…
- General Dynamics Information Technology (Annapolis Junction, MD)
- …Required:** None **Job Family:** Systems Engineering **Skills:** Complex Systems, High-Performance Computing (HPC) Systems, Linux, Management Tools, ... of related experience **US Citizenship Required:** Yes **Job Description:** HPC Systems Engineer GDIT is seeking a ... + Participate in the design of information and operational systems + Monitor and test application performance…
- GliaCell Technologies (MD)
- …develops, tests, deploys, documents, maintains, and enhances complex and diverse software for HPC (high performance computing) systems based upon documented ... requirements. + The HPC systems might include, but are not limited to, processing-intensive analytics, novel algorithm development, manipulation of extremely…
- NVIDIA (Santa Clara, CA)
- …vision? What you will be doing: + Investigate opportunities to improve communication performance by identifying bottlenecks in today's systems. + Design and ... implement new communication technologies to accelerate AI and HPC workloads. + Explore innovative solutions in HW and SW for our next generation platforms as…