- Micron Technology, Inc. (San Jose, CA)
- …position in the Artificial Intelligence (AI), Machine Learning (ML) and High Performance Computing (HPC) business segments. You will be working on innovative ... you will be charged with defining and accomplishing the strategy for a High Performance Memory product portfolio that will further fortify Micron's leadership…
- Micron Technology, Inc. (San Jose, CA)
- …in growing the Artificial Intelligence (AI), Machine Learning (ML) and High-Performance Computing (HPC) business segments. You will be working on innovative ... of Work (SOWs), business term sheets, and other customer-facing documents for high-performance memory products. + Represent the Product Management team in Product…
- Meta (Menlo Park, CA)
- …fabric and host networking, communications lib and scheduling infrastructure. **Required Skills:** AI/HPC System Performance Engineer Responsibilities: ... a loss-less fabric interconnect with minimal latency. To improve performance of these systems we constantly look ... teamwork and close collaboration 3. Responsible for the overall performance of the communication system, including…
- Meta (Menlo Park, CA)
- …These workloads expect a loss-less fabric interconnect with minimal latency. To improve performance of these systems we constantly look for opportunities across ... host networking, communications lib and scheduling infrastructure. **Required Skills:** AI/HPC Network Engineering Manager Responsibilities: 1. Manage engineers…
- Meta (Menlo Park, CA)
- …on existing accelerator systems and guiding the future of models and AI HW at Meta. This drives improved performance, new model architectures and ... RCCL, UCC and MPI. 7. Guide Meta's AI HW requirements and design focusing on performance at System and Silicon levels. Co-design and optimize our AI HW…
- Stanford University (Stanford, CA)
- …same on any machine (Docker) and, when a laptop isn't enough, using campus high-performance computing (HPC) or a small cloud server to process larger datasets ... Data Science & AI Librarian, Stanford Law School **School of Law, ... for grants. + **Liaise with campus** data science institutes, HPC, and central library data services; use HPC…
- Deloitte (San Francisco, CA)
- …Solutions Architect) + 2+ years of experience with GPU computing (CUDA, OpenCL) and HPC system software stack Information for applicants with a need for ... in the cloud or on prem + Adopt best engineering practices in automation, HPC and AI/GenAI infrastructure and design patterns + Define and lead technology…
- quadric.io, Inc (Burlingame, CA)
- …wide variety of edge and endpoint devices, ranging from battery operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. ... using Quadric's SDK. This senior technical role demands expertise in system architecture, algorithm optimization, and the ability to provide technical leadership…
- Honeywell (San Jose, CA)
- …with cross-functional teams to optimize computing resources and improve system performance. + Monitor and troubleshoot computing systems to ensure optimal ... As a **Lead IT Engineer for High Performance Computing (HPC)** here at Honeywell, ... requires extensive knowledge of troubleshooting and engineering Linux operating systems, InfiniBand networking, parallel file system storage,…
- Insight Global (Palo Alto, CA)
- …with GPU architecture and parallel computing. - Background in kernel optimization and HPC systems. - Proficiency in CUDA and familiarity with NVIDIA's ... Description Insight Global is looking to hire a Senior Performance Engineer for a client in the quantum computing ... include: - Lead the design and build of specialized HPC environments. - Scale machine learning models on GPU…
- IBM (San Jose, CA)
- …Python, Rust, CUDA * Familiarity with executing HPC workloads * Familiarity with HPC system performance evaluation. At IBM, we pride ourselves on being ... technical areas in the context of hybrid cloud, AI systems, networking, security, high-speed networked-storage, accelerators, and HPC principles. The…
- Meta (Menlo Park, CA)
- …following machine learning/deep learning domains: Distributed ML Training, GPU architecture, ML systems, AI infrastructure, high performance computing, ... large-scale GPU training and inference fleet through an observable, reliable and high-performance distributed AI/GPU communication stack. Currently, one of the…
- Meta (Menlo Park, CA)
- …for network devices, transport stacks, and AI workloads 2. Debug complex system-level issues and lead performance tuning exercises to optimize software stack ... networks, powering our global data centers and supporting cutting-edge technologies like AI, Generative AI, Recommendation engines, and Metaverse. Our network…
- Meta (Menlo Park, CA)
- …tools, libraries, and frameworks (e.g., PyTorch, CUDA) 19. Full-stack experience and understanding of AI/HPC systems, with a focus on the application layer and ... of Meta's accelerators collective communications software library and optimizing distributed AI/ML workloads' performance. This is an opportunity to work…
- Broadcom (San Jose, CA)
- …you apply.** **Job Description:** Ethernet NIC product portfolio is designed for high performance computing and networking applications including AI and ML. This ... of the next generation of Ethernet NIC solutions for AI/ML and High performance computing applications. We ... is an added advantage. 6. Experience analyzing and tuning performance for a variety of HPC workloads.…
- Meta (Menlo Park, CA)
- …for network devices, transport stacks, and AI workloads 2. Debug complex system-level issues and lead performance tuning exercises to optimize software stack ... FPGAs, sensors, fan control, power etc), Board Support Package (BSP), Operating Systems, Kernel, Bootloader, Power Management, Real-Time Operating System (RTOS),…
- Meta (Menlo Park, CA)
- …levels 9. Experience in leading teams working on high performance computing (HPC) and AI/ML systems, including: 10. GPU/ASIC-based kernel development and ... systems for our fleet 4. Technical management 5. Experience in systems architecture, performance, workload-analysis and large scale distributed systems…
- Meta (Menlo Park, CA)
- …10. Experience in leading teams working on high performance computing (HPC) and AI/ML systems, including: GPU/ASIC-based kernel development and ... ROCm), distributed systems for large scale training and serving, and systems architecture and performance 11. Accelerator (GPU/ASIC) kernel development and…
- Stanford University (Stanford, CA)
- …with peers to ensure that proposed solutions do not adversely affect overall system performance and that solutions are consistent with local policies. * ... groups' meetings and presentations to assist with identifying promising tools and systems and to discuss their computational challenges and requirements. * Engage…