- Meta (Austin, TX)
- …fabric and host networking, communications lib and scheduling infrastructure. **Required Skills:** AI/HPC System Performance Engineer Responsibilities: ... a loss-less fabric interconnect with minimal latency. To improve performance of these systems we constantly look...teamwork and close collaboration 3. Responsible for the overall performance of the communication system, including…
- Amazon (Austin, TX)
- …computing and its potential to overcome some of the biggest challenges in High Performance Computing (HPC)? Do you have a unique combination of deep technical ... C++, Python, CUDA, Bash - Deep GPU knowledge in HPC and/or AI/ML frameworks. Preferred Qualifications -...life sciences or related discipline. - Working knowledge of HPC schedulers and distributed/parallel file systems, underlying…
- Amazon (Austin, TX)
- …and operating AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. You are intrigued by the continuous release of ... Want to do industry leading work delivering continuous price performance improvements in the cloud for AI...have tremendous interest in cloud scale and curious how systems and software decisions impact the user. You insist…
- Oracle (Austin, TX)
- …Responsibilities + Lead architecture, system design, and implementation for high-performance RDMA solutions across OCI's AI/HPC platforms, including ... If you thrive at the intersection of large-scale distributed systems, high-speed networking, and AI workloads, this...performance tuning at scale. + Familiarity with AI/HPC stacks and workloads: NCCL/RCCL/MPI, Slurm or…
- Oracle (Austin, TX)
- …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI/ML/HPC workloads. This is your chance to be part of the ... AI revolution, creating systems that allow customers...and diagnostic services. These are essential for running distributed AI/ML/HPC workloads across thousands of GPUs, leveraging…
- Oracle (Austin, TX)
- …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI/ML/HPC workloads. This is your chance to be part of the ... AI revolution, creating systems that allow customers...and diagnostic services. These are essential for running distributed AI/ML/HPC workloads across thousands of GPUs, leveraging…
- Oracle (Austin, TX)
- …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI/ML/HPC workloads. This is your chance to be part of the ... AI revolution, creating systems that allow customers...and diagnostic services. These are essential for running distributed AI/ML/HPC workloads across thousands of GPUs, leveraging…
- Meta (Austin, TX)
- …AI product introductions and AI operations initiatives supporting Meta's growing AI/HPC infrastructure for our Family of Apps. They will be responsible ... deliver on shared goals 10. The ideal candidate will have experience in AI/HPC product development and operations, demonstrated experience in the Network…
- Deloitte (Austin, TX)
- …Engineer, Solutions Architect) + 2+ years of experience with GPU computing (CUDA, OpenCL) and HPC system software stack The wage range for this role takes into ... in the cloud or on prem + Adopt best engineering practices in automation, HPC and AI/GenAI infrastructure and design patterns + Define and lead technology…
- Oracle (Austin, TX)
- …network fabric, supporting millions of devices, multi-region interconnects, and high-performance compute (HPC/AI/GPU) environments. + Integrate ML ... Development Team within OCI's Network Availability organization. This team builds the AI, analytics, and automation systems that power OCI's self-healing cloud…
- Oracle (Austin, TX)
- …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI/ML/HPC workloads. This is your chance to be part of the ... AI revolution, creating systems that allow customers...and diagnostic services. These are essential for running distributed AI/ML/HPC workloads across thousands of GPUs, leveraging…
- Oracle (Austin, TX)
- …networking, HPC, or GPU infrastructure. + Expertise in designing data feedback systems that improve AI model performance through continuous learning. + ... a Principal Software Developer (IC4) with deep expertise in AI/ML system design, large-scale data engineering, and...platforms. In this role, you will design and deliver AI-powered systems for predictive incident detection, automated…
- Oracle (Austin, TX)
- …automation, and diagnostic services. These are essential for running distributed AI/ML/HPC workloads across thousands of GPUs, leveraging technologies like ... CPU, Network, Storage with the goal to optimize customer experience and customer workload performance on our AI infrastructure. + Develop "best-in-class" AI…
- Oracle (Austin, TX)
- …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI/ML/HPC workloads. This is your chance to be part of the ... AI revolution, working with systems that allow...scale from tens to thousands of GPUs without compromising performance. Our team is responsible for designing and developing…
- Amazon (Austin, TX)
- …operating AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. Utility Computing (UC) AWS Utility Computing (UC) ... Want to do industry leading work delivering continuous price performance improvements in the cloud for AI...and external customers to understand project requirements and facilitate system development on top of your server design. You will…
- Deloitte (Austin, TX)
- …for a hands-on technologist with deep expertise in HPC systems, GPU-accelerated infrastructure, and large-scale AI deployments-combined with the leadership's ... We are seeking an accomplished HPC/AI Platform Engineering Manager to lead...build CI/CD pipelines for containerized research and production environments. Systems & Automation + Oversee Linux system…
- Oracle (Austin, TX)
- …metrics, logs, eBPF/perf, chaos/failure testing, and SLO-driven operations. Knowledge of AI/HPC workload patterns and their implications for storage, query ... **Job Description** OCI (Oracle Cloud) AI Infrastructure Innovation team is inventing the next...If you thrive at the intersection of large-scale distributed systems, database internals, and cloud platforms, this role offers…
- Google (Austin, TX)
- …equipment utilization and cycle time tracking performance to target, and the system is expected to scale to incorporate AI/ML model applications to drive ... Silicon Manufacturing Decision Systems Lead Google Austin, TX, USA...where the focus is on the manufacturing of complex HPC compute and AI SOC's. The…
- Advanced Micro Devices, Inc (Austin, TX)
- …our mission is to build great products that accelerate next-generation computing experiences-from AI and data centers, to PCs, gaming and embedded systems. ... perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career. PRINCIPAL EPYC CUSTOMER PLATFORM ARCHITECT/SYSTEMS DESIGNER THE ROLE: It's a pivotal role where…
- Oracle (Austin, TX)
- …AMD) and OEM/ODM partners to optimize performance, reliability, and scalability across AI and HPC workloads designing and leading the hardware / server…