- Cisco (San Jose, CA)
- AI Infrastructure Engineer - HPC Apply (https://jobs.cisco.com/jobs/Login?projectId=1443781) + Location: San Jose, California, US + Alternate ... and communicate advanced technical concepts. A talented and passionate engineer comfortable working in high-pressure, large-scale enterprise environments. What You…
- NVIDIA (Santa Clara, CA)
- …and intelligence. Make the choice to join us today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design and ... years of experience designing and operating large scale compute infrastructure + Experience with AI/HPC ... are growing fast. If you're a creative and autonomous engineer with real passion for technology, we want to…
- NVIDIA (Santa Clara, CA)
- …and intelligence. Make the choice to join us today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design and ... of distributed storage services. + Design, implement an on-prem AI/HPC infrastructure supplemented with cloud ... are growing fast. If you're a creative and autonomous engineer with real passion for technology, we want to…
- NVIDIA (Santa Clara, CA)
- NVIDIA is looking for a Senior HPC Engineer to join its Infrastructure Specialists team. Academic, commercial and government groups around the world are ... will be doing: + Primary responsibilities will include deploying, managing, and validating AI/HPC infrastructure in Linux-based environments for new and…
- Meta (Menlo Park, CA)
- …and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active member ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially to support ever-increasing use cases of AI. This results in a dramatic…
- Amazon (Cupertino, CA)
- Description We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations - the fundamental ... operations that enable AI to scale across multiple accelerators & servers. Most ... systems is valued, and experience with high-speed networking or HPC interconnects is valued highly. If you like solving…
- NVIDIA (Santa Clara, CA)
- …a Senior Software Engineer to join our mission to continue improving our HPC infrastructure. Our team builds and operates sophisticated infrastructure to ... parallel computing. More recently, GPU deep learning ignited modern AI and enabled the next era of computing. NVIDIA ... to provide better tools to build and manage this infrastructure. Ideal candidate is strong in software development, designing…
- NVIDIA (Santa Clara, CA)
- NVIDIA is the leader in AI, machine learning and datacenter acceleration. NVIDIA is expanding that leadership into datacenter networking with ethernet switches, NICs ... parallel computing. More recently, GPU deep learning ignited modern AI - the next era of computing. NVIDIA is ... diverse team today! As a member of the Hardware Infrastructure Farm team, you will provide leadership in the…
- Amazon (Santa Clara, CA)
- …technologies in a multi-user environment. - High level understanding of the underlying infrastructure platform and resources to run HPC services. - Experience ... C++, Python, CUDA, Bash - Deep GPU knowledge in HPC and/or AI/ML frameworks. Preferred Qualifications - ... the cloud computing delivery model as it relates to HPC. - Knowledge of the underlying infrastructure…
- NVIDIA (Santa Clara, CA)
- …intelligence. Make the choice, join our diverse team today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design and ... You will also be maintaining and building deep learning AI-HPC GPU clusters at scale and supporting ... GPUs cluster. + Deep understanding of GPU computing and AI infrastructure. + Passion for solving complex…
- Deloitte (San Jose, CA)
- …availability in the cloud or on prem + Adopt best engineering practices in automation, HPC and AI/GenAI infrastructure and design patterns + Define and lead ... Senior AI Engineer/Solutions Architect - SFL Scientific ... engineers, project managers, and industry experts to develop robust AI infrastructure and deployment services for our…
- Meta (Menlo Park, CA)
- …learning domains: Distributed ML Training, GPU architecture, ML systems, AI infrastructure, high performance computing, performance optimizations, or ... space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - AI Networking Responsibilities: 1. Tech-leading the…
- NVIDIA (Santa Clara, CA)
- …Product or Technical Marketing. + 7+ years of experience in deep learning engineering, HPC systems, AI infrastructure, or technical evangelism roles. + ... seeking a highly technical and creative Senior Technical Marketing Engineer to join our team to showcase the innovations ... and software unlock performance at scale! + "Dogfood" NVIDIA's AI infrastructure tools like NeMo, Megatron, and…
- Meta (Menlo Park, CA)
- …in exploring, developing and productizing high-performance software and hardware technologies for AI at datacenter scale. Hardware Systems Engineer in RTP work ... and optimize these systems in production. **Required Skills:** Hardware Systems Engineer, AI Systems Responsibilities: 1. Interface with external vendors…
- LinkedIn (Mountain View, CA)
- …practices across the company. Strategic Roadmapping: Influence the long-term roadmap for ML/AI infrastructure, factoring in technology trends, product needs, and ... About the Role We are seeking a Senior Staff Engineer to design, build, and maintain our large-scale GPU infrastructure for machine learning (ML) and AI…
- Meta (Menlo Park, CA)
- …health and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems Responsibilities: 1. Drive interfacing with ... **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP) ... a leading contributor 18. 3+ years of experience supporting AI or HPC systems and/or related systems,…
- Amazon (Cupertino, CA)
- …delivering and operating AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. AWS Infrastructure Services owns the ... Do you want to build the backbone of Generative AI cloud at AWS? Do you want to build ... design, planning, delivery, and operation of all AWS global infrastructure. In other words, we're the people who keep…
- NVIDIA (Santa Clara, CA)
- …field; + 10+ years of full-time industry experience in large-scale MLOps and AI infrastructure; + Proven experience designing and optimizing distributed training ... NVIDIA is searching for a senior or principal engineer who specializes in building cutting-edge infrastructure ... experience at building large-scale LLM and multimodal LLM training infrastructure; + Contributions to popular open-source AI…
- Google (Sunnyvale, CA)
- …software for the Google distributed cloud team. Your work will enable Google Cloud's AI, ML and HPC workloads to run efficiently on NVIDIA GPUs across ... practical experience. + 8 years of experience in Cloud Infrastructure Systems and Distributed Systems architecture, including deployment, scaling, ... on and is growing every day. As a software engineer, you will work on a specific project critical…
- Broadcom (San Jose, CA)
- …solutions, specifically designed to accelerate AI/ML and High-Performance Computing (HPC) workloads. Our products serve as critical infrastructure in ... hyperscale data centers, AI clusters, HPC environments, telecommunications infrastructure, and beyond. In this position, you will collaborate closely with…