- Amazon (Santa Clara, CA)
- …technologies in a multi-user environment. - High level understanding of the underlying infrastructure platform and resources to run HPC services. - Experience ... the cloud computing delivery model as it relates to HPC. - Knowledge of the underlying infrastructure ... in a customer-facing, sales-aligned role such as consultant, solutions engineer or solutions architect. - Track record of implementing…
- Oracle (Santa Clara, CA)
- …debug software programs for databases, applications, tools, networks etc. As an AI/ML Infrastructure Engineer on the GPU Strategic Customers Engineering team, you ... Rust, Go, Java, or Scala + Proven experience designing, implementing, and managing infrastructure for AI/ML or HPC workloads. + Understanding machine learning…
- Google (Sunnyvale, CA)
- Staff Quality and Reliability Engineer, Google Cloud. Google, Sunnyvale, CA, USA. **Advanced**. Experience owning outcomes and decision ... and its integration within AI/ML-driven systems. As a Quality and Reliability Engineer for Google Cloud, you will lead the development of Design-for-Reliability…
- SLAC National Accelerator Laboratory (Menlo Park, CA)
- …opportunity to work on challenges that push technological boundaries while mentoring junior staff and guiding the evolution of our HPC capabilities. Your ... Senior High Performance Computing Engineer Job ID 6383 Location SLAC - Menlo ... role in managing and optimizing our High Performance Computing (HPC) environment in support of these groundbreaking scientific projects.…
- Microsoft Corporation (Mountain View, CA)
- **Overview** Help build the infrastructure that powers training, evaluation, and data platforms for reliable deployment of world-class foundational AI models. We are ... across engineering and research to design, evolve, and operate core research infrastructure, so that product teams can train faster, evaluate more rigorously, and…
- Amazon (Cupertino, CA)
- Description We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations - the fundamental ... systems is valued, and experience with high-speed networking or HPC interconnects is valued highly. If you like solving ... software components that are critical building blocks for EC2 infrastructure. Every instance in EC2 is running some type…
- NVIDIA (Santa Clara, CA)
- NVIDIA's Observability team is seeking a Senior/Staff Engineer to compose and build the next-generation, multi-region observability platform. This platform ... and cost efficiency of telemetry pipelines while supporting high-volume workloads (AI/ML, HPC clusters, GPU infrastructure) + Embedding security guidelines into…
- Amazon (Cupertino, CA)
- …AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. AWS Infrastructure Services owns the design, planning, delivery, and ... operation of all AWS global infrastructure. In other words, we're the people who keep ... work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions;…
- Amazon (Cupertino, CA)
- …next-generation infrastructure that powers breakthrough innovation in AI/ML and HPC workloads. If you're passionate about pushing the limits of performance, ... Experience taking a leading role in building complex software or computing infrastructure that has been successfully delivered to customers - Experience debugging,…
- Amazon (Cupertino, CA)
- …next-generation infrastructure that powers breakthrough innovation in AI/ML and HPC workloads. If you're passionate about pushing the limits of performance, ... include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate…