- NVIDIA (Santa Clara, CA)
- …the first people to make them operational in production? We are seeking a dedicated Cluster Deployment Operations Engineer to support product deployments ... team, acting as the link between engineering and the NVIS field team for cluster deployment and management solutions! We bridge the gap between product roadmaps…
- NVIDIA (Santa Clara, CA)
- …the world's most advanced computing workloads. NVIDIA is looking for an AI/ML HPC Cluster Engineer to join our MARS team. You will provide technical engagement ... and problem solving on the management of large-scale HPC systems including the deployment of compute, networking, and storage. You will be working with a team of…
- Cadence Design Systems, Inc. (San Jose, CA)
- …world of technology. We are seeking a highly skilled and experienced AI Systems Engineer to join our team. This is a hands-on, senior individual contributor role ... that will be pivotal in leading the development, operations, and support of our entire AI infrastructure. You...services on both GCP and Azure. + Hands-on GPU Cluster Management: Take a leadership role in the configuration,…
- NVIDIA (Santa Clara, CA)
- …operations, and networking, familiarity with software testing and deployment, familiarity with distributed systems, and excellent communication and planning ... management systems (Kubernetes, SLURM). Hands-on experience in Machine Learning Operations. Hands-on experience with Bright Cluster Manager. + Hands-on…
- NVIDIA (Santa Clara, CA)
- …Artifactory, Jira) in hybrid on-premise and cloud environments. + Assist with cluster operations and system administration (managing: servers, team accounts, ... dedicated and motivated senior build and continuous integration (CI/CD) engineer for its GenAI Frameworks (Megatron-LM (https://github.com/NVIDIA/Megatron-LM) and NeMo…
- NVIDIA (Santa Clara, CA)
- We are looking for Senior Software Development Engineer in Test (SDET) to join our New GPU Integration (NPI) team for NVIDIA's Enterprise Compute SWQA team. Are you ... to have your skills on the team! As an engineer on this New Platform GPU Integration team, you...tools to significantly enhance our testing capabilities and streamlining operations for more efficient and accurate results. + Improve…
- NVIDIA (Santa Clara, CA)
- …experience. Ways to stand out from the crowd: + Knowledge of cloud and cluster level deployment and management systems. + Experience with GPU computing (CUDA), ... NVIDIA is seeking a Senior Firmware Engineer to join our CSP Engagements team, focusing...hardware and software, driving technical solutions from concept through deployment. What you will be doing: + Design and…
- Meta (Fremont, CA)
- …Introduction (NPI) Roadmap success. As the Network is essential to every mega cluster and Meta's data center success, its roadmap is rapidly expanding in volume, ... Engineering, Legal, Finance, Accounting, Compliance) across multiple domains. The NPI Operations Organization supports complex roadmaps to deliver NPI programs on…