- NVIDIA (Santa Clara, CA)
- …building and running private and public clouds at production scale. As part of the DGX Cloud team, you'll have the opportunity to support our customers' journeys ... compute infrastructure and codify reliability best-practices in the broader DGX Cloud platform ecosystem. What you'll be...automate it where the ROI of building and maintaining automation is worth it. + Practice sustainable blameless incident… more
- NVIDIA (Santa Clara, CA)
- …GPU deep learning. What you will be doing: + You will be part of a DGX Cloud team responsible for production systems that enable large scalable GPU clusters to ... competency in managing and automating large-scale distributed systems independent of cloud providers. Advanced hands-on experience and deep understanding of managing… more
- NVIDIA (Santa Clara, CA)
- …the buildout and integration of NCPs and CSPs into this marketplace. As a software engineer on the DGX Cloud Lepton Marketplace team, you'll play a ... One of DGX Cloud's top priorities is to...workflow orchestration, and drive improvements in testing, observability, and automation to ensure high quality, fault-tolerant solutions What we… more
- NVIDIA (Santa Clara, CA)
- Joining NVIDIA's DGX Cloud AI Efficiency Team means contributing to the infrastructure that powers our innovative AI research. This team focuses on optimizing ... to work efficiently with a wide variety of DGXC cloud AI systems as they seek out opportunities for...and build consensus + Passion for "it just works" automation, eliminating repetitive tasks, and enabling team members Ways… more
- NVIDIA (Santa Clara, CA)
- …continuous delivery, and deployment, as well as open-source cloud-enabling technologies like Kubernetes, containers, and virtualization. Their responsibilities ... workloads. Production Engineers at NVIDIA ensure that our internal and external-facing GPU cloud services have reliability and uptime as promised to the users while… more
- NVIDIA (Seattle, WA)
- …a lasting impact on the world. NVIDIA seeks a driven Senior Software Engineer to advance our platform software, using open-source container runtimes and Kubernetes. ... orchestrators such as Kubernetes. + Join the core group working on Cloud Native technologies, improving NVIDIA accelerators in the Kubernetes environment. +… more
- NVIDIA (Santa Clara, CA)
- …that automates GPU asset provisioning, configuration, and lifecycle management across cloud providers. You'll contribute to this platform to build end-to-end ... automation of datacenter operations, break/fix, and lifecycle management for...in architecting and managing large-scale distributed systems, independent of cloud providers. Deep knowledge of datacenter operations and GPU… more
- NVIDIA (Santa Clara, CA)
- …for a passionate member to join our DGX Cloud Engineering Team as a Cloud Software Engineer. In this role, you will play a significant part in helping to ... guide the future of AI & GPUs in the Cloud. NVIDIA DGX Cloud is...features/improvements of existing products. + Drive performance tuning and automation. + Support, maintain, and document software functionality. What… more
- Cisco (San Jose, CA)
- AI Infrastructure Engineer - HPC Apply (https://jobs.cisco.com/jobs/Login?projectId=1443781) + Location: San Jose, California, US + Alternate Location: Anywhere is the ... of a hardware and software engineering team that designs and develops Hybrid-Cloud compute platforms and capabilities that are crucial to keeping Cisco's critical… more
- NVIDIA (Santa Clara, CA)
- …see how you can make a lasting impact on the world. NVIDIA's Cloud data centers host ground-breaking products across high-performance computing to machine learning ... heart of our data centers is the ability to engineer mechanical and electrical designs in close coupling to NVIDIA's industry-leading GPU and DGX products. We are seeking a Data Center Electrical… more
- NVIDIA (Santa Clara, CA)
- …to open-source testing tools or frameworks with strong knowledge of cloud-scale validation, infrastructure automation, or virtualization. + Prior experience ... and data center offerings. If you are a dedicated engineer with a deep understanding of firmware and date...robust, secure, and high-performing solutions for AI, HPC, and cloud-scale systems. You will: + Define End-to-End Test Strategy:… more