- Zscaler (San Jose, CA)
- …+ 8+ years of experience working in infrastructure operations, DevOps, or site reliability roles + Demonstrated expertise in system observability, including ... speed and agility with a cloud-first strategy. We are seeking an experienced Senior Staff Infrastructure Operations Engineer to join our team. This critical… more
- NVIDIA (Santa Clara, CA)
- Join our team in Santa Clara, CA, USA as a Senior Site Reliability Engineer. At NVIDIA, you'll be part of the team shaping the future of computing and ... guaranteeing the smooth operation of our brand-new technologies. Our mission is to leverage AI's power to build outstanding and pioneering solutions that have a significant impact on the world. What you'll be doing: + Own the solutions you build, collaborating… more
- NVIDIA (Santa Clara, CA)
- NVIDIA is looking for a Senior Site Reliability Engineer to work in IPP (Infrastructure, Planning and Process). IPP is a global organization within ... NVIDIA. This group works with various other groups within NVIDIA Software such as Graphics Processors, Mobile Processors, Deep Learning, Artificial Intelligence and Driverless Cars to cater to their infrastructure needs. These cloud services provide almost… more
- NVIDIA (Santa Clara, CA)
- …us accelerate the next wave of artificial intelligence. Join our team at NVIDIA as a Senior Site Reliability Engineer focused on HPC storage and play ... a crucial role in designing, implementing, and optimizing on-prem High-Performance Computing (HPC) storage solutions while harnessing the power of cloud computing. You will be responsible for crafting and deploying distributed storage solutions, build… more
- Tarana Wireless (Milpitas, CA)
- …internet speeds worldwide, bridging the digital divide in ways previously thought impossible. As a Senior Site Reliability Engineer, you will help us ... manage software that runs on the cloud and remotely manages millions of radio devices. You will work on a team, be a main point of contact during offshore hours, and be responsible for all aspects of cloud operations, such as: + Infrastructure as Code + Manage… more
- NVIDIA (Santa Clara, CA)
- …secure production environments. We are seeking a deeply skilled Senior Staff Site Reliability Engineer (SRE) to advance our enterprise security ... position requires a strong software engineering background, but focuses on reliability, scalability, and operational excellence. A strong candidate excels in… more
- Google (Sunnyvale, CA)
- Senior Staff Software Engineer, Site Reliability Engineering | Google | Sunnyvale, CA, USA; New York, NY, USA | **Advanced** ... Reliability Engineering (https://landing.google.com/sre/book.html) or read a career profile (https://careers.google.com/stories/site-reliability-engineering-profile-google/) about why a Software Engineer… more
- Google (Sunnyvale, CA)
- Senior Systems Engineer, Site Reliability Engineering, Google Cloud | Google | Sunnyvale, CA, USA; New York, NY, USA | **Mid** ... + Master's degree in Computer Science or Engineering. **About the job** Site Reliability Engineering (SRE) combines software and systems engineering to… more
- NVIDIA (Santa Clara, CA)
- …We take great pride in providing excellent, comprehensive support to our customers! Sr Site Reliability Engineer in this role will significantly impact and ... in Computer Science or related field. + 8+ years of experience in site reliability engineering and/or software development roles. + Fluency in Python + In-depth… more
- NVIDIA (Santa Clara, CA)
- Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... internal and external facing GPU cloud services run maximum reliability and uptime as promised to the users and...be doing: + Design, implement and support operational and reliability aspects of large scale Kubernetes clusters with focus… more
- Google (Sunnyvale, CA)
- Senior Customer Reliability Engineer, Reliability Incident Management | Google | New York, NY, USA; Austin, TX, USA; +2 more; +1 ... years of experience in a technical role such as Site Reliability Engineering, Technical Solutions Engineering, or ... to connect with customers, employees and partners. As a Senior Customer Reliability Engineer, you… more
- NVIDIA (Santa Clara, CA)
- …NTP/PTP, DHCP, and LDAP. This includes building for performance and reliability at global scale, covering automation, monitoring, high availability, capacity ... data for reporting, alerting, monitoring. + Collaborate with NVIDIA leadership, senior engineers, program managers, and product managers to develop compelling IT… more
- NVIDIA (Santa Clara, CA)
- GeForce Now is looking for a Manager, Network Site Reliability Engineer (SRE) to enhance our network infrastructure and operations. We are looking for a ... be doing: + Cultivate a top-performing team of Network Site Reliability Engineers through encouraging a culture...Artificial Intelligence, and Autonomous Vehicles. If you're a creative engineer who enjoys autonomy and shares our passion for… more
- Insight Global (Santa Clara, CA)
- …fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced crew ... that develops and maintains sophisticated internal cloud provisioning products. The team works with various other business units such as Graphics Processors, Mobile Processors, Deep Learning, Artificial Intelligence and Driverless Cars to cater to their… more
- Amazon (Cupertino, CA)
- …designs cutting AI platforms for the world's largest Cloud Services provider. As a Senior Reliability Engineer you will engage with an experienced ... Description The Trainium Manufacturing, Quality and Reliability (MQR) Team is part of AWS Annapurna Labs focused on Machine Learning products that… more
- Palo Alto Networks (Santa Clara, CA)
- …complement their technical skills. You will work with a team of senior-level IaC Automation Engineers leading projects designing, implementing, and maintaining PANW's ... global compute infrastructure. **Your Impact** + Design, implement and provide support for IT infrastructure compute components + Install, support and maintain software infrastructure according to best practices, including routers, load balancers, switches,… more
- LinkedIn (Mountain View, CA)
- …equivalent role at a high-growth or web-scale technology company Suggested Skills + Site Reliability Engineering (SRE) + Leadership + Large scale infrastructure ... in Sunnyvale, CA or San Francisco, CA. **Responsibilities** + Serve as a senior technical leader driving the long-term reliability and observability strategy… more
- Walmart (Sunnyvale, CA)
- …of orders daily through our high-performance checkout services running in Edge and Cloud. As a Site Reliability Engineer in the CPC Team, you will work with ... criteria (for example, probability of failure, frequency of failure) to measure site reliability. Monitors site reliability conditions and new… more
- Google (Sunnyvale, CA)
- Senior Data Center Electrical Engineer, Special Projects | Google | Austin, TX, USA; Midlothian, TX, USA; +11 more; +10 more ... new builds, infrastructure upgrades, and renovations. + Collaborate effectively with the Engineer of Record (EOR) to respond to site-specific engineering… more
- Google (Sunnyvale, CA)
- Senior Product Engineer, Machine Learning Accelerators | Google | Sunnyvale, CA, USA | **Mid** Experience driving progress, solving problems, ... insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include Googlers, Google Cloud customers, and… more