- Cornerstone OnDemand (Dublin, CA)
- …with a focus on designing, implementing, and managing cloud-based solutions. As a Site Reliability Engineer, you will play a key role in ensuring ... We are seeking a highly skilled Site Reliability Engineer with...the availability, performance, and security of our cloud infrastructure. **In this role you will:** + Lead…
- Cornerstone OnDemand (Dublin, CA)
- …with a focus on designing, implementing, and managing cloud-based solutions. As a Site Reliability Engineer, you will play a key role in ensuring ... We are seeking a highly skilled **Site Reliability Engineer** with...the availability, performance, and security of our cloud infrastructure. **In this role you will:** + Lead…
- Walmart (Sunnyvale, CA)
- …through our high-performance checkout services running in Edge and Cloud. As a Site Reliability Engineer in the CPC Team, you will work with L2, Other ... and high impact problems. This role is part of the Cloud Powered Checkout team and will build the next...example, probability of failure, frequency of failure) to measure site reliability. Monitors site…
- NVIDIA (Santa Clara, CA)
- Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... SRE at NVIDIA ensures that our internal and external facing GPU cloud services run with maximum reliability and uptime as promised to the users and at the same…
- NVIDIA (Santa Clara, CA)
- …and drive foundational improvements and automation to improve researchers' productivity. As a Site Reliability Engineer, you are responsible for the big ... and operating large scale compute infrastructure + Proven experience in site reliability engineering for high-performance computing environments with operational…
- Lockheed Martin (Sunnyvale, CA)
- **Description:** As a Site Reliability Engineer, you will: * Design, implement, and maintain highly available and scalable systems and infrastructure to ... security and integrity of the classified system **Basic Qualifications:** * Experience in site reliability engineering, DevOps, or a related field, with a focus…
- NVIDIA (Santa Clara, CA)
- …culture? If so, we have a great opportunity for you! NVIDIA is seeking a Senior Site Reliability Engineer (SRE) for the Data Science & ML Platform(s) team. ... strong background in SRE practices, systems, networking, coding, capacity management, cloud operations, continuous delivery and deployment, and open-source cloud…
- General Motors (Mountain View, CA)
- …+ Participate in on-call engineering duty to support production. + Instill Site Reliability best practice through automation, data insights, and observability ... an OEM - comprehensive control over both in-vehicle and cloud software - to deliver seamless solutions to our...future for generations to come. In this SRE SW Engineer role, you will develop and maintain key elements…
- Palo Alto Networks (Santa Clara, CA)
- …configuration management with a framework such as Terraform, Helm + Experience in Site Reliability Engineering, Production Engineering, or DevOps + Passion for ... large infrastructure and is one of the largest GCP customers. As a Senior Staff DevOps Engineer for the App Services team, you will be part of a team supporting the…
- NVIDIA (Santa Clara, CA)
- Join our team in Santa Clara, CA, USA as a Senior Site Reliability Engineer. At NVIDIA, you'll be part of the team shaping the future of computing and ... reports. + Deliver SRE solutions in a globally distributed, multi-cloud hybrid environment - AWS, GCP, and On-prem. + ...TCP/IP fundamentals. + Expertise with at least one major cloud service provider - AWS, GCP, Azure. + Demonstrated…
- Google (Sunnyvale, CA)
- …+ Excellent problem-solving skills for monitoring and troubleshooting serving systems. Site Reliability Engineering (SRE) combines software and systems ... large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services - both our internally critical and our externally-visible systems - have…
- Palo Alto Networks (Santa Clara, CA)
- …forefront of building and maintaining highly reliable, scalable, and secure cloud infrastructure within a FedRAMP compliant environment. You'll drive operational ... This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform stack includes Terraform, Kubernetes, GitLab…
- Rubrik (Palo Alto, CA)
- …do:** * Deploy and operate security solutions and supporting infrastructure in cloud and datacenter environments in support of internal customer security needs and ... and services with the objective of achieving and exceeding availability and reliability goals * Manage and streamline monitoring systems to enhance observability and…
- Amazon (Cupertino, CA)
- …of all AWS global infrastructure. In other words, we're the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, ... technologies. - You will have a fundamental understanding of Reliability statistics/Reliability tests and/or solid understanding of...In other words, we're the people who keep the cloud running. We support all AWS data centers and…
- Celonis (Redwood City, CA)
- …and resilience of our platform. The team applies advanced software engineering and Site Reliability Engineering (SRE) principles to drive system reliability, ... + Join a highly technical, collaborative, and innovation-driven team that blends Site Reliability Engineering with modern Software Engineering practices to build…
- Amazon (Cupertino, CA)
- …platforms for the world's largest Cloud Services provider. As a Senior Reliability Engineer you will engage with an experienced cross-disciplinary staff to ... Description The Trainium Manufacturing, Quality and Reliability (MQR) Team is part of AWS Annapurna Labs focused on Machine Learning products that designs cutting AI…
- LinkedIn (Mountain View, CA)
- …role at a high-growth or web-scale technology company. Suggested Skills: Site Reliability Engineering (SRE), Leadership, Large scale infrastructure. LinkedIn is ... with multi-region architecture, capacity planning, and failover strategies in large-scale cloud or hybrid environments. Background in CI/CD, platform reliability,…
- Celonis (Redwood City, CA)
- …and resilience of our platform. The team applies advanced software engineering and Site Reliability Engineering (SRE) principles to drive system reliability, ... SRE & Software Engineering. + Responsible for the design, implementation, reliability and management of cloud-based FedRAMP-compliant applications and platforms.…
- Google (Sunnyvale, CA)
- …and technologies. + Experience in building large-scale operations capabilities in Site Reliability Engineering. Google Cloud's software engineers ... on and is growing every day. As a software engineer, you will work on a specific project critical to Google Cloud's needs with opportunities to switch teams and projects…
- Nutanix (San Jose, CA)
- **Principal Engineer - Nutanix Cloud Manager (NCM)** **Hungry, Humble, Honest, with Heart.** **The Opportunity** About Nutanix Cloud Manager (NCM) Nutanix ... vision, empowering enterprises to manage, optimize, and secure their cloud environments through one integrated platform. NCM simplifies application lifecycle…