- Cornerstone OnDemand (Dublin, CA)
- …with a focus on designing, implementing, and managing cloud-based solutions. As a Site Reliability Engineer, you will play a key role in ensuring ... We are seeking a highly skilled Site Reliability Engineer with...the availability, performance, and security of our cloud infrastructure. **In this role you will:** + Lead… more
- Palo Alto Networks (Santa Clara, CA)
- …and Alerts Management - Clear understanding of incident and alerts management in Site Reliability Engineering + DevOps/SRE Expertise - 4+ years of experience ... as a DevOps/SRE engineer with a passion for technology and a strong motivation for high reliability at the service level + Cloud Proficiency - High… more
- NVIDIA (Santa Clara, CA)
- Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... SRE at NVIDIA ensures that our internal and external facing GPU cloud services run with maximum reliability and uptime as promised to the users and at the same… more
- NVIDIA (Santa Clara, CA)
- …and drive foundational improvements and automation to improve researchers' productivity. As a Site Reliability Engineer, you are responsible for the big ... and operating large scale compute infrastructure + Proven experience in site reliability engineering for high-performance computing environments with operational… more
- General Motors (Mountain View, CA)
- …+ Participate in on-call engineering duty to support production. + Instill Site Reliability best practice through automation, data insights, and observability ... an OEM - comprehensive control over both in-vehicle and cloud software - to deliver seamless solutions to our...future for generations to come. In this SRE SW Engineer role, you will develop and maintain key elements… more
- MongoDB (San Francisco, CA)
- …office, we provide hybrid work accommodation. **Role Overview** We are seeking a talented Site Reliability Engineer (SRE) Lead with a strong networking ... data platform, MongoDB Atlas, is the only globally distributed, multi-cloud database and is available in more than 115...available in more than 115 regions across AWS, Google Cloud, and Microsoft Azure. Atlas allows customers to build… more
- JPMorgan Chase (Palo Alto, CA)
- …You've discovered the perfect environment to have a major impact. As a **Principal Site Reliability Engineer** at JPMorgan Chase within the **Enterprise ... qualifications, capabilities, and skills** + Formal training or certification on site reliability engineering concepts and 10+ years applied experience.… more
- Palo Alto Networks (Santa Clara, CA)
- …configuration management with a framework such as Terraform, Helm + Experience in Site Reliability Engineering, Production Engineering, or DevOps + Passion for ... large infrastructure and is one of the largest GCP customers. As a Senior Staff DevOps Engineer for the App Services team, you will be part of a team supporting the… more
- Palo Alto Networks (Santa Clara, CA)
- …runs a large hybrid infrastructure and is one of the largest GCP customers. As a Site Reliability Engineer, you will be part of a team supporting the ... Kubernetes cluster with autoscaling enabled + Experience in Production Engineering, DevOps, or Site Reliability + Expertise in the public cloud (GCP or AWS),… more
- Palo Alto Networks (Santa Clara, CA)
- …As a Senior Staff SRE with the Cortex Observability team, you will: + Cloud Expertise: Utilize your expertise in monitoring cloud platforms, particularly GCP, to ... optimize our infrastructure, leveraging cloud-native technologies + Monitoring Expertise: Improve monitoring processes, alerts, and metrics. Work with development… more
- NVIDIA (Santa Clara, CA)
- Join our team in Santa Clara, CA, USA as a Senior Site Reliability Engineer. At NVIDIA, you'll be part of the team shaping the future of computing and ... reports. + Deliver SRE solutions in a globally distributed, multi-cloud hybrid environment - AWS, GCP, and On-prem. +...TCP/IP fundamentals. + Expertise with at least one major cloud service provider - AWS, GCP, Azure. + Demonstrated… more
- ServiceNow, Inc. (Santa Clara, CA)
- …experiences in the future. **As a Senior Staff Machine Learning Engineer - Site Reliability Engineer you will:** + Contribute to the design, development ... It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today -… more
- MongoDB (San Francisco, CA)
- …industry-leading developer data platform, MongoDB Atlas, is the only globally distributed, multi-cloud database and is available in more than 115 regions across AWS, ... Google Cloud, and Microsoft Azure. Atlas allows customers to build...AI-powered applications. We are looking for an experienced Staff Engineer for our SRE, InfraSec team, to guide the… more
- Palo Alto Networks (Santa Clara, CA)
- …forefront of building and maintaining highly reliable, scalable, and secure cloud infrastructure within a FedRAMP compliant environment. You'll drive operational ... This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform stack includes Terraform, Kubernetes, GitLab… more
- Rubrik (Palo Alto, CA)
- …do:** * Deploy and operate security solutions and supporting infrastructure in cloud and datacenter environments in support of internal customer security needs and ... and services with the objective of achieving and exceeding availability and reliability goals * Manage and streamline monitoring systems to enhance observability and… more
- Celonis (Redwood City, CA)
- …and resilience of our platform. The team applies advanced software engineering and Site Reliability Engineering (SRE) principles to drive system reliability, ... + Join a highly technical, collaborative, and innovation-driven team that blends Site Reliability Engineering with modern Software Engineering practices to build… more
- MongoDB (San Francisco, CA)
- …industry-leading developer data platform, MongoDB Atlas, is the only globally distributed, multi-cloud database and is available in more than 115 regions across AWS, ... Google Cloud, and Microsoft Azure. Atlas allows customers to build...to build and run applications anywhere, on premises or across cloud providers. With offices worldwide and over 175,000 new… more
- Amazon (Cupertino, CA)
- …platforms for the world's largest Cloud Services provider. As a Senior Reliability Engineer you will engage with an experienced cross-disciplinary staff to ... Description The Trainium Manufacturing, Quality and Reliability (MQR) Team is part of AWS Annapurna Labs focused on Machine Learning products that designs cutting AI… more
- LinkedIn (Mountain View, CA)
- …role at a high-growth or web-scale technology company. Suggested Skills: Site Reliability Engineering (SRE), Leadership, Large scale infrastructure. LinkedIn is ... with multi-region architecture, capacity planning, and failover strategies in large-scale cloud or hybrid environments. Background in CI/CD, platform reliability,… more
- Celonis (Redwood City, CA)
- …and resilience of our platform. The team applies advanced software engineering and Site Reliability Engineering (SRE) principles to drive system reliability, ... SRE & Software Engineering. + Responsible for the design, implementation, reliability and management of cloud-based FedRAMP-compliant applications and platforms.… more