- EPAM Systems (San Francisco, CA)
- EPAM is hiring a **Remote Lead Site Reliability Engineer**. If you are looking for a high-impact, exciting role with a company that leads the globe in the ... of SRE, mentor and train other engineers around proactive reliability decision making and planning + Review code instrumentation...on SLIs/SLOs **Requirements** + 5+ years of SRE or Systems Engineering experience + 2+ years as team lead… more
- Mastercard (San Francisco, CA)
- …decisions, drives innovation and delivers better business results. **Title and Summary** Lead Engineer - Test Automation, Pipelines and Site Reliability Lead ... Engineer - Test Automation, Pipelines and Site Reliability Overview: Join our mission-driven team building the future...and SRE Lead, you will lead projects to enable reliability, security, and velocity on our Priceless Platform… more
- Rubrik (Palo Alto, CA)
- …services run smoothly and have the capacity for future growth. As a Senior Site Reliability Engineer, you will be responsible for: + Ensure we maintain high ... , availability and efficiency improvements to Rubrik's Polaris Cloud Platform + Good mix of software and system...years of experience as a Development, DevOps or Site Reliability Engineer Willing to provide 24/7 coverage… more
- Rubrik (Palo Alto, CA)
- …an impact on product stability and success. **What you'll do:** As a Senior Site Reliability Engineer, you will be responsible for: + Manage and run backend ... , availability and efficiency improvements to Rubrik's Polaris Cloud Platform + Good mix of software and system...years of experience as a Development, DevOps or Site Reliability Engineer + Willing to provide 24/7… more
- Cisco (San Francisco, CA)
- …performance, change management, capacity planning, monitoring and emergency response. As a Site Reliability Engineer on the team, you will focus on helping the ... cloud applications to users. Our Internet and cloud intelligence platform is like a 'Google Maps of the Internet',...leveraged in our context. * Good understanding of Unix/Linux systems, the kernel, system libraries, file … more
- Splunk (San Francisco, CA)
- …on the Azure cloud platform. + You enjoy building and running distributed systems at scale in production. You understand the challenges and trade-offs to be made ... _Cloud_ Services group is looking for a _Site Reliability_ Engineer to help lead, design and build the next...systems to production. + Deep understanding of Linux systems (network stack, file system, OS services)… more
- Cisco (San Francisco, CA)
- …availability, performance, expansion, strategic direction, monitoring, and emergency response. As a Site Reliability Engineer on the team, you will focus on all ... The team is tasked with building and running our platform's global agent infrastructure, with a focus on aspects...to, AWS. * Possess a solid understanding of Unix/Linux systems, the kernel, system libraries, file … more
- General Motors (Palo Alto, CA)
- …exciting journey toward a better future. **Responsibilities:** + Lead Site Reliability engineering effort to improve anomaly detection, platform stability ... get to do in this role:** + Implement scalable, reliable, secure SRE and Observability platform to monitor health of our production system and provide a holistic… more
- EPAM Systems (San Francisco, CA)
- …a candidate to join a multi-functional SRE team with the focus on Google Cloud Platform. You should have cloud engineering experience in such areas acting as the SME ... operation automation and monitoring, identifying TOIL within the team's existing systems and processes, and recommending and implementing automated solutions to… more
- Cisco (San Francisco, CA)
- …build and implement scalable and well-tested solutions. * Good understanding of Unix/Linux systems, the kernel, system libraries, file systems, and ... become a black box they can't understand. Our Internet and cloud intelligence platform delivers the only collectively powered view of the Internet, cloud and SaaS… more
- Lacework (Mountain View, CA)
- …Security Platform, the world's best real-time cloud-native threat detection system. Our team develops and supports services that perform automated operations in ... best practices alongside engineering/operations teams to improve the scalability and reliability of internal processes. + Participate in an on-call rotation. Your… more
- Fastly (San Francisco, CA)
- …support for Fastly's Compute Platform. We are looking for a Senior Principal Engineer who thrives on designing and navigating the path from today's systems ... connected with the things they love. Fastly's edge cloud platform enables customers to create great digital experiences quickly,...For:** + 10+ years of hands-on experience as an engineer building complex, reliable, and highly efficient systems… more
- Zoom (San Jose, CA)
- …clusters within different infrastructures. You will also design and implement reliability best practices to accomplish a highly available service (99.99%). ... issues and set up CI/CD pipelines with version control systems. Finally, you will monitor and resolve issues in...together. We set out to build the best collaboration platform for the enterprise, and today help people communicate… more
- Fastly (San Francisco, CA)
- …stay better connected with the things they love. Fastly's edge cloud platform enables customers to create great digital experiences quickly, securely, and reliably ... possible - at the edge of the Internet. The platform is designed to take advantage of the modern...services + Analyze and improve telemetry collection, instrumentation, monitoring systems, and dashboards + Participate in incident reviews to… more
- Zoom (San Jose, CA)
- …to production. You will design, develop, deploy, monitor, and scale DevOps Platform Services. You will own and create documentation of our Disaster Recovery, ... be technical yourself in order to drive meaningful improvements to our production software systems. What we're looking for + Experience in SRE or DevOps (at least 8+… more
- Rubrik (Palo Alto, CA)
- …data management platform for the largest enterprises in the world. As an engineer on the RSC Platform team, you'll be working on the fundamental ... think about reliability at every layer of the stack, from individual infrastructure systems to the reliability of services owned by product teams at Rubrik.… more
- Meta (Fremont, CA)
- …required to be effective in this role. **Required Skills:** SiteOps Global Production Platform Engineer Responsibilities: 1. Serve as data center operations SME ... global data center industry, both in design and operations. Our Production Platform Engineers are responsible for the operational performance of the (compute,… more
- Amazon (East Palo Alto, CA)
- …with upper-management and clear opportunities for career advancement. As a Senior Software Engineer on the RDS Platform team with a focus on security, ... Amazon RDS team is looking for a Software Development Engineer to lead the RDS Security to innovate new...5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems… more
- Amazon (East Palo Alto, CA)
- …a dynamic environment that values creativity, collaboration, and continuous learning. As a Software Engineer on the RDS Platform team with a focus on security, ... we're reshaping the future of cloud computing. Our RDS Platform team is dedicated to building secure and scalable...2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems… more
- DoorDash (San Francisco, CA)
- …and reliable data to drive many business and product decisions. The Data Platform owns all the infrastructure necessary to run an operationally efficient analytical ... of major focus are Machine Learning Infrastructure and workflow, Experimentation Platform, Knowledge Graphs and various Data Science and Analytics related tooling.… more