- Rivian (Palo Alto, CA)
- …more sustainable for everyone. Role Summary: We are seeking a Senior Site Reliability Engineer (SRE) specializing in Observability to join RivianVW's Data ... Computer Science, Engineering, or equivalent practical experience. Experience: 5+ years in Site Reliability Engineering or a related role with a strong emphasis…
- NVIDIA Corporation (Santa Clara, CA)
- Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... you'll be doing: Design, implement and support operational and reliability aspects of large scale Observability & Telemetry collection platform with a focus on…
- IBM (San Jose, CA)
- …heart of IBM, where growth and innovation thrive. Your role and responsibilities: As a Site Reliability Engineer, you will work in an agile, collaborative ... deploying the latest software updates & fixes. Your primary responsibilities include: 24x7 Observability: Be part of a worldwide team that monitors the health of…
- Rivian (Palo Alto, CA)
- A leading automotive technology firm is seeking a Senior Site Reliability Engineer specializing in Observability to enhance their Data Platform. This ... role involves designing observability systems, collaborating with cross-functional teams, and ensuring the reliability of production environments. The ideal candidate will have…
- Amiri Recruiting (Mountain View, CA)
- Site Reliability Engineer, Onsite - Bay Area, CA. Skills: Relevant Skills and Experience. What You'll Do (Day-to-Day): Own and manage our cloud infrastructure (GCP ... Code (Terraform). Monitor system health and performance using Grafana and other observability tools. Ensure high availability, reliability, and uptime across…
- IBM Computing (San Jose, CA)
- A leading technology company in San Jose is seeking a Site Reliability Engineer to monitor and maintain production systems. The role requires experience in ... candidates will join a fast-paced environment focused on innovation and reliability, contributing to high-performing systems. This position offers the opportunity to…
- IBM Computing (San Jose, CA)
- A leading global technology firm is seeking a Site Reliability Engineer to join their team in San Jose, California. This role involves monitoring production ... systems, troubleshooting issues, and utilizing CI/CD tools for deployment. Candidates should have 1-3 years of experience in system monitoring and automation, as well as proficiency in Linux. The company offers a collaborative environment focused on innovation…
- Replit, Inc. (Foster City, CA)
- A leading software development platform is seeking a Staff Site Reliability Engineer to enhance the reliability and scalability of its infrastructure ... programming skills in Python or Go, along with extensive experience in Site Reliability Engineering and cloud-native technologies like Kubernetes. This full-time…
- Replit, Inc. (Foster City, CA)
- …performance of Replit's infrastructure that serves millions of developers worldwide. As a Staff Site Reliability Engineer, you will bridge the gap between ... by removing traditional barriers to application creation. About the role: Join our Site Reliability Engineering (SRE) team and help ensure the reliability…
- Replit, Inc. (Foster City, CA)
- …and performance of Replit's infrastructure that serves millions of developers worldwide. As a Site Reliability Engineer, you will bridge the gap between ... by removing traditional barriers to application creation. About the role: Join our Site Reliability Engineering team and help ensure the reliability,…
- eBay Inc. (San Jose, CA)
- …the team and the role: The Traffic team is responsible for the reliability, performance, and security of network traffic across our edge and core infrastructure. ... data paths, evolve our kernel and userspace networking stack, and build observability that powers real-time insight and rapid incident response. Our work spans…
- Cisco Systems (San Jose, CA)
- Meet the Team: The Splunk Observability Application Platform Team is a dynamic group of engineers responsible for the core platform powering Splunk Observability ... Cloud. Our platform is the foundation for advanced observability capabilities that enable our customers to thrive. We operate in small, high-performing teams,…
- Develop Health Inc. (Menlo Park, CA)
- …rapidly following a major funding round. About The Role: We're hiring an AI Engineer to take models from prototype to production and drive real clinical impact ... optimizing inputs, retrieval, and context to deliver cutting‑edge performance and reliability. Design and maintain evaluation pipelines that rigorously measure model…
- Menlo Ventures (Mountain View, CA)
- …and performance of our systems, incorporating a blend of Cloud Engineering and Site Reliability Engineering (SRE) practices. This role requires a strong ... Develop infrastructure-as-code using tools such as Terraform, CloudFormation, or similar. Site Reliability Engineering (SRE): Implement SRE practices to ensure…
- Develop Health Inc. (Menlo Park, CA)
- …to bring frontier LLM capabilities into real‑world workflows, ensuring performance, usability, and reliability. Working on‑site in Menlo Park at least three days ... a major funding round. About The Role: We're hiring a Senior Full Stack Engineer to design and ship the systems and interfaces that power our AI‑driven healthcare…
- Cisco Systems (San Jose, CA)
- …and Cisco's global engineering capabilities. Our work spans networking, security, observability, and customer experience - designing and deploying foundation models ... that enhance reliability, strengthen security, prevent downtime, and deliver predictive insights across Splunk Observability, Security, and Platform at enterprise scale. You'll be…
- Cisco Systems (San Jose, CA)
- …distillation, and reinforcement learning to improve model performance, scalability, and reliability. Support the training and fine‑tuning of Large and Small Language ... AI‑based anomaly detection. Familiarity with AI‑driven DevOps automation and model observability. Exposure to edge computing environments. Experience on various AI…
- Kognitos (Mountain View, CA)
- …this hybrid role, you'll operate at the intersection of Developer Productivity and Site Reliability Engineering (SRE). You'll design, implement, and maintain the ... Lead Infrastructure Engineer: We're looking for a Lead Infrastructure ... scale the core systems powering developer velocity and platform reliability at Kognitos. If you're passionate about Terraform, scalable…
- jobr.pro (Mountain View, CA)
- …engineers. Drive adoption of best practices in distributed system design, observability, reliability engineering, and modern DevOps tooling. Cross-Functional ... per week) About the Role: ID.me is seeking a Staff Software Development Engineer to help design and build the next-generation SaaS Commerce Platform and Developer…
- NVIDIA Corporation (Santa Clara, CA)
- Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... engineering. What we need to see: 10+ years of experience in Site Reliability Engineering, Platform Engineering, or Cloud Architect roles. BS degree…