Site Reliability Engineer Observability Jobs in Sunnyvale, CA

53 jobs (page 1)


Site Reliability Engineer…

Rivian (Palo Alto, CA)

…more sustainable for everyone. Role Summary We are seeking a Senior Site Reliability Engineer (SRE) specializing in Observability to join RivianVW's Data ... Computer Science, Engineering, or equivalent practical experience. Experience: 5+ years in Site Reliability Engineering or a related role with a strong emphasis… more

job goal (01/12/26)
Senior Site Reliability…

NVIDIA Corporation (Santa Clara, CA)

Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... you'll be doing: Design, implement and support operational and reliability aspects of large scale Observability & Telemetry collection platform with a focus on… more

job goal (01/12/26)
Senior SRE - Observability & Telemetry…

Rivian (Palo Alto, CA)

A leading automotive technology firm is seeking a Senior Site Reliability Engineer specializing in Observability to enhance their Data Platform. This ... role involves designing observability systems, collaborating with cross-functional teams, and ensuring the reliability of production environments. The ideal candidate will have… more

job goal (01/12/26)
Senior Site Reliability…

Promote Project (Santa Clara, CA)

Senior Site Reliability Engineer ML Platforms Location 60000 - 135000 a year (s) Description Are you passionate about building and maintaining large-scale ... culture? If so, we have a great opportunity for you! NVIDIA is seeking a Senior Site Reliability Engineer (SRE) for the Data Science & ML Platform(s) team.… more

job goal (01/12/26)
Principal Staff Site Reliability…

NVIDIA Corporation (Santa Clara, CA)

…NTP/PTP, DHCP, and LDAP. This includes building for performance and reliability at global scale, covering automation, monitoring, high availability, capacity ... optimizations (SR-IOV/DPU) * Experience with technologies like eBPF and XDP for Observability & DDoS mitigation * Collect and review system data for capacity and… more

job goal (01/12/26)
Senior Observability & Telemetry Platform…

NVIDIA Corporation (Santa Clara, CA)

A leading technology company is seeking a Site Reliability Engineer to design and maintain large-scale production systems with a focus on observability ... and reliability. The ideal candidate will have 5+ years of experience in infrastructure automation and distributed systems, along with strong skills in Python and cloud technologies like Kubernetes. This position offers a competitive salary based on… more

job goal (01/12/26)
AI-Powered Cloud SRE: 24x7 Observability…

IBM Computing (San Jose, CA)

A leading global technology firm is seeking a Site Reliability Engineer to join their team in San Jose, California. This role involves monitoring production ... systems, troubleshooting issues, and utilizing CI/CD tools for deployment. Candidates should have 1-3 years of experience in system monitoring and automation, as well as proficiency in Linux. The company offers a collaborative environment focused on innovation… more

job goal (01/12/26)
Lead Infrastructure Engineer

Kognitos (Mountain View, CA)

…this hybrid role, you'll operate at the intersection of Developer Productivity and Site Reliability Engineering (SRE). You'll design, implement, and maintain the ... Lead Infrastructure Engineer We're looking for a Lead Infrastructure ... scale the core systems powering developer velocity and platform reliability at Kognitos. If you're passionate about Terraform, scalable… more

job goal (01/12/26)
Application Platform Staff Engineer

Cisco Systems (San Jose, CA)

Meet the Team The Splunk Observability Application Platform Team is a dynamic group of engineers responsible for the core platform powering Splunk Observability ... Cloud. Our platform is the foundation for advanced observability capabilities that enable our customers to thrive. We operate in small, high-performing teams,… more

job goal (01/12/26)
Staff Software Engineer - Shopping Graph

jobr.pro (Mountain View, CA)

…engineers. Drive adoption of best practices in distributed system design, observability, reliability engineering, and modern DevOps tooling. Cross-Functional ... per week) About the Role ID.me is seeking a Staff Software Development Engineer to help design and build the next-generation SaaS Commerce Platform and Developer… more

job goal (01/12/26)
Software Engineer (Backend)

Kloudfuse (Cupertino, CA)

…and led by the leading investors in Silicon Valley. Kloudfuse AI takes observability data to the next level. Our HawkEye and BullsEye analytics engine generates ... architecting and developing production web-scale systems (monitoring, telemetry, performance, reliability, triage and debug) Experience building and maintaining large… more

job goal (01/12/26)
Senior Research Engineer Manager

Cisco Systems (San Jose, CA)

…and Cisco's global engineering capabilities. Our work spans networking, security, observability, and customer experience - designing and deploying foundation models ... that enhance reliability, strengthen security, prevent downtime, and deliver predictive insights across Splunk Observability, Security, and Platform at enterprise scale. You'll be… more

job goal (01/12/26)
AI Engineer

Develop Health Inc. (Menlo Park, CA)

…rapidly following a major funding round. About The Role: We're hiring an AI Engineer to take models from prototype to production and drive real clinical impact ... optimizing inputs, retrieval, and context to deliver cutting‑edge performance and reliability. Design and maintain evaluation pipelines that rigorously measure model… more

job goal (01/12/26)
Senior Full-Stack Engineer

Develop Health Inc. (Menlo Park, CA)

…to bring frontier LLM capabilities into real‑world workflows, ensuring performance, usability, and reliability. Working on‑site in Menlo Park at least three days ... a major funding round. About The Role: We're hiring a Senior Full Stack Engineer to design and ship the systems and interfaces that power our AI‑driven healthcare… more

job goal (01/12/26)
Senior Engineer, Infrastructure

Athena LLC. (Palo Alto, CA)

…with expertise in Google Cloud Platform (GCP), development tools, distributed systems, Site Reliability Engineering (SRE), observability, and DevOps ... **SRE and Observability:** Implement SRE best practices to enhance system reliability and performance, and build observability into the infrastructure to… more

job goal (01/12/26)
System Software Engineer

Pantera Capital (Palo Alto, CA)

…and accurately share knowledge with their teammates. About the Role As a Data Center Site Reliability Engineer (SRE) at xAI, you will play a pivotal ... or a related technical field (or equivalent experience). 5+ years in site reliability engineering, data center operations, or large‑scale infrastructure… more

job goal (01/12/26)
SayPro Senior Infrastructure Engineer

SayPro (Palo Alto, CA)

…(physical, virtual and cloud) (A, I) Proven track record delivering high‑availability, multi‑site architectures (A, I) Monitoring and observability using Azure ... summary About the Role We're looking for a highly skilled Senior Infrastructure Engineer to design, implement, and support the core systems and services that drive… more

job goal (01/12/26)
Staff Engineer, Infrastructure

Athena LLC. (Palo Alto, CA)

…in cloud platforms along with a deep understanding of distributed systems, site reliability engineering (SRE), observability, and DevOps ... trained assistants leveraging highly trained AI. **Role Overview** The Infrastructure Staff Engineer will spearhead Athena's infrastructure initiatives, focusing on creating… more

job goal (01/12/26)
Senior Backend Engineer: Distributed…

Neara (Palo Alto, CA)

Job type: Full Time Department: Backend Engineer Work type: On-Site About Archetype AI Archetype AI is developing the world's first AI platform to bring AI into ... jobsarchetypeaiio. About the Role We're looking for a highly motivated backend engineer with a passion for building performant, scalable, and resilient distributed… more

job goal (01/12/26)
Backend Engineer - Distributed Systems

Neara (Palo Alto, CA)

Job type: Full Time. Department: Backend Engineer. Work type: On-Site About Archetype AI Archetype AI is developing the world's first AI platform to bring AI ... io. About the Role We're looking for a highly motivated backend engineer with a passion for building performant, scalable, and resilient distributed systems.… more

job goal (01/12/26)
