- NVIDIA (Santa Clara, CA)
- NVIDIA's Observability team is seeking a Senior/Staff Engineer to compose and build the next-generation, multi-region observability platform. This ... of observability infrastructure with Kubernetes, Terraform, and custom tooling (Go, Python) + Ensuring reliability and cost efficiency of telemetry pipelines…
- NVIDIA (Santa Clara, CA)
- … Observability is at the heart of this transformation. We are looking for a Senior AI & HPC Observability Engineer to design and build the next-generation ... GPU infrastructure. What You Will Be Doing: + Design and implement full-stack observability systems covering metrics, logs, traces, and events for GPU-powered AI and…
- Cisco (Milpitas, CA)
- Senior Full Stack Engineer - Cloud-Native Observability Platform We are the Catalyst Center Platforms and Capabilities team, responsible for delivering ... you would fit right into our team! What You'll Do We're looking for a Senior Software Engineer to take ownership of building and shaping the user experience…
- NVIDIA (Santa Clara, CA)
- …Design, implement and support operational and reliability aspects of large-scale Observability & Telemetry collection platform with a focus on performance at scale, ... design through deployment, operation and refinement + Support services before they go live through activities such as system design consulting, developing software…
- LinkedIn (Mountain View, CA)
- …and service health across all data center and backbone network environments. As a Senior Staff Software Engineer, you will serve as a technical leader driving ... This role will be based in Mountain View, CA. The Network Infrastructure Observability team is responsible for delivering the platforms, tools, and insights that…
- Palo Alto Networks (Santa Clara, CA)
- …most advanced SecOps platform, consisting of XSIAM, XSOAR, and XPANSE. As a Senior DevOps Engineer, you will be responsible for designing, building, and ... engineer who is passionate about automation, cloud infrastructure, observability, and continuous integration/deployment. You will contribute to the evolution of…
- LinkedIn (Mountain View, CA)
- …to optimize their models and deliver the best performance possible. As a Senior Software Engineer, you will have first-hand opportunities to advance one ... performance optimizations across billions of user queries. Model Training Infrastructure: As an engineer on the AI Training Infra team, you will play a crucial role…
- Palo Alto Networks (Santa Clara, CA)
- …multi-tiered applications in a rapidly growing company. As a Sr Principal AI Engineer, you will leverage your extensive experience to act as the trailblazer, helping ... software engineering to design intelligent, agentic workflows that revolutionize our Go-To-Market (GTM) processes. **Your Impact** AI Architecture & Strategy +…
- Capital One (San Jose, CA)
- Senior AI Engineer (AI Foundations, LLM Core and Agentic AI) **Overview:** At Capital One, we are creating responsible and reliable AI systems, changing banking ... agreed upon number of hours to be regularly worked. Cambridge, MA: $158,600 - $181,000 for Senior AI Engineer McLean, VA: $158,600 - $181,000 for Senior AI …
- NVIDIA (Santa Clara, CA)
- We are looking for a Senior AI Infrastructure Engineer (AI Tooling) to design and build the backend systems and infrastructure powering our internal AI tools and ... data pipelines, and developing tools to improve AI reliability, observability, and deployment workflows. This is not a research ... Design, develop, and maintain backend systems and infrastructure using Go and Python to support internal AI tools and…
- MongoDB (Palo Alto, CA)
- **About the Role** We're looking for a Senior Engineer to help build the next-generation inference platform that supports embedding models used for semantic ... - fully integrated with Atlas and designed for developer-first experiences. As a Senior Engineer, you'll focus on building core systems and services that…
- pony.ai (Fremont, CA)
- …globally. Pony.ai went public on NASDAQ in November 2024. Responsibilities As a (Senior) Kubernetes Engineer, you will: + Design, operate, and optimize ... service deployments, security policies, and operational guidelines. + Contribute to observability and SRE practices to ensure reliability at scale (SLOs, incident…
- Rubrik (Palo Alto, CA)
- …agile, nimble, simple, but cohesive Cloud architectures. **About the role:** As a senior systems engineer, you will be responsible for strategic architecture, ... best practices for deployment and troubleshooting. + **Monitoring, Logging & Observability** - Own observability solutions, monitoring platforms, logging tools,…
- NVIDIA (Santa Clara, CA)
- NVIDIA's Deep Learning Safety team is looking for a Senior Software Engineer to build intelligent, autonomous software for the next generation of accelerated ... Our Deep Learning Safety team is looking for a Senior Software Engineer to design and implement ... + Strong programming experience in Python and C++ (or Go/Rust), including experience with async runtimes and API orchestration.…
- General Motors (Mountain View, CA)
- …to solve production problems and reduce operational toil at scale. As a Software Engineer in SRE, you will work across the full lifecycle of services, from design and ... Instrument services with appropriate metrics, logging, and tracing to support observability, proactive issue detection, and data-driven decision making. + Participate…
- Broadcom (Palo Alto, CA)
- …Account, please Sign-In before you apply.** **Job Description:** **Core Kubernetes Software Engineer - VMware Cloud Foundation -** VMware by Broadcom, a leader in ... infrastructure, data center networking, and security, is seeking a Core Kubernetes Software Engineer to join our Common Platform Group in the VMware Cloud Foundation…
- Intuit (Mountain View, CA)
- **Overview** Join Intuit's Platform & Developer Experience (PDX) group as a Senior Staff Software Engineer focused on developing and scaling the API Gateway ... capabilities. + Continuously raise the bar for reliability, performance, observability, and operational excellence. + Mentor engineers across the organization…
- NVIDIA (Santa Clara, CA)
- …crafting, constructing, and maintaining vital systems efficiently and reliably. As a Senior Storage Product Engineer, you will take ownership of NVIDIA's ... + Experience in one or more of the following: C/C++, Java, Python, Go, NodeJS, and Bash for storage automation, monitoring, and performance tuning. + Hands-on…
- NVIDIA (Santa Clara, CA)
- …AI/ML datacenters with NVIDIA GB200, and upcoming GB300 GPUs. NVIDIA seeks a Senior Software Engineer for our CSP (Cloud Service Provider) Engagements team ... roadmap influence. + Familiarity with CI/CD (GitHub Actions, Tekton), observability (Prometheus, OpenTelemetry), and infrastructure-as-code. + Excellent communication, able to…
- General Motors (Sunnyvale, CA)
- …by prioritizing high-impact, ML-centric use cases. **About the Role:** We are seeking a Senior ML Infrastructure Engineer to help build and scale robust Compute ... and auto-scaling mechanisms. + Drive the development of monitoring, observability, and metrics to ensure reliability, performance, and resource optimization.…