- Cisco (San Jose, CA)
- …distributed tracing initiatives across an organization + Experience with using AI Agents to continually refine observability outcomes + Understanding ... the Team** The DevOps team within Cisco's newly formed AI Software and Platform group designs and operates the...Assistants. Within the DevOps team, our Cloud Platform and Observability group provides all the necessary insights to power… more
- Cisco (Milpitas, CA)
- Senior Full Stack Engineer - Cloud-Native Observability Platform We are the Catalyst Center Platforms and Capabilities team, responsible for delivering scalable, ... innovation. One of our key initiatives is a cloud-native observability platform purpose-built for Cisco Catalyst Center deployments, bridging on-premises network… more
- NVIDIA (Santa Clara, CA)
- …Intelligence: Real-world experience applying model development, RAG, MCP, and Agentic AI technical solutions to the problem of observability data analytics, ... at NVIDIA, you will own the development of DGX Cloud strategy for observability, monitoring, and remediation across all layers of infrastructure, IaaS, platforms and… more
- Google (Sunnyvale, CA)
- Senior Engineering Manager, ML Optimization Tools and Observability **Advanced** Experience owning outcomes and ... of the following: ML performance, debugging, optimization, profiling, or observability. + 5 years of experience in a people... Like Google's own ambitions, the work of a Software Engineer goes beyond just Search. Software Engineering Managers have… more
- NVIDIA (Santa Clara, CA)
- …team and see how you can make a lasting impact on the world. NVIDIA is hiring an AI operations engineer within the Finance AI and Data Science team. You ... business priorities, and knowledge bases. + Monitor & optimize AI systems using observability stacks to track model performance, system health, and lifecycle… more
- Palo Alto Networks (Santa Clara, CA)
- …and developing multi-tiered applications in a rapidly growing company. As a Sr Principal AI Engineer, you will leverage your extensive experience to act as the ... and innovate enterprise-grade full-stack systems with a specific focus on Generative AI transformation. We are looking for a highly hands-on, extremely technical… more
- Palo Alto Networks (Santa Clara, CA)
- …and developing multi-tiered applications in a rapidly growing company. As a Principal AI Engineer, you will leverage your extensive experience to act as ... and innovate enterprise-grade full-stack systems with a specific focus on Generative AI transformation. We are looking for a highly hands-on, extremely technical… more
- Microsoft Corporation (Mountain View, CA)
- …audio, video, and multimodal content. We are looking for a **Principal Software Engineer - Responsible AI** who is passionate about building customer-facing ... world. The **CoreAI organization** at Microsoft builds the end-to-end AI stack and is core to Azure AI... with a focus on high availability, scalability, robustness, and observability. + Lead project development across the organization and… more
- Walmart (Sunnyvale, CA)
- …** Join Walmart Global Tech's Site Reliability Engineering organization as a Distinguished AI/ML Engineer to architect revolutionary agentic AI systems that ... systems, and ML-specific dashboards + Proven ability to implement comprehensive observability solutions for complex AI/ML pipelines and distributed systems… more
- Cisco (San Jose, CA)
- …teams to build and sustain public-facing websites and applications. Leveraging AI-augmented observability and assurance tools, you will proactively monitor and ... CloudFront. + Security and compliance knowledge, including IAM and cloud auditing/monitoring tools. **AI and Observability:** + Experience with AI-powered… more
- ServiceNow, Inc. (Santa Clara, CA)
- It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ... ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500®. Our intelligent… more
- NVIDIA (Santa Clara, CA)
- …make them operational in production? We are seeking a dedicated Cluster Deployment Operations Engineer to support product deployments and issues by collaborating ... HPC and AI clusters (e.g., Prometheus, Grafana, DCGM, and similar observability stacks). + Outstanding written and verbal communication skills, with the ability to… more
- Microsoft Corporation (Mountain View, CA)
- …and build the infrastructure that makes that possible. As a **Machine Learning Operations (MLOps) Engineer**, you'll build the connective tissue between our ... At Microsoft Copilot, we focus on building the best AI-powered products in the world. We're building applied... latency, manage costs, implement intelligent caching, and build the observability needed to maintain reliability at scale + Deployment… more
- ServiceNow, Inc. (Santa Clara, CA)
- …observability, incident response, and service reliability through modern, AI-native workflows. These solutions integrate monitoring, alerting, and automated ... sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how... ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of… more
- ServiceNow, Inc. (Santa Clara, CA)
- …and intelligent operations at scale, tightening the loop between operations and service delivery, all targeting modernized, AI-assisted operations ... sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how... ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of… more
- Deloitte (San Jose, CA)
- …systems (task delegation, coordination, and autonomous decision-making) + Certifications: Azure AI Engineer Associate, Azure Solutions Architect Expert; AWS ... Join our AI & Engineering team in transforming technology platforms,... in enterprise contexts. + Design and build for enterprise-grade operations, embedding observability, monitoring, cost management, and… more
- Deloitte (San Jose, CA)
- …systems (task delegation, coordination, and autonomous decision-making) + Certifications: Azure AI Engineer Associate, Azure Solutions Architect Expert; AWS ... Join our AI & Engineering team in transforming technology platforms,... in enterprise contexts. + Design and build for enterprise-grade operations, embedding observability, monitoring, cost management, and… more
- NVIDIA (Santa Clara, CA)
- …lead the development of DGX Cloud strategy for GPU fleet lifecycle, health, observability and utilization monitoring, and remediation. You will define and drive the ... Architectural Work: define and drive the technical implementation for DGX Cloud operations practice for GPU fleet lifecycle. + Collaborate on Cross Domain… more
- Rubrik (Palo Alto, CA)
- …USD **Join Us in Securing the World's Data** Rubrik (RBRK), the Security and AI Operations Company, leads at the intersection of data protection, cyber ... streaming telemetry, audit trails, and behavioral analytics across thousands of agents. **Engineer scalable, resilient AI platforms:** + Architect and scale… more
- NVIDIA (Santa Clara, CA)
- …+ Experience in production-grade infrastructure, knowing how to take into account observability, resilience and operations. + Enthusiasm for continual learning ... We are currently seeking a senior-level Engineer with distinguished expertise to join the Dynamo... source projects. + Serve as a technical expert in AI inferencing, helping to push the frontier of what's… more