- Charles Schwab Corporation (San Francisco, CA)
- …and explore next‑generation GenAI efforts that will redefine how we serve our clients. As a Senior AI Site Reliability Engineer on AI.x, you will ... your career in one of the most exciting areas of technology today. As a Senior AI Site Reliability Engineer, you will design, implement, and manage the…
- Boson AI (Palo Alto, CA)
- About The Role: We're looking for a Senior Site Reliability Engineer to help us run one of the most exciting GPU clusters around: our Toronto datacenter packed ... with NVIDIA H100 and A100 GPUs, over 20 PB of Ceph storage, terabit networking, and hundreds of servers. You'll be hands‑on with the full lifecycle of HPC infrastructure: planning, building, testing, deploying, and keeping everything running smoothly. That…
- IBM (San Jose, CA)
- …heart of IBM, where growth and innovation thrive. Your role and responsibilities: As a Site Reliability Engineer, you will work in an agile, collaborative ... curious, we are a team dedicated to creating the world's leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless…
- Bloom Energy (San Jose, CA)
- Staff Reliability Engineer. Locations: San Jose, California. Time type: Full time. Posted on: Posted today. ... a rapidly digitizing, energy-intensive world. From revolutionizing power for AI-driven data centers to ensuring resilience for hospitals, electric ... of the 21st century. We are looking for a **Staff Reliability Engineer** to join our team in…
- Amiri Recruiting (Mountain View, CA)
- Site Reliability Engineer, Onsite - Bay Area, CA. Skills: Relevant Skills and Experience. What You'll Do (Day-to-Day): Own and manage our cloud infrastructure (GCP ... using Grafana and other observability tools. Ensure high availability, reliability, and uptime across platforms. Handle infrastructure maintenance, upgrades, and…
- Neara (Palo Alto, CA)
- …directly with your resume via jobsarchetypeai io. About the Role: As a Site Reliability Engineer (SRE) at Archetype AI, you will be responsible for ... Job type: Full Time. Department: Backend Engineer. Work type: Remote. About Archetype AI: Archetype AI is developing the world's first AI platform to…
- Cisco Systems (San Jose, CA)
- …the Team: Our dedicated team members are building the future of Cisco's AI-driven platforms and data infrastructure, supporting innovation across the globe. You will ... systems. Explore the opportunities at the intersection of data engineering and AI, helping to transform how Cisco and its customers harness information and…
- Cisco Systems (San Jose, CA)
- Meet the Team: CX AI Incubation team is part of CX and is focused on identifying and building breakthrough emerging solutions to support the diverse requirements of ... Are you ready to be at the forefront of AI innovation? At Cisco CX, you will design and ... and reinforcement learning to improve model performance, scalability, and reliability. Support the training and fine‑tuning of Large and…
- Develop Health Inc. (Menlo Park, CA)
- …now scaling rapidly following a major funding round. About The Role: We're hiring an AI Engineer to take models from prototype to production and drive real ... Develop Health is on a mission to use AI to radically accelerate access to life‑saving medications. By automating complex, manual healthcare processes, like benefit…
- Hamilton Barnes Associates Limited (San Francisco, CA)
- …and incident response frameworks. Familiarity with high‑performance computing (HPC) or AI/ML training infrastructure at scale. Background in reliability ... opportunity? Join a stealth-mode hyperscale data center startup building a next-generation AI and cloud platform designed for startups and advanced research, powered…
- Google Inc. (Mountain View, CA)
- Site Reliability Manager, Site ... or read a career profile about why a Software Engineer chose to join SRE. Search front-end SRE is the ... (e.g., A/B, multivariate) and incremental analysis. About the job: Site Reliability Engineering (SRE) combines software and ... flagship product. Experience Search as we move forward into AI. Behind everything our users see online is the…
- Bloom Energy (San Jose, CA)
- Principal Engineer - CFD lead. Locations: Office - 4353 North 1st Street. Time type: Full time. Posted on: ... in a rapidly digitizing, energy-intensive world. From revolutionizing power for AI-driven data centers to ensuring resilience for hospitals, electric grids,…
- Victrays (San Jose, CA)
- Sr. System Engineer - Supermicro - San Jose, California, United States. About Supermicro: Supermicro® is a Top Tier provider of advanced server, storage, and ... technologists, and business leaders to join us. Job Summary: As a Sr. System Engineer, you'll be the go-to person to roll out and maintain business-critical…
- Develop Health Inc. (Menlo Park, CA)
- …to bring frontier LLM capabilities into real‑world workflows, ensuring performance, usability, and reliability. Working on‑site in Menlo Park at least three days ... Integrate and orchestrate LLM services in production environments, optimizing for performance, reliability, and user experience. Collaborate with AI and product…
- Menlo Ventures (Mountain View, CA)
- …and performance of our systems, incorporating a blend of Cloud Engineering and Site Reliability Engineering (SRE) practices. This role requires a strong ... Develop infrastructure-as-code using tools such as Terraform, CloudFormation, or similar. Site Reliability Engineering (SRE): Implement SRE practices to ensure…
- Mashgin (San Jose, CA)
- …a successful and innovative point-of-sale experience that uses computer vision and AI to make checkout nearly instantaneous. Our mission is to eliminate checkout ... culture of respect and fun. Position Summary: Mashgin is seeking a Product Design Engineer to join our hardware engineering team. You'll play a key role in developing…
- Cisco Systems (San Jose, CA)
- …developing and maintaining the chassis management software for Cisco's blade server and edge AI product lines. Our team plays a critical role in delivering the ... reliability, performance, and innovation customers expect from Cisco's data ... future of compute infrastructure. Your Impact: As a software engineer in the UCS Chassis Management team, you will…
- Roku, Inc. (San Jose, CA)
- …degree, or equivalent work experience; 8+ years of experience in DevOps or Site Reliability Engineering; experience with cloud infrastructure such as Amazon AWS, ... our engagement over time. About the Role: We are seeking a skilled engineer with exceptional DevOps skills to join our team. Responsibilities include automating and…
- Cisco Systems (San Jose, CA)
- …Foundational Modeling team at Splunk, where we advance the state of AI for high‑volume, real‑time, multi‑modal machine‑generated data, including logs, time series, ... traces, and events. We combine deep AI research expertise with the scale and operational excellence of Splunk and Cisco's global engineering capabilities. Our work…
- Bloom Energy (San Jose, CA)
- Principal System Test Engineer. Locations: Fremont, California; San Jose, California. Time type: Full time. Posted on: ... in a rapidly digitizing, energy-intensive world. From revolutionizing power for AI-driven data centers to ensuring resilience for hospitals, electric grids,…