- Charles Schwab Corporation (San Francisco, CA)
- …and explore next‑generation GenAI efforts that will redefine how we serve our clients. As a Senior AI Site Reliability Engineer on AI.x, you will ... your career in one of the most exciting areas of technology today. As a Senior AI Site Reliability Engineer, you will design, implement, and manage the …
- Hamilton Barnes Associates Limited (San Francisco, CA)
- …and incident response frameworks. Familiarity with high‑performance computing (HPC) or AI/ML training infrastructure at scale. Background in reliability ... opportunity? Join a stealth-mode hyperscale data center startup building a next-generation AI and cloud platform designed for startups and advanced research, powered…
- Crusoe Energy Systems LLC (San Francisco, CA)
- A technology firm based in California is seeking a Site Reliability Engineer to optimize their AI-optimized cloud infrastructure. The role involves ... building automation tools, driving reliability initiatives, and collaborating with engineers to ensure high-performance storage systems. Candidates should have…
- Boson AI (Palo Alto, CA)
- About The Role We're looking for a Senior Site Reliability Engineer to help us run one of the most exciting GPU clusters around - our Toronto datacenter packed ... with NVIDIA H100 and A100 GPUs, over 20PB of Ceph storage, terabit networking, and hundreds of servers. You'll be hands‑on with the full lifecycle of HPC infrastructure: planning, building, testing, deploying, and keeping everything running smoothly. That…
- Neara (Palo Alto, CA)
- …directly with your resume via jobsarchetypeai io. About the Role As a Site Reliability Engineer (SRE) at Archetype AI, you will be responsible for ... Job type: Full Time. Department: Backend Engineer. Work type: Remote About Archetype AI Archetype AI is developing the world's first AI platform to…
- Sierra (San Francisco, CA)
- A technology company in San Francisco seeks a Software Engineer for its Site Reliability team. This role involves defining and building reliability and ... scalability in an AI-driven infrastructure. Candidates should have over 5 years of experience in SRE/infrastructure, proficiency in AWS and Terraform, and a strong commitment to collaboration across teams. The position offers competitive benefits,…
- Crusoe Energy Systems LLC (San Francisco, CA)
- …platform - and operational excellence is at the heart of that mission. As a Site Reliability Engineer focused on Operational Excellence, you will help ensure ... self‑healing systems, automated remediation, or event‑driven operations Interest in scaling AI/HPC infrastructure and solving reliability challenges in GPU‑heavy…
- Sierra (San Francisco, CA)
- …led the product and design teams for Google Workspace. What you'll do As a Software Engineer on our Site Reliability team at Sierra, you will be responsible ... for defining and building the foundation of reliability, observability, and scalability across Sierra's AI-driven ... What you'll bring 5+ years of hands‑on experience in Site Reliability or Infrastructure engineering roles for…
- Icon Ventures (San Francisco, CA)
- …across the world and unlock human potential. About the Role As a Senior Staff Site Reliability Engineer, you'll define the technical vision and architecture ... scalability, resilience, and security in partnership with product and AI platform teams. Define and enforce reliability standards, SLO frameworks, and incident response practices.…
- harvey.ai (San Francisco, CA)
- …is being written today - and we're just getting started. Role Overview As a Software Engineer on the Site Reliability team at Harvey, you will ensure the ... reliability, scalability, and performance of our legal AI platform. You'll join a high-leverage team that sits ... functionality. What You Have 5+ years of experience in Site Reliability Engineering or similar roles supporting…
- Air Apps (San Francisco, CA)
- …journey to redefine resource management - and change lives along the way. The Role As a Site Reliability Engineer (SRE) at Air Apps, you will be responsible ... company on a mission to create the world's first AI-powered Personal & Entrepreneurial Resource Planner (PRP), and we ... minimize downtime. Requirements Around 4+ years of experience in Site Reliability Engineering (SRE), DevOps, or System…
- Datacrunch (San Francisco, CA)
- …to align on vision and expectations. About the role We're seeking a Senior or Principal Site Reliability Engineer (SRE) to become our first US hire, based in ... access to intelligence. We're building a fully featured European AI cloud - with everything one needs to train, ... (HPC) and cloud infrastructure globally. As our initial US‑based engineer, you'll set the standard for reliability,…
- Hive (San Francisco, CA)
- …to commercialize our machine learning models, we also need to grow our DevOps and Site Reliability team to maintain the reliability of our enterprise SaaS ... About Hive Hive is the leading provider of cloud-based AI solutions to understand, search, and generate content, and is trusted by hundreds of the world's largest…
- OpenAI (San Francisco, CA)
- …will work at the intersection of hardware and software, where speed and reliability are critical. Expect to manage fast-moving operations, quickly diagnose and fix ... Qualifications Experience as an infrastructure, systems, or distributed systems engineer in large-scale or high-availability environments Strong knowledge of…
- Icon Ventures (San Francisco, CA)
- …AI‑powered learning tools that scale across the world and unlock human potential. Senior Site Reliability Engineer As a Senior Site Reliability ... architecture that enable Quizlet to scale reliably for the next generation of AI‑powered learning. You'll engineer software, tools, and processes that improve…
- Icon Ventures (San Francisco, CA)
- …that scale across the world and unlock human potential. About the Role As a Staff Site Reliability Engineer, you'll lead reliability engineering across ... and ensuring that our infrastructure can support rapid innovation in AI-powered learning. You'll drive the architectural direction for resilience, observability, and…
- Quizlet, Inc. (San Francisco, CA)
- …the world and unlock human potential. About the Role: We're looking for an experienced Site Reliability Engineer to be a systems developer who engineers the ... to scale our platform for the next generation of AI features. We're happy to share that this is ... the table: Proven History of Architectural Ownership: and driving reliability initiatives within complex, distributed production environments as an…
- Fluidstack (San Francisco, CA)
- …we're building the infrastructure for abundant intelligence. We partner with top AI labs, governments, and enterprises - including Mistral, Poolside, Black Forest ... working across software, hardware, and operations to ensure the reliability and performance of our global GPU cloud. They ... to build systems that scale with the demands of AI workloads. SREs are hands‑on and possess deep systems…
- Crusoe Energy Systems LLC (San Francisco, CA)
- A leading energy technology firm seeks a Site Reliability Engineer to enhance its reliable, energy-efficient, AI-optimized cloud platform. In this role, ... you'll collaborate with cross-functional teams to improve system performance and incident management. Ideal candidates will have a strong background in cloud operations and automation, alongside critical problem-solving skills. Join this innovative team to…
- Open Select (San Francisco, CA)
- Senior Infrastructure Engineer Location: On-site, San Francisco, CA (3 days/week in office) Salary: $150k - $200k + equity Industry: AI, Cloud Infrastructure ... What You'll Drive Join a fast-growing AI startup backed by $65M from top VCs (Scale ... founders of Twilio, Affir ). As a Senior Infrastructure Engineer, you'll architect and scale distributed systems that power…