- San Francisco Compute Co. (San Francisco, CA)
- …small clusters would have been in the TOP500 5 years ago. Our supercomputing team is responsible for keeping our compute clusters running smoothly, monitoring ... hardware health, participating in on-call rotation, and fixing things when they go wrong. We believe strongly in automation - code is the only reliable way to manage hardware at scale. As we scale, this will become a more data-driven role, predicting failures… more
- Promote Project (Palo Alto, CA)
- Software Engineer, Infrastructure - Supercomputing Compensation: 100000 - 150000 a year (s) Location Bay Area (San Francisco, Palo Alto). Candidates are expected ... ArgoCD Focus Operating some of the world's largest GPU supercomputing clusters for both AI training and serving production...Job Type Remote job Tags software security training cloud engineer engineering Please mention the word ENHANCES and tag… more
- Promote Project (Palo Alto, CA)
- A technology-driven firm in supercomputing is seeking a Software Engineer to operate large GPU supercomputing clusters and enhance deployment pipelines. The ... ideal candidate should have experience in writing scalable containerized applications in Rust and managing compute fleets with tools like Pulumi and Terraform. This role is remote and requires strong communication skills and a solid work ethic, perfect for… more
- Pantera Capital (Palo Alto, CA)
- …with their teammates. About the Role RDMA Engineers on xAI's Supercomputing team design and optimize low-latency, high-bandwidth networking solutions using NVIDIA's ... RDMA-capable technologies to support some of the world's largest GPU supercomputing clusters. These clusters drive AI training and inference workloads, demanding… more
- Pantera Capital (Palo Alto, CA)
- …build the backbone of next‑generation AI infrastructure. As a Hardcore Engineer, you'll tackle complex challenges in large‑scale infrastructure, distributed systems, ... a large‑scale distributed system that powers one of the world's largest supercomputing clusters. Dive into the low‑level stack to profile, debug, and optimize… more
- Pantera Capital (Palo Alto, CA)
- …or open to relocation. Focus Operating some of the world's largest GPU supercomputing clusters for both AI training and serving production models. Implement IaC best ... practices, enhancing deployment pipelines, and ensuring robust, secure service delivery across our production environments. Working with both on-premise clusters and cloud providers. Help with security best practices for internal researchers and live external… more
- OpenAI (San Francisco, CA)
- …AI models. In addition to delivering production-grade silicon for OpenAI's supercomputing infrastructure, the team also creates custom design tools and methodologies ... and enable hardware optimized specifically for AI. About the Role As a software engineer on the Scaling team, you'll help build and optimize the low-level stack that… more
- OpenAI (San Francisco, CA)
- Software Engineer, Infrastructure Security | OpenAI Careers Security - Remote - US, San Francisco, Seattle, and New ... security culture. About the Role OpenAI is seeking a Security Software Engineer to join the Infrastructure Security (InfraSec) team. InfraSec safeguards the core… more
- Zettar Inc. (San Francisco, CA)
- Enterprise Sales Engineer - San Francisco Bay Area Job summary Enterprise Sales Engineer Ready for an opportunity to make a significant impact with a disruptive ... 2019. It is the winner of the fiercely competitive Supercomputing Asia 2019 Data Mover Challenge, the only international...let's talk! Zettar is looking for an Enterprise Sales Engineer to provide technical direction and business guidance to… more
- Menlo Ventures (San Francisco, CA)
- …of human-level capabilities. You could describe yourself as both a scientist and an engineer. As a Research Engineer on Alignment Science, you'll contribute to ... extensive continuous integration and testing infrastructure; several very large supercomputing clusters and the associated tooling. Interpretability - The… more
- Lawrence Berkeley National Laboratory (San Francisco, CA)
- …Engineer to architect and develop software for advanced supercomputing systems. This role involves automating system management, analyzing performance, and ... collaborating with national labs and open-source communities. Candidates should have extensive experience with Linux systems programming and be adept in modern development practices, contributing to high-impact scientific research.
- OpenAI (San Francisco, CA)
- …AI models. In addition to delivering production‑grade silicon for OpenAI's supercomputing infrastructure, the team also creates custom design tools and methodologies ... specifically for AI. About the Role We are looking for an experienced Mechanical Engineer with 7+ years of experience in design of IT hardware from chip/package to… more
- OpenAI (San Francisco, CA)
- …researchers are minimally impacted by hardware faults. We maximize available supercomputing capacity for researchers and maintain the reliability, scalability, and ... scheduling, quota management, and job execution workflows. About The Role As a Software Engineer on the Platform Visualization team, you will play a critical role in… more
- OpenAI (San Francisco, CA)
- …and responsible AI deployment over unchecked growth. About the role As a software engineer on the Fleet Hardware team, you will be responsible for the reliability ... and devise innovative solutions to maintain the health and efficiency of our supercomputing infrastructure. Our team empowers strong engineers with a high degree of… more
- Lawrence Berkeley National Laboratory (San Francisco, CA)
- … supercomputing infrastructure. Your primary role will be to engineer robust, scalable, dynamic, and automated solutions for high-performance computing (HPC) ... Berkeley National Laboratory is hiring an HPC System Software Engineer within the National Energy Research Scientific Computing Center...selected candidate(s) will be hired at the Computer Systems Engineer 3 or 4 (CSE3 or CSE4) depending on… more
- OpenAI (San Francisco, CA)
- Software Engineer, Fleet Hardware Health | OpenAI Careers Scaling - San Francisco ... over unchecked growth. About the role As a software engineer on the Fleet Hardware team, you will be...solutions to maintain the health and efficiency of our supercomputing infrastructure. Our team empowers strong engineers with a… more
- OpenAI (San Francisco, CA)
- …- from LLMs to recommender systems - to run reliably on advanced supercomputing platforms. That includes adapting our software stack to new types of accelerators, ... tuning system performance end-to-end, and removing bottlenecks across every layer of the stack. About the Role On the Accelerators team, you will help OpenAI evaluate and bring up new compute platforms that can support large-scale AI training and inference.… more
- OpenAI (San Francisco, CA)
- …AI models. In addition to delivering production‑grade silicon for OpenAI's supercomputing infrastructure, the team also creates custom design tools and methodologies ... optimized specifically for AI. About the Role We're looking for a RTL Engineer to design and implement key compute, memory, and interconnect components for our… more
- Menlo Ventures (San Francisco, CA)
- …at our scale often requires solving novel systems problems. As a Performance Engineer, you will be responsible for identifying these problems, and then developing ... Have significant software engineering or machine learning experience, particularly at supercomputing scale Are results-oriented, with a bias towards flexibility and… more
- Fluidstack (San Francisco, CA)
- …Our team is small, highly motivated, and focused on providing a world class supercomputing experience. We put our customers first in everything we do, working hard ... to not just win the sale, but to win repeated business and customer referrals. We hold ourselves and each other to high standards. We expect you to care deeply about the work you do, the products you build, and the experience our customers have in every… more