- OpenAI (San Francisco, CA)
- …across blob storage down to hardware caching Much more! About the Role As an engineer within Fleet infrastructure, you will design, write, deploy, and operate ... This role will support the fleet infrastructure team at OpenAI. The fleet team focuses...on running the world's largest, most reliable, and frictionless GPU fleet to support OpenAI's general purpose model training…
- BlackLine (Pleasanton, CA)
- …Infrastructure and Environment Management Design and manage training infrastructure including distributed training orchestration, GPU/TPU resource allocation, ... 2001, BlackLine has become a leading provider of cloud software that automates and controls the entire financial close...BlackLine! Make Your Mark: As a Machine Learning Operations Engineer, you will play a pivotal role in bridging…
- F. Hoffmann-La Roche Gruppe (Pleasanton, CA)
- …to come. Join Roche, where every voice matters. The Position Principal DevOps Engineer - ML/AI Algorithms Developing software is great, but developing ... a purpose is even better! As a Principal DevOps Engineer - ML/AI Algorithms, you will work on products...for DevOps, paving the way for seamless and efficient software delivery processes. Location This role can be based…
- BlackLine (Pleasanton, CA)
- …Infrastructure and Environment Management Design and manage training infrastructure including distributed training orchestration, GPU/TPU resource allocation, ... Science, Machine Learning, Data Science, or a related field. 10+ years in ML infrastructure, DevOps, and software system architecture; 4+ years in leading MLOps…
- Prima Mente (San Francisco, CA)
- …partners Expected Growth In 1 month you are deploying workflows on GPU-accelerated cloud infrastructure to process your own multi-omic experiments, while ... ability to identify GPU vs. CPU optimisation in other contexts Operational software engineering skills: the ability to write high-quality code to be the backbone…
- Vizcom (San Francisco, CA)
- …design technology company in San Francisco is seeking a Senior Software Engineer for Backend (Systems / Infrastructure). You will architect and deliver ... scalability as demand grows. This role involves optimizing APIs, managing GPU workloads, and collaborating with cross-functional teams. Ideal candidates have 5-8…
- OpenAI (San Francisco, CA)
- An innovative company is seeking a talented software engineer to join their dynamic Inference team. This role involves designing and implementing ... infrastructure for large-scale multimodal models, focusing on high-performance delivery of audio and image inputs. You'll collaborate closely with researchers and…
- Crusoe Energy Systems LLC (San Francisco, CA)
- A rapidly growing technology company in San Francisco is seeking a Senior Software Infrastructure Engineer to manage cloud operations and develop automation ... tools. The ideal candidate will have strong experience in Linux and hardware troubleshooting, with knowledge of Kubernetes, Docker, and server provisioning. This position offers competitive compensation and benefits in a rapidly growing company, playing a key…
- pony.ai (Fremont, CA)
- …well as to influence the next-generation compute platform architecture design and software infrastructure. Apply model optimization and efficient deep learning ... public at NASDAQ in Nov. 2024. Responsibility The ML Infrastructure team at Pony.ai provides a set of tools...evaluation, optimization, deployment, and monitoring. As a Machine Learning Engineer in ML Runtime & Optimization, you will be…
- OpenAI (San Francisco, CA)
- …urgency of keeping mission-critical systems running Qualifications Experience as an infrastructure, systems, or distributed systems engineer in large-scale or ... data center designs, turn them into real, working systems and build any software needed for running large-scale frontier model trainings. Our mission is to bring…
- OpenAI (San Francisco, CA)
- …bring those models to life. About the Role We are looking for an engineer to design and implement the dataset infrastructure that powers OpenAI's next-generation ... team is responsible for designing and running OpenAI's LLM training and inference infrastructure that powers frontier models at massive scale. Our systems unify how…
- Voxel (San Francisco, CA)
- …is backed by industry-leading VCs. Voxel is looking for a Staff Machine-Learning Infrastructure Engineer to drive the next wave of our computer-vision platform ... quality-tiered datasets that stay within cost constraints. Build and operate training infrastructure - create multi-GPU / multi-node training frameworks (Ray,…
- Decagon AI, Inc. (San Francisco, CA)
- …resolve issues quickly and accurately. About the Role We're hiring a Senior Infrastructure Engineer to design, build, and operate production infrastructure ... into simple, reliable designs. Even better Experience being an early backend/platform/infrastructure engineer at another company Strong Kubernetes experience…
- OpenAI (San Francisco, CA)
- Software Engineer, Infrastructure Security | OpenAI Careers Software Engineer, Infrastructure Security Security - Remote - US, San Francisco, ... a strong, collaborative security culture. About the Role OpenAI is seeking a Security Software Engineer to join the Infrastructure Security (InfraSec) team.…
- OpenAI (San Francisco, CA)
- About the Role As an engineer within Fleet infrastructure, you will design, write, deploy, and operate infrastructure systems for model deployment and ... training on one of the world's largest GPU fleets. The scale is immense, the timelines are...product teams to understand workload requirements Collaborate with hardware, infrastructure, and business teams to provide a high utilization…
- Eloquent AI (San Francisco, CA)
- …to work cross-functionally in a fast-moving startup. Bonus Points If You've built infrastructure for AI or ML workloads (GPU orchestration, model serving, or ... financial services. Your Role Build, automate, and maintain cloud infrastructure using AWS and Infrastructure as Code (IaC) tools such as Terraform and…
- Julius (San Francisco, CA)
- …compute. What You'll Do Design and operate secure, multi‑tenant container infrastructure with fast startup and smart autoscaling. Ship on‑prem/private cloud ... for containerized, multi‑tenant systems. Nice to Have gVisor/Kata/Firecracker; Cilium/eBPF; GPU scheduling; serverless autoscaling (KEDA/Knative/Karpenter). Delivered on‑prem or air‑gapped…
- Eloquent AI (San Francisco, CA)
- …and product as we redefine the future of financial services. Your Role As a Senior Software Engineer, AIOps & Infrastructure at Eloquent AI, you will be ... reliability of critical AI systems. Requirements 5+ years of experience in software engineering, MLOps, or infrastructure development. Strong expertise in…
- 10X Recruiting Partners (San Francisco, CA)
- …the entire search-to-hire process as smooth as possible. We're seeking a highly skilled Software Engineer (C++ Systems) to join our client's team and help build ... systems from day one and tackling technically demanding challenges at the forefront of GPU infrastructure. What You'll Do Optimize performance of our C++ GPU…
- Amazon (San Francisco, CA)
- Sr. Software Development Engineer, Production and Post Production Technology Job ID: 3146650 | Amazon.com Services LLC Amazon MGM Studios is seeking a ... Software Development Engineer III to lead technical...workflows (e.g., media transcoding and processing engine, real‑time collaboration infrastructure, GPU‑accelerated rendering system, or content delivery…