- Apple Inc. (Cupertino, CA)
- …California seeks a senior/principal engineer to architect and build distributed ML infrastructure. This role involves optimizing GPU compute systems and ... to enhance AI capabilities. Candidates should have substantial experience in GPU programming and distributed systems, alongside a technical degree. The compensation…
- NVIDIA Corporation (Santa Clara, CA)
- …with teams with varied strengths including GPU Compute, Distributed Systems, Networking, ML Infra, AI Platform, and Cloud Services to ensure engineers have ... Senior Software Engineer, Observability ... cost efficiency of telemetry pipelines while supporting high-volume workloads (AI/ML, HPC clusters, GPU infrastructure) * Embedding security…
- NVIDIA Corporation (Santa Clara, CA)
- …Physics, Mathematics, or a related technical field. * 8+ years of hands-on validated ML/DL Infra experience in Autonomous Vehicles with focus on improving compute ... AI models to ensure the best performance on current- and next-generation GPU architectures. * Build collateral (notebooks, GitHub repos, demos, etc.) applied to…
- Google Inc. (Sunnyvale, CA)
- Senior Software Engineer, ML Infrastructure, Cloud AI ... compiler and runtimes interact at a high level. Close infra gaps to help with end-to-end ML stack ... the Research to Production pipeline for both Training and Serving use cases for TPU, GPU and CPU accelerators. In this role, you will work on projects that improve…
- jobr.pro (Sunnyvale, CA)
- …feedback. Understand how accelerator compiler and runtimes interact at a high level. Close infra gaps to help with end-to-end ML stack maturation (e.g., reduce ... the Research to Production pipeline for both Training and Serving use cases for TPU, GPU and CPU accelerators. In this role, you will work on projects that improve…
- Pathway Genomics Corporation (Palo Alto, CA)
- A leading AI startup is seeking a Senior ML Infrastructure / DevOps Engineer in Palo Alto. This role involves designing and maintaining GPU and CPU clusters ... for ML training, automating infrastructure provisioning, and managing robust ML pipelines. The ideal candidate has 5+ years in DevOps/SRE roles, deep familiarity…
- GEICO (Palo Alto, CA)
- …and Great Careers. GEICO AI ML Infrastructure team is seeking an exceptional Senior ML Platform Engineer to build and scale our machine learning ... LLMs (Llama, Mistral, Gemma, etc.) * Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization…
- Oracle (Santa Clara, CA)
- …is responsible for designing and developing fundamental architectural changes for GPU delivery, health monitoring, triage automation, and diagnostic services. These ... are essential for running distributed AI/ML/HPC workloads across thousands of GPUs, leveraging technologies like RoCE and InfiniBand. **Why Join Us?** + Innovative…
- Google (Sunnyvale, CA)
- Senior Software Engineer, ML Infrastructure, Cloud AI ... and runtimes interact at a high level. + Close infra gaps to help with end-to-end ML stack ... the Research to Production pipeline for both Training and Serving use cases for TPU, GPU and CPU accelerators. In this role, you will work on projects that improve…
- LinkedIn (Mountain View, CA)
- …models across LLM and Personalization models. As an engineer, you will build compute-efficient infra on top of native cloud, enable GPU-based inference for a ... billions of parameters models and large-scale feature engineering infra for all AI use cases from recommendation models, ... models per quarter using thousands of features), and enabling GPU inference at scale. ML Ops: The…
- NVIDIA (Santa Clara, CA)
- …with teams with varied strengths including GPU Compute, Distributed Systems, Networking, ML Infra, AI Platform, and Cloud Services to ensure engineers have ... NVIDIA's Observability team is seeking a Senior/Staff Engineer to compose and build the next-generation, ... cost efficiency of telemetry pipelines while supporting high-volume workloads (AI/ML, HPC clusters, GPU infrastructure) + Embedding…
- NVIDIA (Santa Clara, CA)
- …Mathematics, or a related technical field. + 8+ years of hands-on validated ML/DL Infra experience in Autonomous Vehicles with focus on improving compute ... art AI models to ensure the best performance on current- and next-generation GPU architectures. + Build collateral (notebooks, GitHub repos, demos, etc.) applied to…