- San Francisco Compute Co. (San Francisco, CA)
- A cutting-edge technology firm in San Francisco is seeking an engineer for Large Scale Inference. You will build and scale software systems to optimize ... compute for inference workloads. The ideal candidate enjoys software craftsmanship, is a strong communicator, and has an appreciation for reliable systems. The role offers generous equity, competitive salary, and benefits including unlimited paid time off…
- Amazon (San Francisco, CA)
- …and performance tuning of a wide variety of LLM model families, including massive-scale large language models like the Llama family, DeepSeek and beyond. The ... in debugging, profiling, and implementing best software engineering practices in large-scale systems. PREFERRED QUALIFICATIONS - Familiarity with PyTorch, JIT…
- Amazon (San Francisco, CA)
- …and performance tuning of a wide variety of LLM model families, including massive-scale large language models like the Llama family, DeepSeek and beyond. The ... in debugging, profiling, and implementing best software engineering practices in large-scale systems. Preferred Qualifications: Familiarity with PyTorch, JIT…
- Crusoe Energy Systems LLC (San Francisco, CA)
- …thousands of customers. From day one, you'll own critical subsystems for managed AI inference, helping to serve large language models (LLMs) to a global ... Multimodal). Technical Skills: Proficiency in Golang or Python for large-scale, production-level services. (Preferred) Familiarity with AI infrastructure,…
- Capital One (San Francisco, CA)
- …delivering models at scale both in terms of training data and inference volumes. Experience in delivering libraries, platform level code or solution level code…
- Smallest Inc. (San Francisco, CA)
- …and how to work around them; is excited by the challenge of ultra‑low latency and large-scale real‑time inference; loves debugging at the CUDA + model level ... GPU constraints matter, and how to restructure models for real-world inference performance. You'll work across CUDA kernels, model graph optimizations,…
- Capital One National Association (San Francisco, CA)
- …delivering models at scale both in terms of training data and inference volumes. Experience in delivering libraries, platform level code or solution level code ... Has a deep understanding of the foundations of AI methodologies. Experience building large deep learning models, whether on language, images, events, or graphs, as…
- Menlo Ventures (San Francisco, CA)
- …the most efficient and impactful use of our compute resources, be it inference or training. As an Engineering Manager on these teams you will be responsible for ... effective. Strong candidates may also have experience with: high-performance, large-scale ML systems; GPU/accelerator programming; ML framework internals; language…
- Amazon (San Francisco, CA)
- …driving. Track record of successful production ML deployments. Experience with large-scale distributed environments for ML training and inference. History of ... foundation models on vast amounts of Amazon data and infer at Amazon scale, taking advantage of latest developments in hardware and deep learning libraries.…
- Capital One (San Francisco, CA)
- …Worked on scaling graph models to greater than 50M nodes. Experience with large-scale deep learning based recommender systems. Experience with production ... delivering models at scale both in terms of training data and inference volumes. Experience in delivering libraries, platform level code or solution level code…
- Capital One National Association (San Francisco, CA)
- …models at scale both in terms of training data and inference volumes. Experience in delivering libraries, platform‑level code, or solution‑level code to existing ... Experience scaling graph models to greater than 50M nodes and building large-scale deep‑learning recommender systems. Experience optimizing training for very…
- Varo Money, Inc. (San Francisco, CA)
- …such as fraud detection algorithms, recommender systems, dynamic credit risk models, and large-scale causal inference models, as well as processes for ... the fraud detection, risk modeling, personalization, operations, and causal inference spaces. About the Data Science Team: Varo's Data...mode, you will get to work on the most impactful data science problems from day one. We rely…
- Recruiting From Scratch (San Francisco, CA)
- …stakeholders to deliver impactful analytical solutions. Work hands‑on with large-scale, multimodal data including network, cyber, and operational datasets. ... Databricks, Spark, Dask) to enable scalable and efficient model training and inference. Maintain a strong customer focus while communicating complex findings in…
- Capital One National Association (San Francisco, CA)
- …models at scale both in terms of training data and inference volumes. Experience in delivering libraries, platform‑level code or solution‑level code to existing ... available for use. Demonstrated ability to guide the technical direction of a large-scale model training team. Experience working with 500+ node clusters of…
- Eloquent AI, Inc. (San Francisco, CA)
- …we push the boundaries of AI while staying grounded in practical, impactful solutions. With a global presence spanning San Francisco, London, and Lisbon, ... that are not only innovative but also practical, scalable, and impactful. We prioritize simple, effective solutions over unnecessary complexity, valuing empiricism…
- Cartesia (San Francisco, CA)
- …invented State Space Models or SSMs, a new primitive for training efficient, large-scale foundation models. Our team combines deep expertise in model innovation ... architectures). * Design novel architectures that improve model quality, inference efficiency, and adaptability across diverse deployment environments, from cloud…
- Amadeus Search (San Francisco, CA)
- …Work On: Build backend infrastructure and APIs that power AI-native interactions with large-scale web data. Design systems for retrieval, ranking, parsing, and ... Academic or project pedigree (e.g., Olympiads, SAIL, MIT CSAIL, IITs, or impactful open-source work). Familiarity with production AI systems, evaluation pipelines, and…
- PeopleConnect Staffing (San Francisco, CA)
- …clinical data. Data engineering experience, including annotation workflows and managing large-scale labelled datasets. Benefits and Perks: Competitive salary and ... other machine learning systems, including model training, MLOps pipelines, and inference. Build, optimize, and maintain robust machine learning pipelines leveraging…
- DoorDash (San Francisco, CA)
- …GPT-OSS or BERT. + Hands-on experience with Kubernetes/EKS, microservice architectures, and large-scale orchestration for inference workloads. + Cloud ... model serving to drive the next generation of our inference platform. This is a highly technical, hands-on role:...8+ years of engineering experience, including building or operating large-scale, high-QPS ML serving systems. + Bring…