• Apple Inc. (San Francisco, CA)
    Senior Software Engineer, Model Inference San Francisco Bay Area, California, United States Software and Services Join Apple Maps to help build the best ... production-ready systems, providing technical guidance and feedback to influence upstream model design. Optimize inference execution across heterogeneous compute… more
    job goal (01/13/26)
  • OpenAI (San Francisco, CA)
    A leading AI research company in San Francisco seeks an engineer to optimize their powerful AI models for high-volume production environments. The ideal candidate ... has over 5 years of software engineering experience, strong familiarity with ML architectures, and experience with distributed systems. This role involves collaboration with researchers and focus on performance optimization. Compensation ranges from $325K to… more
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference Job ID: 3067759 | Amazon.com Services LLC The Annapurna Labs team at Amazon Web ... with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and Acceleration team… more
    job goal (01/13/26)
  • Hamilton Barnes Associates Limited (San Francisco, CA)
    …thousands of H100s, H200s, and B200s, ready to go for experimentation, full-scale model training, or inference. Our client operates high-performance GPU clusters ... with cost-efficient batch inference and expanding into low-latency, real-time inference and custom model hosting. This is a unique chance to join at an early… more
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Software Development Engineer, AI/ML, AWS Neuron, Model Inference The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software ... with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and Acceleration team… more
    job goal (01/12/26)
  • OpenAI (San Francisco, CA)
    …research progression via model inference. About the Role We're looking for a senior engineer to design and build the load balancer that will sit at the ... About the Team Our Inference team brings OpenAI's most capable research and...jobs where requests must stay "sticky" to the same model instance for hours or days and where even… more
    job goal (01/13/26)
  • Menlo Ventures (San Francisco, CA)
    …to new inference features (e.g., structured sampling, prompt caching) Supporting inference for new model architectures Analyzing observability data to tune ... to build beneficial AI systems. About the role Our Inference team is responsible for building and maintaining the...by serving our models via the industry's largest compute-agnostic inference deployments. We are responsible for the entire stack… more
    job goal (01/12/26)
  • Comfy (San Francisco, CA)
    …AI platform company in San Francisco is seeking a talented individual to optimize model inference for their advanced visual AI product. The ideal candidate will ... engage in building efficient AI models and tackling complex challenges. The role requires a strong background in PyTorch and a passion for pushing performance limits. Join a dynamic team focused on creating innovative AI solutions and shaping the future of… more
    job goal (01/13/26)
  • Databricks Inc. (San Francisco, CA)
    …customers to operationalize models at scale with strong SLAs and cost efficiency. As a Senior Engineer, you'll play a critical role in shaping both the product ... of experience building and operating large-scale distributed systems. Experience in model serving, inference systems, or related infrastructure (e.g., routing,… more
    job goal (01/13/26)
  • Tether Operations Limited (San Francisco, CA)
    …for multimodal language models, integrating text, visual, and audio modalities. Engineer scalable training and inference pipelines optimized for large‑scale ... About the job As a member of the AI model team, you will drive innovation in architecture development...pipeline from data processing & data loading to training, inference, and optimization. Experience working with large‑scale text data,… more
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    …generative AI applications at scale. Additionally, we work closely with foundational model providers to optimize AI models for Amazon Silicon, enhancing performance ... continued pre‑training, fine‑tuning, and Reinforcement Learning with Human Feedback (RLHF) Model Optimization on AWS Silicon: Optimize AI models for deployment on… more
    job goal (01/13/26)
  • Crusoe Energy Systems LLC (San Francisco, CA)
    …About This Role: The Crusoe Cloud Managed AI team seeks an ambitious and experienced Senior Software Engineer to join their team. You'll have a pivotal role in ... and fault tolerance. Optimize performance across all stages of the AI inference pipeline, including model loading, execution, and response handling. Continuously… more
    job goal (01/13/26)
  • Crusoe Energy Systems LLC (San Francisco, CA)
    About This Role: As a Senior Staff Software Engineer on the Managed AI team at Crusoe, you'll have a pivotal role in shaping the architecture and scalability of ... our next-generation AI inference platform. You will lead the design and implementation...systems for our AI services, including resilient fault-tolerant queues, model catalogs, and scheduling mechanisms optimized for cost and… more
    job goal (01/13/26)
  • Capital One National Association (San Francisco, CA)
    …Design, develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, ... developing AI and ML algorithms or technologies (e.g., LLM Inference, Similarity Search and VectorDBs, Guardrails, Memory) using Python,...McLean, VA: $225,400 - $257,200 for Sr. Lead AI Engineer New York, NY: $245,900 - $280,600 for Sr.… more
    job goal (01/13/26)
  • Icon Ventures (San Francisco, CA)
    …training/inference on GPUs, and familiarity with modern MLOps (model registry, feature stores, monitoring, drift) Solid experiment design (offline/online), ... learning coach that's recognized as best‑in‑class. About the Role As an Applied AI Engineer, you will be working at the forefront of our AI strategy, shaping… more
    job goal (01/13/26)
  • Highlight US Inc (San Francisco, CA)
    Job Title: Senior Machine Learning Engineer / Researcher Location: NYC or SF (On-site) About Highlight AI Highlight AI is a cutting-edge desktop assistant ... mode and expanding our team. The Role As a Senior Machine Learning Engineer, you will drive...of ML fundamentals: supervised & unsupervised learning, deep learning, model evaluation metrics, deployment, inference latency trade‑offs… more
    job goal (01/13/26)
  • CompScience (San Francisco, CA)
    … drift, latency, and throughput. Familiarity with managing hybrid or edge inference deployments (Greengrass, Jetson) and supporting model fine-tuning workflows. ... are looking for an experienced and self-motivated Sr MLOps Engineer to join our growing team and take ownership...Step Functions, Batch) to support both data-centric pipelines and model execution. Develop and maintain robust CI/CD pipelines for… more
    job goal (01/13/26)
  • Voxel (San Francisco, CA)
    …by industry-leading VCs. Voxel is looking for a Staff Machine-Learning Infrastructure Engineer to drive the next wave of our computer-vision platform for workplace ... ground-truth data & labeling workflows, large-scale training infrastructure, and continuous model lifecycle management. If you excel at designing cloud-native,… more
    job goal (01/13/26)
  • Vizcom (San Francisco, CA)
    …scale, a modern TypeScript stack, and serving real enterprise The Role As the Senior Software Engineer - Backend (Systems / Infrastructure) you'll architect and ... Within your first 90 days you will: ship one critical backend feature (e.g., model queue, caching layer, or realtime service), reduce p95 latency in one major API… more
    job goal (01/13/26)
  • General Motors (San Francisco, CA)
    Senior AI/ML Tooling Engineer Role: We are looking for an ML tooling engineer to build tools to analyze and optimize distillation, training, and inference ... What You'll Do Identify new opportunities to improve both training and inference efficiency Build workflows for correctness and performance analysis on physical… more
    job goal (01/13/26)