- Pantera Capital (San Francisco, CA)
- …Full time. Location Type: Hybrid. Department: AI. We are looking for an AI Inference engineer to join our growing team. Our current stack is Python, Rust, ... learning models for real-time inference. Responsibilities: Develop APIs for AI inference that will be used by both internal and external customers. Benchmark… more
- Pantera Capital (San Francisco, CA)
- A financial technology firm in San Francisco is seeking an experienced AI Inference Engineer to develop APIs for AI inference used by both internal ... and external customers. Candidates should have experience with machine learning systems and deep learning frameworks like PyTorch, and familiarity with LLM architectures. The role supports a hybrid work environment and offers a competitive salary, equity, and… more
- quadric.io, Inc (Burlingame, CA)
- …GPNPU executes both NN graph code and conventional C++ DSP and control code. Role: The AI Inference Engineer at Quadric is the key bridge between the world ... of AI/LLM models and Quadric's unique platforms. The AI Inference Engineer at Quadric will [1] port AI models to the Quadric platform; [2] optimize the… more
- Menlo Ventures (San Francisco, CA)
- A technology-focused public benefit corporation in San Francisco seeks a skilled software engineer to join the inference team. This role involves building ... systems that power AI models like Claude, focusing on maximizing efficiency and enabling groundbreaking research. Ideal candidates have a background in distributed… more
- quadric.io, Inc (Burlingame, CA)
- A pioneering tech company is looking for an experienced AI Inference Engineer to bridge AI models and advanced processing platforms. This role requires ... expertise in AI model algorithms, strong C/C++ and Python skills, and...experience with deployment frameworks. You will optimize and benchmark AI models, ensuring efficient deployment in edge devices. The… more
- Quadric Inc. (Burlingame, CA)
- A leading technology company in California is seeking an AI Inference Engineer to bridge AI models with unique platforms. Key responsibilities include ... should have a Bachelor's or Master's degree, 5+ years' experience in AI frameworks, and proficiency in C/C++ and Python. Competitive benefits included, such… more
- Menlo Ventures (San Francisco, CA)
- …capabilities of our embodied systems. You will be responsible for optimizing AI inference processes from lightweight to billion-parameter models, ensuring our ... unforeseen compute and hardware constraints. Responsibilities Develop and optimize runtime AI inference pipelines for real-world robotic deployment. Build… more
- Amazon (San Francisco, CA)
- A leading technology company in Herndon, Virginia is seeking a Senior Software Development Engineer to work on AI/ML projects. You will design and optimize ... machine learning models for deployment on custom hardware accelerators, ensuring maximum performance. Ideal candidates will have over 5 years of experience, strong Python and C++ skills, and knowledge in machine learning principles. This role fosters a… more
- Capital One (San Francisco, CA)
- …financial services provider in San Francisco is seeking a Technical Specialist to develop AI and ML solutions. You'll need a strong foundation in engineering and at ... 4 years of experience programming in Python and deploying AI on cloud platforms. The ability to optimize solutions and a passion for AI research are essential. This role offers a competitive… more
- Amazon (San Francisco, CA)
- …leading e-commerce platform in San Francisco is seeking a Software Development Engineer to develop and optimize machine learning models for custom hardware ... accelerators. This role involves performance tuning, debugging, and close collaboration with customers to enhance their models on AWS's services. The ideal candidate has strong programming skills in C++ and Python, along with a solid understanding of machine… more
- Virtue AI (San Francisco, CA)
- An innovative AI security company in San Francisco is seeking an Inference Engineer who will be pivotal in optimizing ML model inferences. The role requires ... deep knowledge of serving LLMs and experience in designing inference APIs. Candidates should be comfortable in a fast-paced...presents an opportunity to work at the cutting edge of AI security with competitive compensation and growth potential.… more
- San Francisco Compute Co. (San Francisco, CA)
- A cutting-edge technology firm in San Francisco is seeking an engineer for Large Scale Inference. You will build and scale software systems to optimize compute ... for inference workloads. The ideal candidate enjoys software craftsmanship, is a strong communicator, and has an appreciation for reliable systems. The role offers… more
- Capital One (San Francisco, CA)
- …Experience developing AI and ML algorithms or technologies (e.g. LLM Inference, Similarity Search and VectorDBs, Guardrails, Memory) using Python, C++, C#, Java, ... in engineering and mathematics, and your expertise in hardware, software, and AI enable you to see and exploit optimization opportunities that others miss.… more
- OpenAI (San Francisco, CA)
- An innovative company is seeking a talented software engineer to join their dynamic Inference team. This role involves designing and implementing infrastructure ... researchers and product teams to push the boundaries of AI technology, ensuring reliable production services. If you thrive...enjoy tackling complex challenges, this opportunity offers a chance to make a significant impact in the AI landscape.… more
- Baseten (San Francisco, CA)
- …is seeking a skilled individual to enhance the API infrastructure supporting AI models. The role involves designing and optimizing backend services, focusing on ... performance and reliability. Candidates should have over 3 years of experience with distributed systems and be comfortable debugging complex systems. This unique opportunity includes a competitive compensation package and a supportive culture emphasizing… more
- Mvp VC (San Francisco, CA)
- A cutting-edge aerospace company in San Francisco is seeking a skilled software engineer to optimize and integrate the Ultimate Edge SDK for embedded platforms. Key ... responsibilities include collaborating on performance tuning and ensuring efficient deployment on NVIDIA hardware. Required qualifications include a Master's in Computer Engineering, expertise in C++/Python, and familiarity with containerization technologies.… more
- Loft Orbital Solutions (San Francisco, CA)
- A leading space technology company in San Francisco is seeking a skilled engineer to contribute to the development and optimization of the Ultimate Edge SDK. The ... role focuses on integrating ONNX-based runtimes and optimizing performance across embedded platforms. Candidates should have a master's degree and solid experience in C++ or Python, along with familiarity with embedded systems. This position offers a salary of… more
- Amazon (San Francisco, CA)
- Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference. Job ID: 3067759 | Amazon.com Services LLC. The Annapurna Labs team at Amazon Web ... of applied scientists, system engineers, and product managers to deliver state-of-the-art inference capabilities for Generative AI applications. Your work will… more
- Amazon (San Francisco, CA)
- Software Development Engineer, AI/ML, AWS Neuron, Model Inference. The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software ... of applied scientists, system engineers, and product managers to deliver state-of-the-art inference capabilities for Generative AI applications. Your work will… more
- Virtue AI (San Francisco, CA)
- About Virtue AI: Virtue AI sets the standard for advanced...to join our core team. What You'll Do: As an Inference Engineer, you will own how models are ... Built on decades of foundational and award-winning research in AI security, its AI-native architecture unifies automated...Serve and optimize LLM, embedding, and other ML models' inference across multiple model families. Design and operate… more