- pony.ai (Fremont, CA)
- …model development, evaluation, optimization, deployment, and monitoring. As a Machine Learning Engineer in ML Runtime & Optimization, you will be developing ... ML operator libraries. + Work across the entire ML framework/compiler stack (e.g., Torch, CUDA and ... learning frameworks and libraries. + Deep knowledge of system performance, GPU optimization, or ML compilers. Compensation…
- General Motors (San Francisco, CA)
- **Job Description** **Senior AI/ML Tooling Engineer** Role: We are looking for an ML tooling engineer to build tools to analyze and optimize ... models. You will develop and enhance GM's internal ML tooling for high-performance software by ... developing and deploying machine learning models + GPU programming (CUDA) and familiarity with ML SW stack…
- Meta (Menlo Park, CA)
- …strategy that delivers a highly flexible platform to train & serve new DL/ML model architectures, combined with auto-tuned high performance for production ... hardware-software codesign for AI domain-specific problems. **Required Skills:** Software Engineer, Systems ML - Frameworks / Compilers / Kernels…
- pony.ai (Fremont, CA)
- …including model development, evaluation, optimization, deployment, and monitoring. As a Machine Learning Engineer Intern in ML Runtime & Optimization, you will be ... Work across the entire AI framework/compiler stack (e.g., Torch, CUDA, and TensorRT), support model development and prototype key ... learning frameworks and libraries. + Strong knowledge of system performance, GPU optimization, or ML compilers. Note…
- Amazon (San Francisco, CA)
- …in Python, C++, and CUDA programming - Experience with TensorRT or similar ML optimization frameworks - Track record of optimizing ML models for production ... run at production scale. As a Senior Machine Learning Engineer embedded in our science team, you'll be instrumental ... Preferred Qualifications - Expertise in NVIDIA's ML stack (cuDNN, CUDA Graph, etc.) - …
- Meta (Menlo Park, CA)
- …and SW stacks around NCCL and PyTorch to improve the full-stack distributed ML reliability and performance (e.g., large-scale GenAI/LLM training) from the trainer ... for engineers to work on the space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, Systems ML - Scaling / Performance…
- Amazon (San Francisco, CA)
- …breakthrough foundation models run at production scale. As a Software Development Engineer embedded in our science team, you'll be instrumental in transforming novel ... research into high-performance production systems. You'll collaborate directly with scientists to optimize large-scale transformer architectures for robotics…
- Meta (Menlo Park, CA)
- …and SW stacks around NCCL and PyTorch to improve the full-stack distributed ML reliability and performance (e.g., large-scale GenAI/LLM training) from the trainer ... networking (RDMA), distributed ML training, GPU architecture, ML systems, AI infrastructure, high-performance computing, ... and parallel computing 11. Knowledge of GPU architectures and CUDA programming 12. Knowledge of ML, deep…