• Production Systems Engineer

    Meta (Menlo Park, CA)
    …platforms, all the way to mass production and deployment. **Required Skills:** Production Systems Engineer, AI Systems Responsibilities: ... **Summary:** Meta is seeking a Systems Engineer to join our Release to Production ... Inference Accelerator (MTIA) program as a part of the AI/ML initiatives supporting large scale AI Training…
    Meta (04/25/25)
  • Production Systems Engineer

    Meta (Menlo Park, CA)
    …health and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems Responsibilities: 1. Interface ... **Summary:** Meta is seeking a Production Systems Engineer to ... systems issues. 15. 2+ years of experience supporting AI or HPC systems and/or related…
    Meta (03/29/25)
  • Production Systems Engineer

    Meta (Menlo Park, CA)
    …health and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems Lead Responsibilities: 1. Lead ... **Summary:** Meta is seeking an experienced Production Systems Engineer to ... systems issues. 18. 4+ years of experience supporting AI or HPC systems and/or related…
    Meta (03/29/25)
  • Hardware Systems Engineer, NPI…

    Meta (Menlo Park, CA)
    **Summary:** Meta is seeking a Systems Engineer to join our Release to Production (RTP) team working on AI/ML initiatives supporting large scale AI ... services, and data center operations teams to enable new systems that will be deployed in our production ... Silicon hyperscalar bring up and validation. **Required Skills:** Hardware Systems Engineer, NPI AI Responsibilities:…
    Meta (04/24/25)
  • Hardware Systems Engineer

    Meta (Menlo Park, CA)
    …based approach to the new product introduction (NPI) phase. **Required Skills:** Hardware Systems Engineer, AI NPI Responsibilities: 1. Drive and execute ... services, and data center operations teams to enable new systems that will be deployed in our production ... strategy (hardware and software), with a focus on various AI/HPC hardware systems in datacenter applications. 2.…
    Meta (05/07/25)
  • AI/HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Lead ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially ... interconnect with minimal latency. To improve performance of these systems we constantly look for opportunities across stack: network…
    Meta (04/20/25)
  • AI/HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially ... a loss-less fabric interconnect. To improve performance of these systems we constantly look for opportunities across stack: network…
    Meta (03/05/25)
  • Lead AI Engineer

    Capital One (San Francisco, CA)
    Lead AI Engineer At Capital One, we are ... - scalability, cost, latency, throughput - of large scale production AI systems. + Contribute to ... latest AI research and AI systems, and judiciously apply novel techniques in production ... regularly worked. Cambridge, MA: $193,400 - $220,700 for Lead AI Engineer McLean, VA: $193,400 - $220,700…
    Capital One (05/03/25)
  • Machine Learning Engineer III, FAR…

    Amazon (San Francisco, CA)
    …team, you'll be instrumental in transforming cutting-edge research into high-performance production systems. You'll collaborate directly with scientists to ... Peter Chen to make breakthrough foundation models run at production scale. As a Senior Machine Learning Engineer ... We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems…
    Amazon (03/22/25)
  • Machine Learning Engineer, AI

    Cisco (San Francisco, CA)
    …learning technologies. The ideal candidate will help build and maintain scalable AI systems while ensuring robust deployment and operational excellence. ... part of our journey! **Role** As the Machine Learning Engineer, AI Platform in the Splunk ... Engineers and Applied Scientists to build efficient model serving systems + Monitor system performance and implement improvements for…
    Cisco (03/21/25)
  • Software Engineer (Technical Leadership)…

    Meta (Menlo Park, CA)
    …this area and implementing them towards product needs. **Required Skills:** Software Engineer (Technical Leadership) - AI Specialist Responsibilities: 1. Help ... using AI/ML approaches 13. Experience in applying research to production problems 14. Experience communicating research for public audiences of peers 15.…
    Meta (04/02/25)
  • Production Systems Engineer

    Meta (Menlo Park, CA)
    …validation, supporting customer deployment, production issue triage. **Required Skills:** Production Systems Engineer, Cooling & Power Responsibilities: ... **Summary:** Meta is seeking a Systems Engineer to join our Release to Production ... scaling and deployment challenges requires us to take a systems based approach to AI system bring…
    Meta (04/20/25)
  • AI/HPC Network Engineer

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Network Engineer Responsibilities: 1. Design, develop, test and ... operate networking systems to support large scale AI training ... more. 5. Be oncall to learn from real world production challenges and take the lessons to improve current…
    Meta (05/08/25)
  • Research Engineer, Language - Monetization…

    Meta (Menlo Park, CA)
    …and drive SOTA research across the Monetization organization. **Required Skills:** Research Engineer, Language - Monetization AI Responsibilities: 1. Develop and ... **Summary:** We are the Monetization Ranking AI Research organization, dedicated to delivering personalized ads that maximize both user utility and advertiser value.…
    Meta (05/06/25)
  • Software Development Engineer, Frontier…

    Amazon (San Francisco, CA)
    …team, you'll be instrumental in transforming novel research into high-performance production systems. You'll collaborate directly with scientists to optimize ... where you'll contribute to breakthrough foundation models run at production scale. As a Software Development Engineer ... We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems…
    Amazon (03/04/25)
  • Lead Software Engineer with AI/ML…

    Cisco (San Francisco, CA)
    …software engineering, Dev/Ops or related fields. * Experience in building large scale AI models and systems using MLOps practices; including deep learning models ... and accessible for everyone. As the Lead Machine Learning Engineer for Meraki Assurance, you will drive the engineering ... Jenkins, Terraform Bonus points: * Experience building scalable and production ready Generative AI solutions * Have…
    Cisco (04/18/25)
  • Principal Machine Learning Engineer

    Cisco (San Francisco, CA)
    …(e.g., Docker, Kubernetes). + Experience with model deployment and serving into production environments + Knowledge of version control systems, especially Git. ... become a part of our journey! **Principal Machine Learning Engineer (MLE), Artificial Intelligence** Join us as we pursue ... group, you will be responsible for developing the core AI/ML capabilities to power the entire Splunk product portfolio…
    Cisco (04/13/25)
  • Software Engineer, SystemML - AI

    Meta (Menlo Park, CA)
    …space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - AI Networking Responsibilities: 1. Tech-leading the ... learning/deep learning domains: Distributed ML Training, GPU architecture, ML systems, AI infrastructure, high performance computing, performance optimizations,…
    Meta (04/22/25)
  • Software Engineer, SystemML - AI

    Meta (Menlo Park, CA)
    …space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - AI Networking Responsibilities: 1. Enabling reliable ... learning/deep learning domains: Distributed ML Training, GPU architecture, ML systems, AI infrastructure, high performance computing, performance optimizations,…
    Meta (03/21/25)
  • Staff Software Engineer, ML…

    Snap Inc. (Palo Alto, CA)
    …always execute with privacy at the forefront. We're looking for a Staff Software Engineer, Machine Learning Infrastructure to join the AI Training Platform (aka ... evaluation, and inference in the cloud such as Vertex AI, Google Kubernetes Engine (GKE) and SageMaker + Build ... industry software engineering experience + Experience building large scale production machine learning systems or data pipelines…
    Snap Inc. (04/24/25)