  • Distributed Embodied AI

    Bosch (Sunnyvale, CA)
    …journals such as CVPR, ICRA, IROS, RSS, NeurIPS and CoRL. **Job Description** As the Distributed Embodied AI Systems intern, you will perform research on ... that take advantage of technologies in the field of reliable distributed computing. We work with internal...future prediction for latency mitigation in distributed embodied AI systems. A…
    Bosch (12/04/25)
  • Research Intern - Reliability of Cloud…

    Microsoft Corporation (Redmond, WA)
    …healthcare, economics, and the environment. Are you passionate about building the future of reliable, large-scale cloud and AI systems? The ** Systems ... Interns to tackle cutting-edge challenges at the intersection of distributed systems, AI systems...letter. **Preferred Qualifications** + Experience of building scalable and reliable systems. + Demonstrated ability to develop…
    Microsoft Corporation (11/26/25)
  • Senior Researcher - AI and Systems

    Microsoft Corporation (Redmond, WA)
    **Overview** **Help shape the future of reliable AI systems.** At Microsoft Research's AI and Systems Reliability Group (Redmond, WA), we push the ... the computing landscape. We are seeking **Senior Researcher - AI and Systems Reliability - Microsoft Research**...Systems Reliability - Microsoft Research** areas such as distributed systems and reliability, formal methods and…
    Microsoft Corporation (12/17/25)
  • Senior Technical Systems AI

    NVIDIA (Santa Clara, CA)
    …design, or enterprise platform engineering. + Deep expertise in architecting large-scale distributed systems with a focus on reliability, performance, and ... record of publishing technical papers, architecture patterns, or thought leadership in AI systems. + Knowledge of observability tools, telemetry dashboards, and…
    NVIDIA (10/16/25)
  • Software Engineer Data/AI/Intelligent…

    Cisco (San Jose, CA)
    …platforms, such as AWS, Azure, or Google Cloud. + Understanding of distributed systems concepts, including scalability, reliability, fault tolerance, and data ... Team** Our dedicated team members are building the future of Cisco's AI-driven platforms and data infrastructure, supporting innovation across the globe. You will…
    Cisco (12/01/25)
  • Principal Software Developer - OCI AI

    Oracle (Nashville, TN)
    …Work closely with a collaborative and experienced global team. - Expand your knowledge in AI, cloud computing, and distributed systems. - Contribute to one ... tools to operationalize Large Language Models (LLMs) and agentic AI systems. Our goal is to empower...will contribute to the design and implementation of scalable, distributed systems that serve LLMs and support…
    Oracle (11/25/25)
  • Senior Software Engineer, AI Resiliency

    NVIDIA (Santa Clara, CA)
    …and inference more reliable, scalable, and efficient. If you're passionate about AI, distributed systems, and high-performance computing, we want to hear ... driving down cluster downtime towards zero, ensuring that our AI systems remain robust and reliable...detection. + Hands-On Coding & Optimization: Contribute to large-scale distributed systems with high-quality, production-level C++ and…
    NVIDIA (10/15/25)
  • Principal Software Engineer- OCI AI

    Oracle (Frankfort, KY)
    …Work closely with a collaborative and experienced global team. - Expand your knowledge in AI, cloud computing, and distributed systems. - Contribute to one ... tools to operationalize Large Language Models (LLMs) and agentic AI systems. Our goal is to empower...will contribute to the design and implementation of scalable, distributed systems that serve LLMs and support…
    Oracle (12/20/25)
  • Software Developer 3 - OCI AI Platform

    Oracle (Columbus, OH)
    …. This is a highly technical, hands-on role where you'll build large-scale distributed systems, optimize AI/ML workflows, and collaborate with ... observability, CI/CD pipelines, and operational excellence. Troubleshoot complex issues in distributed systems and participate in on-call rotations as needed.…
    Oracle (11/25/25)
  • AI Applications and Innovation Engineer

    Oracle (Raleigh, NC)
    …learning, LLM applications, and agentic AI. Our team builds real-world AI systems and deploys scalable, production-ready solutions across Oracle's enterprise ... engineer to contribute to the design and deployment of advanced AI systems, including LLM-powered agents, Retrieval-Augmented Generation (RAG) pipelines,…
    Oracle (12/19/25)
  • Director of AI SRE & DevOps, AI.x

    Charles Schwab (San Francisco, CA)
    …+ Champion reliability, monitoring, observability, and operational best practices for AI systems and data pipelines. + Collaborate with cross-functional ... in the development process. You will ensure that the systems we build are robust, reliable, and...troubleshoot complex problems with ambiguous or incomplete data in distributed systems. + Curiosity about new technologies…
    Charles Schwab (12/06/25)
  • (USA) Principal, Software Engineer - AI

    Walmart (Sunnyvale, CA)
    …build dynamic, context-aware systems. 2. **Architecture & Scalability:** + Architect scalable, distributed AI systems with a focus on performance, fault ... to lead the design, development, and deployment of advanced AI systems. This role involves architecting scalable...Walmart GTP, you will be building highly scalable and reliable APIs, services and applications which will drive the…
    Walmart (11/27/25)
  • Data Engineering Manager, Amazon Leo AI

    Amazon (Redmond, WA)
    …for a Data Engineering Manager who will design, implement, and operate globally distributed systems that enable Leo to achieve low single-digit-second query ... real-time analytics layer or lakehouse, and to support agentic AI capabilities on top. You'll build these systems...user experience in real time. We combine expertise in distributed systems, data lakehouse architectures, and applied…
    Amazon (12/13/25)
  • Data Engineer II, Amazon Leo AI Foundations

    Amazon (Redmond, WA)
    …is for a Data Engineer who will design, implement, and operate globally distributed systems that enable Leo to achieve low single-digit-second query responses ... real-time analytics layer or lakehouse, and to support agentic AI capabilities on top. You'll build these systems...user experience in real time. We combine expertise in distributed systems, data lakehouse architectures, and applied…
    Amazon (12/13/25)
  • Research Scientist, AI Networking (PhD)

    Meta (Menlo Park, CA)
    …leverage our large-scale GPU training and inference fleet through an observable, reliable and high-performance distributed AI/GPU communication stack. ... Skills:** Research Scientist, AI Networking (PhD) Responsibilities: 1. Enabling reliable and highly scalable distributed ML training on Meta's large-scale…
    Meta (12/20/25)
  • Senior AI Site Reliability Engineer

    Charles Schwab (San Francisco, CA)
    …bring curiosity, creativity, and technical depth to help shape the next generation of reliable AI at Schwab. **What you have** **Required Qualifications** + 8+ ... you will play a key role in ensuring our AI solutions are reliable, scalable, and resilient, enabling...Experience implementing monitoring, alerting, and incident response for large-scale distributed systems. + Proven track record in…
    Charles Schwab (12/25/25)
  • Senior Software Architect - Generative AI

    GE Vernova (Niskayuna, NY)
    …neural network architectures (e.g., CNNs, RNNs, Transformers). + Expertise in designing scalable, distributed architectures for AI systems. + Strong experience ... Azure, GCP) and containerization (Kubernetes, Docker). + Familiarity with large-scale distributed systems and database technologies. + Experience in creating…
    GE Vernova (12/11/25)
  • Sr Machine Learning Engineer - GenAI, LLM, Agentic…

    eightfold.ai (Santa Clara, CA)
    …with opportunities. Responsibilities: + Research, design, development, and deployment of advanced AI agents and agentic systems. + Architect and implement ... About Eightfold: Eightfold is a global leader in AI-native enterprise talent platform, trusted by the world's largest...we are defining the next era of agentic talent systems. What sets Eightfold apart is not just the…
    eightfold.ai (11/07/25)
  • Principal AI Architect

    Paycom Online (Oklahoma City, OK)
    …give and receive concrete feedback.** + **Experience in deploying and scaling containerized, distributed software and AI systems using tools such as ... in "traditional" NLP tools** + **Experience in SOA, Modular Monolith Architecture, and distributed systems for AI training and inference** + **Familiarity…
    Paycom Online (12/18/25)
  • Senior Agentic AI SW Engineer

    Zebra Technologies (Holtsville, NY)
    …Knowledge of NLP, computer vision, or reinforcement learning. + Familiarity with multi-agent systems and distributed AI frameworks. + Strong communication ... processing (NLP), or computer vision. + Build, test, deploy, and maintain software and AI systems, ensuring high performance, security, and scalability. + Own a…
    Zebra Technologies (11/21/25)