  • Senior AI-HPC Storage…

    NVIDIA (Santa Clara, CA)
    …in any of the leading Cloud environment [AWS, Azure or GCP] + Experience with AI/HPC cluster job schedulers such as SLURM, LSF + In depth understanding of ... and RDMA + Background with Software Defined Networking and AI/HPC cluster networking + Familiarity with deep...are growing fast. If you're a creative and autonomous engineer with real passion for technology, we want to… more
    NVIDIA (09/18/24)
  • Senior HPC Systems Engineer

    General Dynamics Information Technology (Fairfax, VA)
    …Yes **Job Description:** At GDIT, people are our differentiator. Our work depends on a Senior HPC Systems Engineer joining our team to support the National ... Obtain:** None **Job Family:** Systems Engineering **Skills:** High-Performance Computing (HPC) Systems, Linux System Administration, Systems Management **Certifications:** None - N/A… more
    General Dynamics Information Technology (07/14/24)
  • Senior GPU Cluster Software…

    NVIDIA (Santa Clara, CA)
    …+ Background in running and instrumenting distributed LLM training on a multi-GPU HPC cluster + Knowledge of LLM training features and libraries - Checkpointing, ... working with distributed system software architecture + Basic understanding of HPC GPU cluster, Slurm + Basic understanding of Machine learning concepts and… more
    NVIDIA (08/13/24)
  • Linux HPC Administrator

    Federal Reserve Bank (Kansas City, MO)
    …skills and experience. + Experience supporting and administering a High Performance Compute (HPC) cluster and its components: Red Hat Linux, SLURM, IBM Spectrum ... Federal Reserve System. Our services include multiple high performance computing (HPC) environments, research data warehousing and curating services, and endpoint… more
    Federal Reserve Bank (09/17/24)
  • Solutions Architect - InfiniBand and HPC

    NVIDIA (CA)
    NVIDIA is looking for a Senior HPC Engineer to join its Professional Services team. Academic, commercial and government groups around the world are using ... the team building many of the largest and fastest AI/HPC systems in the world! NVIDIA is looking for...equivalent experience. Ways to stand out from crowd: + Cluster management technologies knowledge (bonus credit for BCM (Base… more
    NVIDIA (08/20/24)
  • Senior Scientific Computing Support…

    Penn Medicine (Philadelphia, PA)
    …performance parallel file systems. + Assist end users running applications on the HPC cluster. + Provide leadership and solutions for complex research computing ... Location: Remote Hours: (must live reasonable distance from location) M-F, Daylight The Senior Scientific Computing Support Engineer position will work in… more
    Penn Medicine (07/05/24)
  • Senior Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …the choice, join our diverse team today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design and implementation of ground ... for our GPU Compute Clusters. As a Site Reliability Engineer, you will help us with the strategic challenges...fixing problems before they occur + Building automation for Cluster bring up and scaled up operation. + Improving… more
    NVIDIA (09/12/24)
  • Senior Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …artificial intelligence. Join our team at NVIDIA as a Senior Site reliability engineer focused on HPC storage and play a crucial role in designing, ... Familiarity with newer and emerging monitoring products. + Prior Experience with HPC cluster management tools such as Slurm, PBS, LSF, etc. + Experience with… more
    NVIDIA (08/30/24)
  • Senior Systems Engineer - Storage

    Dana-Farber Cancer Institute (Boston, MA)
    …work with amazing partners, including other Harvard Medical School-affiliated hospitals. The HPC Sr. Engineer, Storage will serve the Dana-Farber Cancer ... + Researches, finds and implements optimal runtime conditions for HPC workloads. + Act as a cluster scheduler power user and works with system administration to… more
    Dana-Farber Cancer Institute (08/28/24)
  • Senior Principal Systems Development…

    Dell Technologies (Austin, TX)
    **Senior Principal Systems Development Engineer** Our customers' system requirements are usually highly complex. Bringing together hardware and software systems ... the best work of your career and make a profound social impact as a Senior Principal Systems Development Engineer on our Systems Development Engineering Team in… more
    Dell Technologies (09/14/24)
  • Senior Cloud Services Software…

    NVIDIA (Santa Clara, CA)
    …seeking a distributed software engineer to join our team! As a Senior engineer, you'll be instrumental in developing and optimizing AI infrastructure ... engineers. What You Will Be Doing: As a software engineer specializing in backend development, you'll work in a...well as container technologies like Docker and Kubernetes, and HPC/AI platforms such as Slurm. Ways to stand out… more
    NVIDIA (09/18/24)
  • Senior Linux Systems Engineer

    NVIDIA (Santa Clara, CA)
    …how you can make a lasting impact on the world. We are looking for a Senior Linux Software Engineer to join the NVIDIA Applied Systems Engineering group. The ... runtimes, drivers+containers, and containerization of various high performance computing cluster software elements within a variety of environments. + Crafting,… more
    NVIDIA (08/24/24)
  • Software Development Engineer, ML…

    Amazon (Cupertino, CA)
    …of peer teams? We want to talk to you! We seek a Software Development Engineer for the Machine Learning (ML) Infrastructure team to build the tools that are used ... top performance of AWS ML and High Performance Computing (HPC) technologies developed by our organization. Bring your exceptional...Fabric Adapter (EFA). Key job responsibilities Be an autonomous engineer on a team that builds and maintains the… more
    Amazon (09/19/24)
  • Senior System Software Engineer

    NVIDIA (Santa Clara, CA)
    We are seeking a Sr System Software Engineer to help us build out our scientific computing platform on Nvidia DGX Cloud. We are building a cloud based accelerated ... shared memory and distributed memory architecture, message passing (MPI, NCCL), Cluster scalability and performance. + Hands on Debugging skills with Process,… more
    NVIDIA (09/10/24)
  • Senior Software QA Test Development…

    NVIDIA (Santa Clara, CA)
    …GPU Computing. We are passionate about markets include gaming, automotive, vision, HPC, datacenters and networking in addition to our traditional OEM business. ... integration, strong OS experience, reliability testing with various telemetries, scale out cluster, test plan development, CI/CD and DevOps experience to join our… more
    NVIDIA (09/05/24)
  • Sr. GNC DevOps Engineer (Falcon)

    SpaceX (Hawthorne, CA)
    …+ Work with SpaceX HPC team to monitor and maintain a 4000+ thread HPC cluster + Closely collaborate with GNC software engineers to create highly operable ... appropriate to the responsibilities COMPENSATION AND BENEFITS: Pay Range: GNC DevOps Engineer/Senior: $160,000.00 - $220,000.00 per year Your actual level and… more
    SpaceX (08/20/24)
  • Sr. Software Development Engineer

    Amazon (Cupertino, CA)
    …have extensive experience in low-latency networking and collective operations, such as HPC network fabric or machine learning accelerator cluster systems. Also ... Description We are seeking an experienced software engineer with low-level latency networking or interconnect expertise to optimize customer experience by designing… more
    Amazon (08/03/24)
  • Sr Distinguished Engineer, Generative AI…

    Capital One (McLean, VA)
    …to love the products and services we build. We are looking for an experienced Senior Distinguished Engineer, AI Systems, to help us build the foundations of our ... VA - McLean, United States of America, McLean, Virginia Sr Distinguished Engineer, Generative AI Systems - (Remote-Eligible) Sr Distinguished Engineer,… more
    Capital One (08/18/24)