- KOHLER (Palo Alto, CA)
- MLOps Engineer - Kohler Ventures. Work Mode: Hybrid. Location: Hybrid - 3 days/week onsite in either Palo Alto, CA or NYC. Opportunity: Kohler Ventures is an independent ... multi-disciplinary team across artificial intelligence, machine learning, design, advanced software and hardware engineering, strategy, venture investments, sales, marketing,… more
- TikTok (San Jose, CA)
- …and scalability. * Experience in designing, analyzing and troubleshooting large-scale distributed systems. Job Information [For Pay Transparency] Compensation ... our TikTok users. SREs in our team keep the systems up and running with the highest level of ... through to launch reviews, deployment, operation and refinement * Deliver tools/software to improve the reliability and scalability of services,… more
- Google (Sunnyvale, CA)
- …work environment. About the job: Like Google's own ambitions, the work of a Software Engineer goes way beyond just Search. Software Engineering Managers ... Bachelor's degree or equivalent practical experience. 8 years of experience with software development in one or more programming languages (e.g., Python, C, C… more
- CyberArk (Santa Clara, CA)
- …years in a senior, architect or a technical lead role of site reliability, systems engineering or software development A deep understanding of Site Reliability, ... for any identity - human or machine - across business applications, distributed workforces, hybrid cloud workloads and throughout the DevOps lifecycle. The world's… more
- Meta (Menlo Park, CA)
- …such as cuBLAS, cuDNN, FlashAttention, training performance acceleration through hardware-software co-design. Research Engineer, SysML - FAIR Responsibilities ... understand our world. We are seeking individuals passionate in solving systems challenges to sustainably accelerate our reach to human-level intelligence. Candidates… more
- Dyna Robotics (Redwood City, CA)
- …our system is ready to support growth. High-Performance ML Computing & Distributed Systems: Manage and optimize high-performance computing resources. Develop ... in a tech lead role. Proven experience with high-performance computing environments and distributed systems. Demonstrated ability to scale ML training systems… more
- Navan (Palo Alto, CA)
- …to enhance system reliability, observability, or automation. Proven track record of operating distributed systems in AWS or other public clouds, with strong ... Are you passionate about building highly reliable and scalable systems? Do you thrive on tackling exciting challenges that ... growth? Navan is looking for a talented Site Reliability Engineer (SRE-2) to join our world-class team in the… more
- Rubrik (Palo Alto, CA)
- …services for system monitoring, detecting faults, and automatically self-healing the distributed systems + Design, develop, and operationalize high-performance, ... Computer Science or related field + 2+ years of software development experience on Linux, preferably in Platform/Systems ... domain + Strong fundamentals in data structures, algorithms, and distributed systems design + Strong background in… more
- NVIDIA (Santa Clara, CA)
- …from the crowd: + Technical competency in managing and automating large-scale distributed systems independent of cloud providers. Advanced hands-on experience ... + 5+ years in similar role and experience on large-scale production systems. Experience with common software engineering principles, tools and techniques.… more
- NVIDIA (Santa Clara, CA)
- …building the next generation of scalable AI systems. As a Senior Applied AI Software Engineer on the Dynamo project, you will address some of the most ... Go for Kubernetes controllers and operators development. + Deep understanding of distributed systems, parallel computing, and GPU architectures. + Experience… more
- Amazon (East Palo Alto, CA)
- …base. You'll bring a passion for innovation, data, search, analytics, and distributed systems. You'll also: Solve challenging technical problems, often ones ... about transforming business challenges into technological breakthroughs? Join Amazon as a Software Development Engineer (SDE) and help shape the future of… more
- NVIDIA (Santa Clara, CA)
- …and fleet management engineering. + Experience with infrastructure automation and distributed systems design developing tools for running large scale ... We are seeking Software Engineers with previous experience building and running ... more of the following: Linux, Slurm, Kubernetes, Local and Distributed Storage, and Systems Networking. Ways to… more
- Amazon (Cupertino, CA)
- …- Bachelor's degree in computer science or equivalent - Preferred previous software engineer expertise with PyTorch/JAX/TensorFlow, Distributed libraries and ... customers and raise our performance bar. You'll design fault-tolerant systems that run at massive scale as we continue ... that use them. This role is for a senior software engineer in the Machine Learning Applications… more
- Amazon (Cupertino, CA)
- …design or architecture (design patterns, reliability and scaling) of new and existing systems experience - 5+ years of full software development life cycle, ... science or equivalent - Experience in computer architecture - Previous software engineering expertise with PyTorch/JAX/TensorFlow, Distributed libraries and… more
- Amazon (Cupertino, CA)
- …design or architecture (design patterns, reliability and scaling) of new and existing systems experience - 5+ years of full software development life cycle, ... science or equivalent - Experience in computer architecture - Previous software engineering expertise with PyTorch/JAX/TensorFlow, Distributed libraries and… more
- Amazon (Cupertino, CA)
- …and the Trn1 and Inf1 servers that use them. This role is for a senior software engineer in the Machine Learning Applications (ML Apps) team for AWS Neuron. This ... and runtime engineers to create, build and tune distributed training solutions with Trn1. Experience training these large ... systems experience - 5+ years of full software development life cycle, including coding standards, code reviews,… more
- Google (Sunnyvale, CA)
- …academic or industry setting. + Experience building and supporting large scale distributed systems and infrastructure. + Familiarity with Kubernetes development, ... of experience with an advanced degree. + Experience in distributed computing or machine learning infrastructure. Preferred qualifications: + ... goes on and is growing every day. As a software engineer, you will work on a… more
- Google (Sunnyvale, CA)
- …design and architecture. + 3 years of experience developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, ... bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, ... goes on and is growing every day. As a software engineer, you will work on a… more
- Amazon (Cupertino, CA)
- …offers growth opportunities in ML infrastructure, bridging the gap between frameworks, distributed systems, and hardware acceleration. About the team: Annapurna ... Learning accelerators. This role is for a Machine Learning Engineer on one of our AWS Neuron teams: - ... The ML Inference team collaborates closely with hardware designers, software optimization experts, and systems engineers to… more
- NVIDIA (Santa Clara, CA)
- …Our data center platforms integrate CPUs, GPUs, DPUs, networking, and a full-stack software ecosystem to power AI at scale! We are seeking a highly technical ... and creative Senior Technical Marketing Engineer to join our team to showcase the innovations ... world's largest AI models. This role will focus on distributed AI model training, ensuring that customers and partners… more