- General Motors (Sunnyvale, CA)
- …analysis systems** to identify and categorize root causes of reliability or stability regressions. + **Integrate data pipelines** for continuous monitoring of ... Points!** + Experience with **release governance frameworks** for ML or AV systems. + Familiarity with **reliability engineering methodologies** (MTBF, FMEA, … more
- Walmart (Sunnyvale, CA)
- …household necessities **Qualifications** * 16+ years of experience in Site Reliability Engineering, Production Engineering, and Infrastructure Reliability, ... ** **What you'll do** **Location: Sunnyvale / Bentonville** **Department: Reliability Engineering / Business Reliability Engineering (BRE)** **Reports To:… more
- Google (Sunnyvale, CA)
- …also a mindset and a set of engineering approaches to running better production systems - we build our own creative engineering solutions to operations problems. ... Senior Software Engineer, Site Reliability Engineering...that Google's services - both our internally critical and our externally-visible systems - have reliability and uptime appropriate to users'… more
- NVIDIA (Santa Clara, CA)
- …NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and availability using the combination of ... and a set of engineering approaches to running better production systems and optimizations. Much of our...systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability… more
- NVIDIA (Santa Clara, CA)
- …and secure production environments. We are seeking a deeply skilled Senior Staff Site Reliability Engineer (SRE) to advance our enterprise security ... position requires a strong software engineering background, but focuses on reliability, scalability, and operational excellence. A strong candidate excels in… more
- LinkedIn (Sunnyvale, CA)
- …and/or knowledge of TCP/IP and network programming. + Experience with Linux operating systems and troubleshooting production systems at scale. Suggested ... lead, driving craftsmanship through code reviews, and driving the reliability of our systems through complex key technology stack migrations. Responsibilities +… more
- NVIDIA (Santa Clara, CA)
- …boundaries and geographies. + 5+ years of experience administering large-scale production systems. 3+ years of experience in high-availability Internet, ... engineers to design, develop and implement a global, dynamic, innovative Service Reliability Operations Center, to provide extraordinary levels of support for our… more
- Amazon (Cupertino, CA)
- …cutting AI platforms for the world's largest Cloud Services provider. As a Senior Reliability Engineer you will engage with an experienced cross-disciplinary ... * You will have a fundamental understanding of Reliability statistics/Reliability tests and/or solid understanding of computer systems to influence design… more
- NVIDIA (Santa Clara, CA)
- …and board designers, software/firmware engineers, HW/SW applications engineering, process/reliability specialists, DFx engineers, ATE engineers, product managers, ... management techniques all the way from feature definition to production; working with multi-functional teams. + Correlate silicon behavior...we need to see: + MS in EE, CE, Systems Engineering (or equivalent experience) + 4+ years of… more
- Celestica (San Jose, CA)
- …managing field risk, and driving continuous improvements. A strong background in the reliability of complex electronic systems and their components is essential. ... Functional Area: Quality (QUA) Career Stream: Global Supplier Quality (GSQ) Role: Senior Manager (SMG) Job Title: Senior Manager, Global Supplier Quality… more
- Cadence Design Systems, Inc. (San Jose, CA)
- …on the world of technology. We are seeking a highly skilled and experienced AI Systems Engineer to join our team. This is a hands-on, senior individual ... be responsible for the entire lifecycle of our AI systems, from architecting and building high-performance GPU clusters to...solutions, and networking to ensure optimal performance, scalability, and reliability for all our AI workloads. + Cloud AI… more
- NVIDIA (Santa Clara, CA)
- …is a discipline that involves designing, building, and maintaining large-scale production systems with high efficiency and availability. It encompasses ... that our internal and external-facing GPU cloud services meet reliability and uptime goals as promised to the users...performance tuning, and improving the efficiency of storage and production systems. Since Production Engineers… more
- Snap Inc. (Palo Alto, CA)
- …areas of improving platform reliability, operational stability and performance of production systems + Strong proficiency with Python and SQL, and experience ... internal tools or developer platforms to improve engineering velocity and system reliability. + Experience managing production systems, reliability… more
- General Motors (Mountain View, CA)
- …leading senior engineers and/or managers. + Demonstrated success delivering production-grade software on modern SoC-based or embedded systems. + Proven ... OS, multimedia, connectivity, and core system services. What You'll Do As **Senior Software Engineering Manager, Compute Systems Software**, you'll lead… more
- Google (Sunnyvale, CA)
- Senior Hardware Systems Design Engineer, Board and Systems, Google (Sunnyvale, CA, USA) **Mid** Experience driving progress, solving ... and driving technical roadmaps. **About the job** As a Senior Hardware Engineer, you will work on ML/AI hardware...ensure that designs are manufacturable and ready for volume production and with field teams to support systems… more
- Ford Motor Company (Palo Alto, CA)
- …Validation Engineer, you will play a critical role in ensuring the quality and reliability of our automotive systems. You will be responsible for designing, ... and maintain complex automotive system level test benches to validate automotive systems with a focus on automation. + Develop comprehensive verification and… more
- NVIDIA (Santa Clara, CA)
- …right in the center of this revolution. We are seeking a motivated Senior Systems Software Engineer to join our Autonomous Vehicle Infrastructure organization, ... built. From healthcare research applications to autonomous vehicles, or voice-recognition systems, the need for advanced perception and cognitive capabilities is… more
- NVIDIA (Santa Clara, CA)
- …is right in the center of this revolution. We are seeking a motivated Senior Systems Software Engineer to join our AV Infrastructure organization and become ... Demonstrated technical leadership and architectural impact on critical, large-scale distributed systems deployed in production. + Experience building developer… more
- NVIDIA (Santa Clara, CA)
- …role, you will design next-generation testing methodologies that ensure the performance, reliability, and integrity of pioneering GPU server systems used in ... datacenter products are the engines powering this transformation. We seek a Senior Test Development Engineer to join our Silicon Solutions Architecture Development… more
- NVIDIA (Santa Clara, CA)
- …Linux kernel features. + Proficiency in diagnosing, fixing, and optimizing distributed systems and containers under real production constraints. + Excellent ... design and development. + Understanding of performance, security and reliability in complex distributed systems. Ways to stand out from the crowd: + Experience… more