- Paycom Online (Oklahoma City, OK)
- …give and receive concrete feedback.** + **Experience in deploying and scaling containerized, distributed software and AI systems using tools such as ... in "traditional" NLP tools** + **Experience in SOA, Modular Monolith Architecture, and distributed systems for AI training and inference** + **Familiarity… more
- Jobleads-US (San Francisco, CA)
- …industry-leading unified DataOps platform powered by Apache Airflow(R). Astro accelerates building reliable data products that unlock insights, unleash AI value, ... empowers data teams to bring mission-critical software, analytics, and AI to life and is the company behind Astro,...You'll play a pivotal role in shaping the culture, systems, and programs that empower our teams to perform… more
- Jobleads-US (San Jose, CA)
- At Bloom Energy, our vision for a world powered by clean, reliable, and affordable energy is more than just a dream; we're making it reality. For over two decades, ... in a rapidly digitizing, energy-intensive world. From revolutionizing power for AI-driven data centers to ensuring resilience for hospitals, electric grids,… more
- DataRobot (Seattle, WA)
- **Job Description:** DataRobot delivers AI that maximizes impact and minimizes business risk. Our platform and applications integrate into core business processes so ... teams can develop, deliver, and govern AI at scale. DataRobot empowers practitioners to deliver predictive...shared ownership of our platform and aim to build systems that are resilient, observable, and require minimal intervention.… more
- Home Depot (Atlanta, GA)
- …experience, including significant responsibility for architecting and delivering large-scale, distributed systems in a retail or operational environment. ... cloud infrastructure and leveraging cloud-native services. + Practical experience applying AI/ML concepts to operational systems, including integrating machine… more
- Robert Half (Fresno, CA)
- …backend services using TypeScript and Python. Architect performant, secure, and reliable systems to support rapid business growth. Collaborate with Product, ... expertise in TypeScript and Python. Proven success building and scaling high-performance distributed systems. Strong understanding of modern DevOps pipelines,… more
- Signature Aviation (Orlando, FL)
- …ground handling, or FBO workflows is a plus. + Background in deploying AI-powered or predictive systems into frontline environments. **Tech Stack & Skill ... for payment data. + **Observability:** Dynatrace, Splunk, Azure Monitor, Prometheus. + **AI Enablement:** Architecting systems to integrate with AI/ML… more
- Zscaler (Short Hills, NJ)
- …performant, reusable, and extensible + Designing and implementing scalable, high-availability distributed systems, microservices, and APIs (RESTful and SDKs) ... Docker and Kubernetes + Proven expertise in designing, developing, and deploying scalable distributed systems with technologies such as Kafka, Redis and Mongo +… more
- SLAC National Accelerator Laboratory (Menlo Park, CA)
- …data pipelines and propose solutions that leverage emerging technologies. + Experience deploying reliable data systems and data quality management. + Ability to ... such as particle physics, astrophysics, materials science. The Scientific Computing Systems (SCS) division within the Technology and Innovation (TID) Directorate at… more
- NVIDIA (Santa Clara, CA)
- …you will work with internal teams and external partners to integrate distributed systems, manage large-scale data pipelines, and operationalize next-generation ... pipelines using Go, Python, Bash, and Bazel to ensure reproducibility, efficiency, and reliable distributed execution. + Integrate simulation and drive logs (e.g.… more
- Rubrik (Palo Alto, CA)
- …/Kernel or Networking domain + Strong fundamentals in data structures, algorithms, and distributed systems design + Strong background in Systems Programming ... and CTO, our mission is to build a highly reliable, secure, and scalable software-defined platform. We are the...Go, and either C++, Java, or Scala + Large distributed systems design and development experience is… more
- NVIDIA (Austin, TX)
- …from the crowd: + Technical competency in managing and automating large-scale distributed systems independent of cloud providers. Advanced hands-on experience ... part of a DGX Cloud team responsible for production systems that enable large scalable GPU clusters to be...Bright Cluster Manager) + Proven operational excellence in maintaining reliable and performant AI infrastructure. NVIDIA is… more
- LinkedIn (Mountain View, CA)
- …such as Scala or other relevant coding languages + Hands-on experience developing distributed systems or other large-scale systems. Preferred Qualifications ... in production. Why join us: If you're passionate about **AI infra, scalable evaluation systems, or model...Beam, Spark etc., feature engineering, + Experience with search systems or similar large-scale distributed systems… more
- Microsoft Corporation (Redmond, WA)
- …healthcare, economics, and the environment. Are you passionate about building the future of reliable, large-scale cloud and AI systems? The **Systems ... Interns to tackle cutting-edge challenges at the intersection of distributed systems, AI systems...letter. **Preferred Qualifications** + Experience of building scalable and reliable systems. + Demonstrated ability to develop… more
- NVIDIA (Santa Clara, CA)
- …design, or enterprise platform engineering. + Deep expertise in architecting large-scale distributed systems with a focus on reliability, performance, and ... record of publishing technical papers, architecture patterns, or thought leadership in AI systems . + Knowledge of observability tools, telemetry dashboards, and… more
- NVIDIA (Santa Clara, CA)
- …of NVIDIA's AI infrastructure stack. + Stay current with advances in distributed systems, large-scale computing, and AI frameworks to help shape ... and development at unprecedented scale. You will work on distributed systems, large-scale storage and compute orchestration,...NVIDIA to architect reliable, efficient, and secure systems that underpin our Managed AI Research… more
- NVIDIA (Santa Clara, CA)
- …and inference more reliable, scalable, and efficient. If you're passionate about AI, distributed systems, and high-performance computing, we want to hear ... driving down cluster downtime towards zero, ensuring that our AI systems remain robust and reliable...detection. + Hands-On Coding & Optimization: Contribute to large-scale distributed systems with high-quality, production-level C++ and… more
- NVIDIA (Santa Clara, CA)
- …to encouraging an inclusive and diverse workplace. + Hands-on experience developing large-scale distributed systems Ways to stand out from the crowd: + Strong ... orgs to build products that use LLMs and agent systems to serve the needs of NVIDIA engineering teams....the product/team. + Develop and execute strategies for scalable, reliable, and secure AI infrastructure supporting both… more
- Walmart (Sunnyvale, CA)
- …build dynamic, context-aware systems. 2. **Architecture & Scalability:** + Architect scalable, distributed AI systems with a focus on performance, fault ... to lead the design, development, and deployment of advanced AI systems. This role involves architecting scalable...Walmart GTP, you will be building highly scalable and reliable APIs, services and applications which will drive the… more