- UKG, Inc. (Lowell, MA)
- …united by purpose, inspired by you. About the Team: We are seeking a Principal Observability and Reliability Tooling Engineer to lead cost-effective ... you will play a crucial part in enhancing our observability framework, ensuring robust monitoring and alerting practices that ... as part of total compensation. Information about UKG's comprehensive benefits can be reviewed on our careers site at… more
- JPMorgan Chase & Co. (Chicago, IL)
- …in a realm tailored for top achievers in site reliability. As a Lead Site Reliability Engineer at JPMorgan Chase within the CIB - Global Banking, you ... advice and mentoring to other engineers. Job responsibilities Demonstrates and champions site reliability culture and practices and exerts technical influence… more
- Alchemy (New York, NY)
- …of experience as an Infrastructure Engineer focused on reliability (e.g., Site Reliability Engineer, Production Engineer, Platform Engineer). ... incident management, conducting root cause analyses and driving continuous reliability enhancements. Develop observability frameworks using Prometheus, Grafana,… more
- Nayya (New York, NY)
- …wealth for all. About the Role We are looking for a passionate and driven Senior Site Reliability Engineer (SRE) to join our growing engineering team at ... pipelines and minimal downtime. Qualifications 5+ years of professional experience in Site Reliability Engineering, DevOps, or related roles, ideally at a… more
- Kelly Mitchell (Irving, TX)
- Job Summary: Our client is seeking a Site Reliability Engineer to join their team! This position is located in Irving, Texas. Duties: Run the production ... resolve issues across the stack Build tools and automation to improve observability, deployment, and incident response Drive reliability best practices across… more
- The Judge Group Inc. (Dallas, TX)
- Our client is currently seeking a Site Reliability Engineer - Senior We're looking for a Staff Site Reliability Engineer (SRE) to join our team ... for best practices, we encourage you to apply! Responsibilities Practice a Site Reliability Engineering mindset, solving problems through automation,… more
- Rogo (New York, NY)
- …and collaboration skills. Bonus Experience with MLOps monitoring and observability. Experience with PostgreSQL, Elasticsearch, and vector databases such as ... platforms like Google Cloud Platform (GCP). Experience with distributed tracing and observability tools. Who You Are You thrive in fast-paced environments. You are… more
- Roblox (San Mateo, CA)
- …service discovery, secrets management and related software layers. We're looking for skilled Site Reliability Engineers with strong programming skills to help us ... Roblox's private cloud, productionize our growing Kubernetes-based infrastructure, and institute reliability best practices across the Roblox Compute team. You Will:… more
- Celonis (Redwood City, CA)
- …and resilience of our platform. The team applies advanced software engineering and Site Reliability Engineering (SRE) principles to drive system reliability, ... Role Join a highly technical, collaborative, and innovation-driven team that blends Site Reliability Engineering with modern Software Engineering practices to… more
- Palo Alto Networks (Reston, VA)
- …are robust and performant. This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure ... Platform stack includes Terraform, Kubernetes, GitLab CI/CD, GitOps, Prometheus, Grafana, Loki, Docker, GCP, Backstage, MySQL, PagerDuty, FireHydrant, Python, Bash, Java, NodeJS and Go. Your Impact Design, build, and operate reliable, secure Cloud… more
- Tesla Motors (Palo Alto, CA)
- …range of responsibilities and impact. If you're a highly self-motivated software engineer with a passion for driving infrastructure, security and reliability, ... prototyping by development teams, while ensuring the highest levels of reliability and availability Drive the migration of large-scale, distributed fleet… more
- ServiceNow (San Diego, CA)
- …Description It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to ... highly technical engineers who are tasked with maintaining and developing the reliability, scalability and performance of the ServiceNow infrastructure. The SRE is… more
- Apolis (Plano, TX)
- …from cybersecurity threats. Optimize CDN configurations to enhance application speed and reliability. 2. CDN Control Plane Assist client in developing and automating ... with on-premises technologies to create a cohesive control environment. 3. CDN Observability Implement client's tool for observability to monitor and analyze… more
- General Motors (Roswell, GA)
- …respective innovation centers three times per week. **The Role:** The Software Engineering Site Reliability Engineer (SRE) is responsible for ensuring the ... health. + Participate in on-call engineering duty to support production. + Instill Site Reliability best practice through automation, data insights, and… more
- MongoDB (New York, NY)
- …VictoriaMetrics, Splunk, QuickWit, Jaeger, Fluentbit, and Vector. In addition to owning our observability infrastructure, as an Engineer on the team, you'll also ... to build next-generation, AI-powered applications. **Team and Role Overview** The SRE Observability team is part of the larger Platform Engineering organization, and… more
- General Motors (Roswell, GA)
- …health. + Participate in on-call engineering duty to support production. + Instill Site Reliability best practice through automation, data insights, and ... future for generations to come. In this SRE SW Engineer role, you will develop and maintain key elements...to analyze and provide inputs in architecture, infrastructure resources, observability to achieve reliability and scalability goals.… more
- Regions Bank (Hoover, AL)
- …logging into the careers section of the system. **Job Description:** At Regions, the Site Reliability Engineer is responsible for ensuring the dependability ... Solid familiarity with Splunk, Elastic, OpenSearch, Prometheus, Grafana + Implementing Site Reliability Engineering (SRE) principles SLO/SLI + Experience… more
- Cisco (VA)
- …fun, and most significantly to each other's success. The Splunk Observability Cloud provides full-fidelity monitoring and fixing across infrastructure, applications, ... applications with low operational burden by handling and improving the reliability and resiliency of SRE-managed services and infrastructure. You thrive on… more
- Palo Alto Networks (Santa Clara, CA)
- …engineer with a passion for technology and a strong motivation for high reliability at the service level + Observability Tools: High proficiency with Thanos, ... including the design, implementation, and continuous enhancement of our comprehensive observability systems. To meet the opportunities that such a role provides,… more
- Medtronic (Mounds View, MN)
- …in a more connected, compassionate world. **A Day in the Life** Principal Observability Engineer Careers That Change Lives Transforming Patient Management with ... where you can thrive. We are seeking a Principal Observability Engineer to lead the design, automation,..., mentoring a growing team of engineers, and driving reliability through actionable insights into our systems and applications.… more