On June 27th, Heavybit member company Librato hosted their monthly SF Metrics Meetup at our San Francisco Clubhouse. The event featured two great talks from experts J Paul Reed and James Cunningham. Sign up here to attend the next event in person.
Detecting Whispers in Chaos
J Paul Reed, Managing Partner at Release Engineering Approaches
In this talk, we look at what decades of research in the safety sciences have to say about humans interacting with and operating complex socio-technical systems, including what aircraft carriers have to do with Internet infrastructure operations, how resilience engineering can help us, and the use of heuristics in incident response. All of these provide insight into ways we can improve one of the most advanced, and most effective, monitoring tools we have available to keep those systems running: ourselves.
Learn more about Paul here: http://jpaulreed.com/
Vetting your Pager
James Cunningham, Operations Engineer at Sentry
Sentry (sentry.io) receives a million requests a minute to process and store crashes from all around the world. The Operations Team is responsible for making sure everything goes right, but also for not burning themselves out when things go wrong.
Sentry collects fifty thousand custom metrics inside of DataDog, but only alerts on fewer than fifty of them. James leads Sentry’s observability initiative, creating and maintaining those alerts.
Learn about the lifecycle of an alert at Sentry, including:
- How a variety of metrics are collected efficiently
- How Sentry justifies a metric’s degree of accuracy
- Why a metric’s logical purpose is defined
- How an alert evolves from metrics, articulating its existence
- When an engineer actually gets paged and what they’re instructed to do