On June 27th, Heavybit member company Librato hosted their monthly SF Metrics Meetup at our San Francisco Clubhouse. The event featured two great talks from experts J Paul Reed and James Cunningham. Sign up here to attend the next event in person.
In this talk, we look at what decades of research in the safety sciences have to say about humans interacting with and operating complex socio-technical systems, including what aircraft carriers have to do with Internet infrastructure operations, how resilience engineering can help us, and the use of heuristics in incident response. All of these provide insight into ways we can improve one of the most advanced, and most effective, monitoring tools we have available to keep those systems running: ourselves.
Learn more about Paul here: http://jpaulreed.com/
Sentry (sentry.io) receives a million requests a minute to process and store crashes from all around the world. It’s the Operations Team’s responsibility to make sure everything goes right, but it’s also their responsibility not to burn themselves out when things go wrong.
Sentry collects fifty thousand custom metrics in DataDog, but alerts on fewer than fifty of them. James leads Sentry’s observability initiative, creating and maintaining those alerts.
Learn about the lifecycle of an alert at Sentry, including: