I am pleased to welcome our newest Heavybit member, Jeli, as they usher in an entirely new era for Incident Management and Analysis. Jeli enables organizations to reduce downtime and improve both security and performance by capturing critical incident data and allowing developers to analyze effects, uncover hidden issues and patterns, and then categorize and resolve problems so they don’t occur again.
We have come a long way in the 15 years since I first created the Incident Management program at Amazon and implemented early parts of what is now called Chaos Engineering. PagerDuty, one of our first Heavybit alumni, subsequently built on and expanded these early ideas and mainstreamed much of the Incident Management category as we know it today.
While products from companies like Atlassian, PagerDuty, and Datadog now do a great job of detecting problems and enabling teams to respond to them, that is only one part of the larger Incident Management cycle. After an incident is resolved, the best companies systematically review and analyze what happened, identify the sources of problems, and work to fix them in a process called a Postmortem or After-Action Review.
Despite the many advances in other areas and the truly immense value of this information to organizations, even the best teams today rely on homegrown tools and time-consuming, ad hoc workflows to manually review this data using a combination of Slack, JIRA, and Google Docs. For years, practitioners have wanted better Incident Analysis tools, and while some progress has been made, something essential has been missing… until now.
Jeli is a startup born from the unique understanding that founder Nora Jones and her team gained developing resilient systems and implementing Incident Management & Analysis at companies like Slack and Netflix, as well as from the Learning from Incidents in Software community. Jeli is the first truly dedicated Incident Analysis platform. It automatically captures and coalesces incident data and transcripts, and provides collaboration and analytics tools to understand what causes incidents and how to prevent them in the future.
Jeli is a remote-first company with employees currently across the US. They recently announced the product at SRECon and are working with a number of early partners including Indeed. For more info or to apply for early access, visit jeli.io.
Nora Jones | CEO | LinkedIn
Nora is the Founder and CEO of Jeli and a cofounder of the Learning from Incidents in Software community. She’s passionate about resilient software, people, and the intersection of those two worlds, and was previously the Head of Chaos Engineering and Human Factors at Slack. She co-wrote the O’Reilly book Chaos Engineering: System Resiliency in Practice while working at Netflix, and in 2017 she keynoted AWS re:Invent, speaking to an audience of over 40,000 people about the technical benefits of and business case for implementing chaos engineering.