PagerDuty’s DevOps: Avoiding a Cyber Monday Fail
Last year an estimated $7.35 billion was spent online during the Black Friday and Cyber Monday weekend. Because engineering teams are often short-staffed, with many engineers requesting the week off, the Thanksgiving weekend has the makings of a perfect storm. We caught up with PagerDuty’s DevOps Lead Arup Chakrabarti to hear his tips for managers during this peak shopping season.
In past years, Chakrabarti’s own team has actually been more responsive on Cyber Monday than during regular hours. He offers, “We reminded everyone of the importance of these shopping days and that they represented significant revenue for the entire year. This instills a sense of urgency and responsibility in everyone.”
Once teams are aware of the significance of the Thanksgiving weekend, Chakrabarti makes sure they are well prepared for what is likely to be an onslaught of requests:
- On-Call Schedules with Daily Rotation: Make sure that you have on-call schedules covered for all of your engineering teams. If you do not want someone to have to cover the entire holiday weekend, a daily rotation (instead of weekly) distributes the on-call load.
- Anticipate Traffic: Be mindful that Black Friday and Cyber Monday are major events and try to predict what your traffic pattern is going to look like. Will it be 10x, 100x, 1000x? Any engineering team that manages its operations properly will know these numbers, because they affect how you plan for these major events.
- Define Escalation Path: Have the appropriate business escalation contacts defined ahead of time. During these major events, if your systems are not performing adequately, a common tactic is to disable functionality until traffic dies down, but you need input from your business partners to make the right decisions here.
- Have a Plan: Have your incident response plan ready. Do not try to invent one on the fly when your site is down. Make sure everyone knows what is expected of them before downtime occurs.
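
To make the daily-rotation tip concrete, here is a minimal Python sketch of how a team might draft a one-engineer-per-day schedule for the holiday weekend. It is only an illustration, not how PagerDuty builds schedules: the function name `daily_rotation`, the team names, and the dates are all hypothetical.

```python
from datetime import date, timedelta
from itertools import cycle


def daily_rotation(engineers, start, end):
    """Assign one engineer per calendar day, cycling through the list in order.

    Hypothetical helper for illustration; returns a dict mapping date -> name.
    """
    if not engineers:
        raise ValueError("need at least one engineer")
    if end < start:
        raise ValueError("end must not precede start")

    schedule = {}
    rotation = cycle(engineers)  # repeats the list once everyone has had a turn
    day = start
    while day <= end:
        schedule[day] = next(rotation)
        day += timedelta(days=1)
    return schedule


if __name__ == "__main__":
    # Assumed example: a three-person team covering Thursday through Monday.
    team = ["alice", "bob", "carol"]
    for day, engineer in daily_rotation(team, date(2025, 11, 27), date(2025, 12, 1)).items():
        print(day.isoformat(), engineer)
```

With three engineers and a five-day window, nobody is on call for more than two days, and no one covers consecutive days.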
