We rolled out a change that broke scheduling. This was not properly tested in our staging environment due to a mis-configuration of the staging environment, hence the faulty roll-out.
We identified the issue at 10:47, 22 minutes after it started.
We pushed a fix for the issue at 11:05.
The fix was applied to production around 11:25.
Detection of the issue and rollback of the issue could have been faster, and we have action items below to address those issues.
Follow up actions items: