Add downtime incident log #530
digital-land/technical-documentation#53
Outage - Submit Service - 2024-10-08
In attendance
Description
The outage affected the Submit Service, causing users to experience 502 and 503 errors, with the landing page becoming unresponsive. The root cause was identified as a bug in handling undefined organisation IDs, which led to server crashes when users refreshed the error page. The issue persisted until a fix was developed and deployed.
Running log
Postmortem
The outage was caused by a bug in handling undefined organisation IDs, leading to a "Cannot set headers after they are sent to the client" error. This was compounded by a flaw in the parallel middleware feature, which caused server crashes whenever users refreshed the error page. Once the root cause was identified, a fix was developed to properly handle undefined organisation IDs, preventing the crash from occurring. The fix was tested locally, reviewed, and deployed in a staggered manner to restore the service without further disruption. Traffic was fully restored within an hour of the initial outage.
Actions to Prevent Similar Incidents in the Future
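For illustration only, here is a minimal sketch of one common way the "Cannot set headers after they are sent to the client" error arises when an undefined organisation ID is not handled with an early return. This is not the Submit Service's actual code: `SketchResponse`, `createResponse`, `buggyHandler` and `guardedHandler` are hypothetical names, and the response object is a stand-in that mimics an Express-style response rather than using a real framework.

```typescript
// Hypothetical stand-in for an Express-style response object (assumption).
interface SketchResponse {
  headersSent: boolean;
  statusCode: number;
  body: string;
  send(status: number, body: string): void;
}

function createResponse(): SketchResponse {
  return {
    headersSent: false,
    statusCode: 0,
    body: "",
    send(status: number, body: string): void {
      if (this.headersSent) {
        // Mirrors the error message quoted in the postmortem.
        throw new Error("Cannot set headers after they are sent to the client");
      }
      this.headersSent = true;
      this.statusCode = status;
      this.body = body;
    },
  };
}

// Buggy shape: an error response is sent for a missing ID,
// but execution falls through and a second response is attempted.
function buggyHandler(orgId: string | undefined, res: SketchResponse): void {
  if (orgId === undefined) {
    res.send(400, "Missing organisation ID");
    // Missing `return` here is the (assumed) defect.
  }
  res.send(200, `Organisation ${orgId}`);
}

// Guarded shape: return immediately after responding to the undefined case.
function guardedHandler(orgId: string | undefined, res: SketchResponse): void {
  if (orgId === undefined) {
    res.send(400, "Missing organisation ID");
    return;
  }
  res.send(200, `Organisation ${orgId}`);
}
```

In this sketch, calling `buggyHandler(undefined, createResponse())` throws on the second `send`, while `guardedHandler` responds exactly once. In a real Node.js server, an uncaught error of this kind inside middleware can crash the process, which would be consistent with the crashes described above.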
Slack thread: https://tpximpact.slack.com/archives/C077QAHM8TB/p1728406581259469