On August 30th at 19:10 EDT, Shadow Health Customer Support saw an increase in reports of users not receiving account verification and password reset emails.
Starting at 19:15 EDT, Shadow Health Engineering investigated the issue and found log evidence of DNS errors when resolving the API endpoint of our third-party email sending service.
At 19:45 EDT, we confirmed that DNS did not appear to be at fault and opened a discussion with the third-party service’s technical support.
At 20:28 EDT, the third-party service acknowledged a widespread outage of their email sending API. We posted an incident on our status page and a notification banner within the Shadow Health LMS.
At 21:14 EDT, the third-party service publicly acknowledged the issue with a report on their status page.
At 21:33 EDT, error rates for email sending from the Shadow Health LMS began to decline, and we confirmed that emails were being sent successfully.
On August 31st at 09:10 EDT, after reviewing the overnight logs, we determined that the issue had been resolved by approximately 21:35 EDT on August 30th. Email sending was once again fully functional.
Our third-party email service experienced an outage that it did not report for multiple hours. We do not use a backup email sending service because many institutional customers want to allow-list our email sender.
We will add improved alerting for email workflows so that Engineering is notified earlier of potential email sending issues. We will also work with the third-party email service to understand the cause of the issue in more depth and to discuss mitigation strategies.
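
As an illustration of what earlier alerting on email sending could look like, the following is a minimal sketch of a sliding-window error-rate check. It is not our actual implementation: the class name `EmailErrorRateMonitor`, the window length, the threshold, and the minimum sample count are all hypothetical values chosen for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class EmailErrorRateMonitor:
    """Hypothetical monitor: tracks send outcomes in a sliding time window."""

    window_seconds: float = 300.0   # assumed window length, not a real setting
    threshold: float = 0.2          # assumed failure ratio that triggers an alert
    min_samples: int = 20           # avoid alerting on a handful of sends
    _events: deque = field(default_factory=deque)

    def record(self, timestamp: float, succeeded: bool) -> None:
        """Store one send attempt and drop attempts older than the window."""
        self._events.append((timestamp, succeeded))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()

    def error_rate(self, now: float) -> float:
        """Fraction of failed sends within the window ending at `now`."""
        self._evict(now)
        if not self._events:
            return 0.0
        failures = sum(1 for _, ok in self._events if not ok)
        return failures / len(self._events)

    def should_alert(self, now: float) -> bool:
        """True when enough sends were seen and too many of them failed."""
        rate = self.error_rate(now)
        return len(self._events) >= self.min_samples and rate > self.threshold
```

In a setup like this, each send attempt would call `record(...)`, and a periodic job would call `should_alert(...)` and page Engineering when it returns true. The minimum sample count is there so that a single failed email during a quiet period does not trigger a page.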