This morning, Microsoft Teams was unavailable for many users for approximately two hours. This isn’t the first high-profile Teams outage this year. Last month, Teams was down for several hours due to an expired TLS certificate.
This outage, which began at about 8:30am UTC, seems to have been caused by high load from an unprecedented number of European users logging in to work remotely. Microsoft’s communication so far has been extremely terse: the company announced both the issue and its mitigation on Twitter and invited Teams admins to look up the issue by number in the Admin Center.
Even for Office 365 administrators with access to the Admin Center, not much information was available. Many admins complained that issue TM206544 didn’t show up in the Admin Center at all; those who could see it copied and pasted the information shown there, but it wasn’t particularly enlightening:
End time: Monday, March 16, 2020, at 10:35am UTC
Preliminary root cause: One of our services experienced timeouts, which impacted messaging scenarios.
Next steps: We’re reviewing our code to improve service resiliency and avoid future incidents.
It’s unclear to what degree the issue is truly resolved; Downdetector.com still shows many reported outages, and some European users are still tweeting at @MSFT365Status that the service is unresponsive.