Nov 29 – 30, 2021
OARC 36 Day 1 - begins 14:00 UTC Today 29 November

Slack’s DNSSEC Rollout: Third Time’s the Outage

Nov 29, 2021, 3:40 PM
Standard Presentation Online Workshop OARC 36 Day 1


Rafael de Elvira (Senior Software Engineer @ Slack)


On September 30th, 2021, Slack had an outage that impacted less than 1% of our online user base and lasted for 24 hours. The outage resulted from our attempt to enable DNSSEC, which ultimately led to a series of unfortunate events.

In this talk we'll cover our DNSSEC rollout to all Slack critical domains and the three failed attempts to enable DNSSEC on them. We'll do a deep dive into our third attempt (the Sept 30th outage), covering what was done during the outage, why we did it, and ultimately the root cause of the outage, which was a bug in the DNSSEC implementation of our cloud provider's authoritative DNS server.

Primary author

Rafael de Elvira (Senior Software Engineer @ Slack)
