We were concerned about the ratio of noise to signal in the decade-old Conficker sinkhole and doubted whether we were using the correct algorithm to generate sinkhole domains. We used 2020 DITL data to confirm that one algorithm was more likely to get hits than the other.
When we saw many more hits for unregistered domains than for registered ones, we wondered whether our time-based DGA was working. The two algorithms shared 60% of their names but diverged on the rest because of errors in the math library implementation. On the 40% of names unique to each DGA, one algorithm showed many more hits than the other. Running active malware would let us log hits from the infected server directly, but standing up an infected server is questionable. We're glad we had access to root data.
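The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code we ran: the function name `compare_dga_hits`, the sample domain names, and the query list are all hypothetical, and it assumes the candidate names and the observed query names are available as plain strings.

```python
from collections import Counter


def compare_dga_hits(names_a, names_b, observed_queries):
    """Count observed queries for names shared by, and unique to, two DGA outputs.

    All inputs are hypothetical: names_a and names_b are the domain lists
    produced by two candidate DGA implementations for the same dates, and
    observed_queries is an iterable of query names taken from a capture.
    """
    def normalize(name):
        # DNS names are case-insensitive and may carry a trailing root dot.
        return name.lower().rstrip(".")

    set_a = {normalize(n) for n in names_a}
    set_b = {normalize(n) for n in names_b}

    # Tally how often each queried name was seen.
    hits = Counter(normalize(q) for q in observed_queries)

    def total(names):
        # Counter returns 0 for names that were never queried.
        return sum(hits[n] for n in names)

    return {
        "shared": total(set_a & set_b),   # names both algorithms generate
        "only_a": total(set_a - set_b),   # names unique to algorithm A
        "only_b": total(set_b - set_a),   # names unique to algorithm B
    }


# Made-up sample data for illustration only.
names_a = ["abc.example", "shared1.example", "qrs.example"]
names_b = ["xyz.example", "shared1.example", "tuv.example"]
queries = [
    "ABC.example.", "abc.example", "shared1.example",
    "qrs.example", "shared1.example.", "unrelated.example",
]

result = compare_dga_hits(names_a, names_b, queries)
print(result)
```

If one algorithm's unique names collect far more queries than the other's, as happens for algorithm A in this toy data, that suggests it matches what infected hosts actually generate.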