Reduce DNS Monitoring False Positives Caused by ServFail Errors
While analyzing the DNS errors that DNS Check has logged, we noticed something interesting: 57% of them are of the “ServFail” variety, and the vast majority of those are resolved 5 minutes later, when the record is next checked.
A ServFail error occurs when there’s an error communicating with the DNS server. This could occur for a number of reasons, including an error on the DNS server itself, or a temporary networking issue.
Fortunately, most domains use multiple authoritative DNS servers, so if a short-lived ServFail issue affects one name server but not the others, DNS lookups should still work. That said, if a name server has chronic ServFail issues, we recommend investigating why. ServFail errors happen, but they should be rare.
Some of our customers use DNS Check to page them when there’s a DNS lookup failure. Some of them do want to be notified about a short-lived ServFail error, but others would rather not be paged unless the error occurs more than once in a row.
Because of this, we’ve added a new option named “Suppress first ServFail notification” to the Notification Options page to suppress isolated ServFail errors:
If this option is turned on, the DNS checker waits until a DNS record has two or more consecutive ServFail errors before notifying you about them.
If this option is turned off, the DNS checker notifies you the first time a DNS record has a ServFail error.
Other failures, such as the wrong value being returned for a DNS record, are not affected by this setting. As long as you have notifications enabled, you’ll receive a notification the first time a non-ServFail error occurs.
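To make the rules above concrete, here is a minimal Python sketch of the decision they describe. It is an illustration only, not DNS Check's actual implementation: the names RecordState, should_notify, and the SERVFAIL label are hypothetical, and the sketch only decides whether a single check result is eligible for a notification. It does not model how repeated notifications for the same ongoing failure are handled.

```python
from dataclasses import dataclass
from typing import Optional

SERVFAIL = "ServFail"  # hypothetical label for a ServFail check result


@dataclass
class RecordState:
    """Tracks the current run of consecutive ServFail results for one record."""
    consecutive_servfails: int = 0


def should_notify(state: RecordState, error: Optional[str],
                  suppress_first_servfail: bool) -> bool:
    """Decide whether one check result is eligible for a notification.

    error is None for a passing check, SERVFAIL for a ServFail result,
    or any other string for a different kind of failure.
    """
    if error != SERVFAIL:
        # A pass or a non-ServFail failure ends any ServFail streak.
        state.consecutive_servfails = 0
        # Non-ServFail failures are never suppressed; passes never notify.
        return error is not None

    state.consecutive_servfails += 1
    if suppress_first_servfail:
        # Option on: wait for two or more ServFail results in a row.
        return state.consecutive_servfails >= 2
    # Option off: notify on the first ServFail result.
    return True


state = RecordState()
results = [SERVFAIL, None, SERVFAIL, SERVFAIL, "WrongValue"]
decisions = [should_notify(state, r, suppress_first_servfail=True) for r in results]
print(decisions)  # [False, False, False, True, True]
```

In this example, the two isolated ServFail results are suppressed, the second consecutive ServFail triggers a notification, and the wrong-value failure triggers one immediately.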
You can turn this setting on or off for your account by going to the Notification Options page.