DNS cache poisoning & transaction ID

I have read many articles about it, and I do not understand how cache poisoning is possible if:

  • the query is cancelled and marked as failed if the transaction ID of the response does not match (i.e. attackers cannot brute-force the transaction ID, since only the first response will get accepted)
  • the recursive resolver sends only 1 request for the same domain/type/class at a time, meaning the attackers cannot trigger many requests for the same domain to brute-force the transaction ID

In such a basic-looking configuration, how is cache poisoning possible if the attackers need to guess the transaction ID? Would the attackers force the recursive resolver to resolve the same domain again and again to get more tries? In that case it could take hours...


It's mainly your first bullet point that is inaccurate:

  • the query is cancelled and marked as failed if the transaction ID of the response does not match (i.e. attackers cannot brute-force the transaction ID, since only the first response will get accepted)

When a response with an unknown transaction ID is received, that response is discarded. But your assumption that the outstanding query is then considered failed is not true: the query stays outstanding and the resolver keeps waiting for a matching response.

Effectively, this attack scenario becomes a race for the attacker to get a valid response in before the response from the real nameserver arrives.
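
To put rough numbers on that race, here is a back-of-the-envelope sketch. It assumes the 16-bit transaction ID is the only unknown (i.e. the attacker already knows the source port) and that every spoofed response in a race carries a distinct guess. The function names and the `id_space` parameter are just mine for illustration; with source port randomization you would multiply `id_space` by the number of ports the resolver may pick from.

```python
def race_win_probability(guesses, id_space=2**16):
    """Chance that one of `guesses` distinct spoofed IDs matches in a single race."""
    return min(guesses, id_space) / id_space


def cumulative_probability(guesses, races, id_space=2**16):
    """Chance of winning at least one of `races` independent races."""
    p = race_win_probability(guesses, id_space)
    return 1 - (1 - p) ** races


# 100 spoofed responses land before the genuine one; transaction ID is the only unknown.
print(race_win_probability(100))          # 100 / 65536

# The same attack repeated over 1000 separate races.
print(cumulative_probability(100, 1000))

# Hypothetical: the source port is also unpredictable, drawn from 2**15 ports.
print(cumulative_probability(100, 1000, id_space=2**16 * 2**15))
```

The point is only that a single race is unlikely to succeed, but the odds add up quickly over many races when the transaction ID is the sole secret, and much more slowly once the source port has to be guessed as well.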

The problem with your assumption is that the whole point of the transaction ID (+ UDP source port) is to match the response to an outstanding query. When a response arrives where these values are wrong (they don't match any outstanding query), how can you tell which query you should consider failed?
And if you were to allow some form of partial match, how would you implement that in a way that doesn't replace a somewhat burdensome race for the attacker with a trivially exploitable DoS attack instead?
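
The matching logic described above can be sketched roughly like this. It is a toy model, not how any particular resolver is implemented: `ToyResolver` and its methods are made-up names, real resolvers also check things like the responder's address and the question section, and the IP addresses are documentation placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class ToyResolver:
    """Hypothetical resolver state: outstanding queries keyed by (port, txid)."""
    outstanding: dict = field(default_factory=dict)  # (port, txid) -> qname
    cache: dict = field(default_factory=dict)        # qname -> answer

    def send_query(self, qname, src_port, txid):
        # Remember which query this port/ID pair belongs to.
        self.outstanding[(src_port, txid)] = qname

    def on_response(self, dst_port, txid, qname, answer):
        """Return True if the response was accepted, False if it was discarded."""
        key = (dst_port, txid)
        if self.outstanding.get(key) != qname:
            # Nothing matches: drop the packet. No query is marked as failed,
            # because there is no way to know which query it was aimed at.
            return False
        del self.outstanding[key]
        self.cache[qname] = answer
        return True


resolver = ToyResolver()
resolver.send_query("example.com", src_port=40000, txid=0x1234)

# Wrong guesses are simply discarded; the query stays outstanding.
assert not resolver.on_response(40000, 0x0001, "example.com", "192.0.2.1")
assert not resolver.on_response(40000, 0x0002, "example.com", "192.0.2.1")
assert (40000, 0x1234) in resolver.outstanding

# The first matching response wins the race, here the attacker's.
assert resolver.on_response(40000, 0x1234, "example.com", "192.0.2.1")

# The genuine answer arrives too late and is discarded like any other stray packet.
assert not resolver.on_response(40000, 0x1234, "example.com", "198.51.100.1")
print(resolver.cache)
```

Note that "cancel the query on a mismatch" has no natural place in this model: a mismatching packet carries nothing that reliably identifies the query it was meant for.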

Options for "real" protection:

  • DNSSEC allows the query originator to validate the authenticity of the records in the response, which ensures that an attacker-generated response is not accepted for names in signed zones.
  • For queries from some form of client/forwarder to a resolver server specifically, DoT and DoH also avoid the cache poisoning problem for this particular "hop" (DoT/DoH only secure the communications channel between two hosts, whereas a DNS resolution tends to involve multiple such "hops").
    (Even just querying over TCP improves the particular scenario in the question, but the cryptography-based solutions are obviously much broader in terms of what attacks they deal with.)