summaryrefslogtreecommitdiff
path: root/src/resolve/resolved-dns-transaction.h
AgeCommit message (Collapse)Author
2016-06-24Merge pull request #3594 from poettering/resolved-servfailMartin Pitt
resolved fixes for handling SERVFAIL errors from servers
2016-06-23resolved: rework SERVFAIL handlingLennart Poettering
There might be two reasons why we get a SERVFAIL response from our selected DNS server: because this DNS server itself is bad, or because the DNS server actually serving the zone upstream is bad. So far we immediately downgraded our server feature level when getting SERVFAIL, under the assumption that the first case is the only possible case. However, this meant we'd downgrade immediately even if we encountered the second case described above. With this commit handling of SERVFAIL is reworked. As soon as we get a SERVFAIL on a transaction we retry the transaction with a lower feature level, without changing the feature level tracked for the DNS server itself. If that fails too, we downgrade further, and so on. If during this downgrading the SERVFAIL goes away we assume that the DNS server we are talking to is bad, but the zone is fine and propagate the detected feature level to the information we track about the DNS server. Should the SERVFAIL not go away this way we let the transaction fail and accept the SERVFAIL.
2016-06-21resolved: when using the ResolveRecord() bus call, adjust TTL for caching timeLennart Poettering
When we return the full RR wire data, let's make sure the TTL included in it is adjusted by the time the RR sat in the cache. As an optimization we do this only for ResolveRecord() and not for ResolveHostname() and friends, since adjusting the TTL means copying the RR object, and we don#t want to do that if there's no reason to. (ResolveHostname() and friends don't return the TTL hence there's no reason to in that case)
2016-02-22resolved: fix notification iteration logic when transactions are completedLennart Poettering
When a transaction is complete, and we notify its owners, make sure we deal correctly with the requesters removing themselves from the list of owners while we continue iterating. This was previously already dealt with with transactions that require other transactions for DNSSEC purposes, fix this for other possibly transaction owners too now. Since iterating through "Set" objects is not safe regarding removal of entries from it, rework the logic to use two Sets, and move each entry we notified from one set to the other set before we dispatch the notification. This move operation requires no additional memory, and enables us to ensure that we don't notify any object twice. Fixes: #2676
2016-02-16Use provided buffer in dns_resource_key_to_stringZbigniew Jędrzejewski-Szmek
When the buffer is allocated on the stack we do not have to check for failure everywhere. This is especially useful in debug statements, because we can put dns_resource_key_to_string() call in the debug statement, and we do not need a seperate if (log_level >= LOG_DEBUG) for the conversion. dns_resource_key_to_string() is changed not to provide any whitespace padding. Most callers were stripping the whitespace with strstrip(), and it did not look to well anyway. systemd-resolve output is not column aligned anymore. The result of the conversion is not stored in DnsTransaction object anymore. It is used only for debugging, so it seems fine to generate it when needed. Various debug statements are extended to provide more information.
2016-02-10tree-wide: remove Emacs lines from all filesDaniel Mack
This should be handled fine now by .dir-locals.el, so need to carry that stuff in every file.
2016-01-25resolved: replace DNS_TRANSACTION_RESOURCES by DNS_TRANSACTION_ERRNOLennart Poettering
Whenever we encounter an OS error we did not expect, we so far put the transaction into DNS_TRANSACTION_RESOURCES state. Rename this state to DNS_TRANSACTION_ERRNO, and save + propagate the actual system error to the caller. This should make error messages triggered by system errors much more readable by the user.
2016-01-25resolved: properly handle LLMNR/TCP connection errorsLennart Poettering
The LLMNR spec suggests to do do reverse address lookups by doing direct LLMNR/TCP connections to the indicated address, instead of doing any LLMNR multicast queries. When we do this and the peer doesn't actually implement LLMNR this will result in a TCP connection error, which we need to handle. In contrast to most LLMNR lookups this will give us a quick response on whether we can find a suitable name. Report this as new transaction state, since this should mostly be treated like an NXDOMAIN rcode, except that it's not one.
2016-01-25resolve: generate a nice clean error when clients try to resolve a name when ↵Lennart Poettering
the network is down
2016-01-11resolved: rename DnsTransaction's current_features field to ↵Lennart Poettering
current_feature_level This is a follow-up for f4461e5641d53f27d6e76e0607bdaa9c0c58c1f6.
2016-01-11resolved: don't attempt to send queries for DNSSEC RR types to servers not ↵Lennart Poettering
supporting them If we already degraded the feature level below DO don't bother with sending requests for DS, DNSKEY, RRSIG, NSEC, NSEC3 or NSEC3PARAM RRs. After all, we cannot do DNSSEC validation then anyway, and we better not press a legacy server like this with such modern concepts. This also has the benefit that when we try to validate a response we received using DNSSEC, and we detect a limited server support level while doing so, all further auxiliary DNSSEC queries will fail right-away.
2016-01-11resolved: when we get a TCP connection failure, try againLennart Poettering
Previously, when we couldn't connect to a DNS server via TCP we'd abort the whole transaction using a "connection-failure" state. This change removes that, and counts failed connections as "lost packet" events, so that we switch back to the UDP protocol again.
2016-01-05resolved: when caching negative responses, honour NSEC/NSEC3 TTLsLennart Poettering
When storing negative responses, clamp the SOA minimum TTL (as suggested by RFC2308) to the TTL of the NSEC/NSEC3 RRs we used to prove non-existance, if it there is any. This is necessary since otherwise an attacker might put together a faked negative response for one of our question including a high-ttl SOA RR for any parent zone, and we'd use trust the TTL.
2016-01-04resolved: explicitly handle case when the trust anchor is emptyLennart Poettering
Since we honour RFC5011 revoked keys it might happen we end up with an empty trust anchor, or one where there's no entry for the root left. With this patch the logic is changed what to do in this case. Before this patch we'd end up requesting the root DS, which returns with NODATA but a signed NSEC we cannot verify, since the trust anchor is empty after all. Thus we'd return a DNSSEC result of "missing-key", as we lack a verified version of the key. With this patch in place, look-ups for the root DS are explicitly recognized, and not passed on to the DNS servers. Instead, if downgrade-ok mode is on an unsigned NODATA response is synthesized, so that the validator code continues under the assumption the root zone was unsigned. If downgrade-ok mode is off a new transaction failure is generated, that makes this case recognizable.
2016-01-04resolved: block transaction GC'ing while ↵Lennart Poettering
dns_transaction_request_dnssec_keys() is running If any of the transactions started by dns_transaction_request_dnssec_keys() finishes promptly without requiring asynchronous operation this is reported back to the issuing transaction from the same stackframe. This might ultimately result in this transaction to be freed while we are still in its _request_dnssec_keys() stack frame. To avoid memory corruption block the transaction GC while in the call, and manually issue a GC after it returned.
2015-12-27resolved: add dns_transaction_close_connection()Lennart Poettering
This new call unifies how we shut down all connection resources, such as UDP sockets, event sources, and TCP stream objects. This patch just adds the basic hook-up, this function will be used more in later commits.
2015-12-27resolved: remember explicitly whether we already tried a stream connectionLennart Poettering
On LLMNR we never want to retry stream connections (since local TCP connections should work, and we don't want to unnecessarily delay operation), explicitly remember whether we already tried one, instead of deriving this from a still stored stream object. This way, we can free the stream early, without forgetting that we tried it.
2015-12-26resolved: generate an explicit transaction error when we cannot reach server ↵Lennart Poettering
via TCP Previously, if we couldn't reach a server via UDP we'd generate an MAX_ATTEMPTS transaction result, but if we couldn't reach it via TCP we'd generate a RESOURCES transaction result. While it is OK to generate two different errors I think, "RESOURCES" is certainly a misnomer. Introduce a new transaction result "CONNECTION_FAILURE" instead.
2015-12-18resolved: propagate the DNSSEC result from the transaction to the query and ↵Lennart Poettering
the the bus client It's useful to generate useful errors, so let's do that.
2015-12-18resolved: rename DNS_TRANSACTION_FAILURE → DNS_TRANSACTION_RCODE_FAILURELennart Poettering
We have many types of failure for a transaction, and DNS_TRANSACTION_FAILURE was just one specific one of them, if the server responded with a non-zero RCODE. Hence let's rename this, to indicate which kind of failure this actually refers to.
2015-12-18resolved: add support NSEC3 proofs, as well as proofs for domains that are ↵Lennart Poettering
OK to be unsigned This large patch adds a couple of mechanisms to ensure we get NSEC3 and proof-of-unsigned support into place. Specifically: - Each item in an DnsAnswer gets two bit flags now: DNS_ANSWER_AUTHENTICATED and DNS_ANSWER_CACHEABLE. The former is necessary since DNS responses might contain signed as well as unsigned RRsets in one, and we need to remember which ones are signed and which ones aren't. The latter is necessary, since not we need to keep track which RRsets may be cached and which ones may not be, even while manipulating DnsAnswer objects. - The .n_answer_cachable of DnsTransaction is dropped now (it used to store how many of the first DnsAnswer entries are cachable), and replaced by the DNS_ANSWER_CACHABLE flag instead. - NSEC3 proofs are implemented now (lacking support for the wildcard part, to be added in a later commit). - Support for the "AD" bit has been dropped. It's unsafe, and now that we have end-to-end authentication we don't need it anymore. - An auxiliary DnsTransaction of a DnsTransactions is now kept around as least as long as the latter stays around. We no longer remove the auxiliary DnsTransaction as soon as it completed. THis is necessary, as we now are interested not only in the RRsets it acquired but also in its authentication status.
2015-12-18resolved: merge two bools into a bitfieldLennart Poettering
2015-12-18resolved: cache stringified transaction key once per transactionLennart Poettering
We end up needing the stringified transaction key in many log messages, hence let's simplify the logic and cache it inside of the transaction: generate it the first time we need it, and reuse it afterwards. Free it when the transaction goes away. This also updated a couple of log messages to make use of this.
2015-12-11resolved: rework how and when the number of answer RRs to cache is determinedLennart Poettering
Instead of figuring out how many RRs to cache right before we do so, determine this at the time we install the answer RRs, so that we can still alter this as we manipulate the answer during validation. The primary purpose of this is to pave the way so that we can drop unsigned RRsets from the answer and invalidate the number of RRs to cache at the same time.
2015-12-10resolved: chase DNSKEY/DS RRs when doing look-ups with DNSSEC enabledLennart Poettering
This adds initial support for validating RRSIG/DNSKEY/DS chains when doing lookups. Proof-of-non-existance, or proof-of-unsigned-zones is not implemented yet. With this change DnsTransaction objects will generate additional DnsTransaction objects when looking for DNSKEY or DS RRs to validate an RRSIG on a response. DnsTransaction objects are thus created for three reasons now: 1) Because a user asked for something to be resolved, i.e. requested by a DnsQuery/DnsQueryCandidate object. 2) As result of LLMNR RR probing, requested by a DnsZoneItem. 3) Because another DnsTransaction requires the requested RRs for validation of its own response. DnsTransactions are shared between all these users, and are GC automatically as soon as all of these users don't need a specific transaction anymore. To unify the handling of these three reasons for existance for a DnsTransaction, a new common naming is introduced: each DnsTransaction now tracks its "owners" via a Set* object named "notify_xyz", containing all owners to notify on completion. A new DnsTransaction state is introduced called "VALIDATING" that is entered after a response has been receieved which needs to be validated, as long as we are still waiting for the DNSKEY/DS RRs from other DnsTransactions. This patch will request the DNSKEY/DS RRs bottom-up, and then validate them top-down. Caching of RRs is now only done after verification, so that the cache is not poisoned with known invalid data. The "DnsAnswer" object gained a substantial number of new calls, since we need to add/remove RRs to it dynamically now.
2015-12-08resolved: add 'next_attempt_after' field to DnsTransactionDaniel Mack
For each transaction, record when the earliest point in time when the query packet may hit the wire. This is the same time stamp for which the timer is scheduled in retries, except for the initial query packets which are delayed by a random jitter. In this case, we denote that the packet may actually be sent at the nominal time, without the jitter. Transactions that share the same timestamp will also have identical values in this field. It is used to coalesce pending queries in a later patch.
2015-12-08resolved: short-cut jitter callbacks for LLMNR and mDNSDaniel Mack
When a jitter callback is issued instead of sending a DNS packet directly, on_transaction_timeout() is invoked to 'retry' the transaction. However, this function has side effects. For once, it increases the packet loss counter on the scope, and it also unrefs/refs the server instances. Fix this by tracking the jitter with two bool variables. One saying that the initial jitter has been scheduled in the first place, and one that tells us the delay packet has been sent.
2015-12-08resolved: add mDNS initial jitterDaniel Mack
The logic is to kick off mDNS packets in a delayed way is mostly identical to what LLMNR needs, except that the constants are different.
2015-12-03resolved: add a concept of "authenticated" responsesLennart Poettering
This adds a new SD_RESOLVED_AUTHENTICATED flag for responses we return on the bus. When set, then the data has been authenticated. For now this mostly reflects the DNSSEC AD bit, if DNSSEC=trust is set. As soon as the client-side validation is complete it will be hooked up to this flag too. We also set this bit whenver we generated the data ourselves, for example, because it originates in our local LLMNR zone, or from the built-in trust anchor database. The "systemd-resolve-host" tool has been updated to show the flag state for the data it shows.
2015-12-03resolved: add a simple trust anchor database as additional RR sourceLennart Poettering
When doing DNSSEC lookups we need to know one or more DS or DNSKEY RRs as trust anchors to validate lookups. With this change we add a compiled-in trust anchor database, serving the root DS key as of today, retrieved from: https://data.iana.org/root-anchors/root-anchors.xml The interface is kept generic, so that additional DS or DNSKEY RRs may be served via the same interface, for example by provisioning them locally in external files to support "islands" of security. The trust anchor database becomes the fourth source of RRs we maintain, besides, the network, the local cache, and the local zone.
2015-11-27resolved: fallback to TCP if UDP failsTom Gundersen
This is inspired by the logic in BIND [0], follow-up patches will implement the reset of that scheme. If we get a server error back, or if after several attempts we don't get a reply at all, we switch from UDP to TCP for the given server for the current and all subsequent requests. However, if we ever successfully received a reply over UDP, we never fall back to TCP, and once a grace-period has passed, we try to upgrade again to using UDP. The grace-period starts off at five minutes after the current feature level was verified and then grows exponentially to six hours. This is to mitigate problems due to temporary lack of network connectivity, but at the same time avoid flooding the network with retries when the feature attempted feature level genuinely does not work. Note that UDP is likely much more commonly supported than TCP, but depending on the path between the client and the server, we may have more luck with TCP in case something is wrong. We really do prefer UDP though, as that is much more lightweight, that is why TCP is only the last resort. [0]: <https://kb.isc.org/article/AA-01219/0/Refinements-to-EDNS-fallback-behavior-can-cause-different-outcomes-in-Recursive-Servers.html>
2015-11-27resolved: for a transaction, keep track where the answer data came fromLennart Poettering
Let's track where the data came from: from the network, the cache or the local zone. This is not only useful for debugging purposes, but is also useful when the zone probing wants to ensure it's not reusing transactions that were answered from the cache or the zone itself.
2015-11-27resolved: store just the DnsAnswer instead of a DnsPacket as answer in ↵Lennart Poettering
DnsTransaction objects Previously we'd only store the DnsPacket in the DnsTransaction, and the DnsQuery would then take the DnsPacket's DnsAnswer and return it. With this change we already pull the DnsAnswer out inside the transaction. We still store the DnsPacket in the transaction, if we have it, since we still need to determine from which peer a response originates, to implement caching properly. However, the DnsQuery logic doesn't care anymore for the packet, it now only looks at answers and rcodes from the successfuly candidate. This also has the benefit of unifying how we propagate incoming packets, data from the local zone or the local cache.
2015-11-25resolved: fully support DNS search domainsLennart Poettering
This adds support for searching single-label hostnames in a set of configured search domains. A new object DnsQueryCandidate is added that links queries to scopes. It keeps track of the search domain last used for a query on a specific link. Whenever a host name was unsuccessfuly resolved on a scope all its transactions are flushed out and replaced by a new set, with the next search domain appended. This also adds a new flag SD_RESOLVED_NO_SEARCH to disable search domain behaviour. The "systemd-resolve-host" tool is updated to make this configurable via --search=. Fixes #1697
2015-11-18tree-wide: sort includes in *.hThomas Hindoe Paaboel Andersen
This is a continuation of the previous include sort patch, which only sorted for .c files.
2015-08-25resolved: rename DNS UDP socket to 'dns_udp_fd'Lennart Poettering
This hopefully makes this a bit more expressive and clarifies that the fd is not used for the DNS TCP socket. This also mimics how the LLMNR UDP fd is named in the manager object.
2015-08-21resolved: only maintain one question RR key per transactionLennart Poettering
Let's simplify things and only maintain a single RR key per transaction object, instead of a full DnsQuestion. Unicast DNS and LLMNR don't support multiple questions per packet anway, and Multicast DNS suggests coalescing questions beyond a single dns query, across the whole system.
2015-08-03resolved: transaction - increase number of retry attemptsTom Gundersen
With the exponential backoff, we can perform more requests in the same amount of time, so bump this a bit. In case of large RTT this may be necessary in order not to regress, and in case of large packet-loss it will make us more robust. The latter is particularly relevant once we start probing for features (and hence may see packet-loss until we settle on the right feature level).
2015-08-03resolved: transaction - exponentially increase retry timeoutsTom Gundersen
Rather than fixing this to 5s for unicast DNS and 1s for LLMNR, start at a tenth of those values and increase exponentially until the old values are reached. For LLMNR the recommended timeout for IEEE802 networks (which basically means all of the ones we care about) is 100ms, so that should be uncontroversial. For unicast DNS I have found no recommended value. However, it seems vastly more likely that hitting a 500ms timeout is casued by a packet loss, rather than the RTT genuinely being greater than 500ms, so taking this as a startnig value seems reasonable to me. In the common case this greatly reduces the latency due to normal packet loss. Moreover, once we get support for probing for features, this means that we can send more packets before degrading the feature level whilst still allowing us to settle on the correct feature level in a reasonable timeframe. The timeouts are tracked per server (or per scope for the multicast protocols), and once a server (or scope) receives a successfull package the timeout is reset. We also track the largest RTT for the given server/scope, and always start our timouts at twice the largest observed RTT.
2015-07-27resolved: transaction - introduce dns_transaction_emit()Tom Gundersen
This function emits the UDP packet via the scope, but first it will determine the current server (and connect to it) and store the server in the transaction. This should not change the behavior, but simplifies the code.
2015-07-27resolved: transaction - move DNS UDP socket creation to the scopeTom Gundersen
With access to the server when creating the socket, we can connect() to the server and hence simplify message sending and receiving in follow-up patches.
2015-07-27resloved: transaction - unify IPv4 and IPv6 socketsTom Gundersen
A transaction can only have one socket at a time, so no need to distinguish these.
2015-07-14resolved: use one UDP socket per transactionTom Gundersen
We used to have one global socket, use one per transaction instead. This has the side-effect of giving us a random UDP port per transaction, and hence increasing the entropy and making cache poisoining significantly harder to achieve. We still reuse the same port number for packets belonging to the same transaction (resent packets).
2015-07-14resolved: pin the server used in a transactionTom Gundersen
We want to discover information about the server and use that in when crafting packets to be resent.
2015-02-23remove unused includesThomas Hindoe Paaboel Andersen
This patch removes includes that are not used. The removals were found with include-what-you-use which checks if any of the symbols from a header is in use.
2014-08-05resolved: add 100ms initial jitter to all LLMNR requestsLennart Poettering
2014-07-31resolved: implement LLMNR uniqueness verificationLennart Poettering