Email is a complex service and email abuse adds confusing deceptions. Worse, like postal mail and even telephone service, Internet mail is inherently open, flexible and even anonymous, making things much easier for abusers. Bad actors hide their true identity and their true purpose. Most other communication tools for users are also quite open, and problems with email are being replicated elsewhere, such as in instant messaging and social media. Attackers are clever and extremely adaptable. Tools for protection often are expensive and specialized, and they often take a long time to deploy.
All of this creates an extra burden on anti-abuse workers to be extremely careful when discussing issues, facts and proposals. A recent corporate blog by TrendLabs demonstrates this need, by failing to satisfy it:
Possible Phishing with DKIM – TrendLabs Malware Blog
The blog concerns DomainKeys Identified Mail (DKIM), which uses cryptographic authentication technology for an unusual purpose. Unlike other services that use similar digital signing techniques, such as OpenPGP and S/MIME, DKIM does not “validate” a message or the message’s author, as listed in the From: header field. [RFC5322] Rather:
[It] permits a person, role, or organization that owns the signing domain to claim some responsibility for a message by associating the domain with the message. [RFC4871bis]
That is, DKIM’s sole job is to attach an identifier that can be believed, specifically a domain name that can be unrelated to any other identifier in the message. That domain name is used for associating the reputation of the domain owner with the message. The name is carried inside a DKIM-specific header field and can be attached by any handler of the message. Multiple signatures and signatures for different domains are all entirely legal; in fact, the aggregation of signatures and reputations can aid in evaluating how to process the message.
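To make the "independent identifier" point concrete, here is a minimal sketch that pulls the signing domain out of each DKIM-Signature header field and leaves the From: field alone. The sample message, the function name, and the placeholder bh= and b= values are all invented for illustration; the code only reads the d= tag and does not verify anything.

```python
from email import message_from_string

# Hypothetical message: the signing domain (example.net) is unrelated
# to the author domain in the From: field (example.org).
SAMPLE = (
    "DKIM-Signature: v=1; a=rsa-sha256; d=example.net; s=sel2010;\n"
    " h=from:to:subject; bh=AAAA; b=BBBB\n"
    "From: Author <author@example.org>\n"
    "To: reader@example.com\n"
    "Subject: hello\n"
    "\n"
    "Body text\n"
)


def signing_domains(raw_message):
    """Return the d= value of every DKIM-Signature field, top to bottom."""
    msg = message_from_string(raw_message)
    domains = []
    for value in msg.get_all("DKIM-Signature", []):
        tags = {}
        for part in value.split(";"):
            if "=" in part:
                # Split on the first "=" only; base64 values may end in "=".
                name, _, val = part.partition("=")
                # Drop folding whitespace inside the tag value.
                tags[name.strip()] = "".join(val.split())
        if "d" in tags:
            domains.append(tags["d"].lower())
    return domains


print(signing_domains(SAMPLE))  # ['example.net']
```

A message carrying several signatures would simply yield several domains, each of which a verifier could look up in whatever reputation source it trusts.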
Also, note the qualifier in the quoted text: the owner of the name takes only some responsibility for the message. The nature and degree of that responsibility are unstated, because different actors in the handling of a message have very different responsibilities. DKIM does not differentiate among those. The standardized semantics of DKIM mean that the verifier knows merely that the signer is asserting involvement. The verifier then chooses how to use that information, such as consulting a compiled listing of reputations for email handlers.
Digital signature technology uses hashing, to create a unique, short “description” of the data being signed. If the message is modified in transit, the receiver will compute a different hash that will not match the original. Hence as a side-effect of doing authentication, DKIM provides some data integrity protection between the time of signing and the time of verifying. However data integrity does not mean data validity. DKIM makes no statement about the validity of any data in the message, except the signature itself and the domain name it used.
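The difference between integrity and validity can be shown with a few lines of hashing. This is only a sketch of the general idea, using SHA-256 over invented sample bodies; actual DKIM canonicalizes the content first and encodes the hash differently, and the helper name is hypothetical.

```python
import hashlib


def body_hash(body: bytes) -> str:
    """Return a short fixed-length 'description' of the body."""
    return hashlib.sha256(body).hexdigest()


original = b"Meet at noon.\r\n"
tampered = b"Meet at nine.\r\n"

# The signer records the hash at signing time.
signed_hash = body_hash(original)

# The verifier recomputes it over what actually arrived.
print(body_hash(original) == signed_hash)  # True
print(body_hash(tampered) == signed_hash)  # False
```

Note what the matching hash does not say: it does not say that a meeting is really at noon, only that the text is the text that was signed.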
DKIM is an enabling service. Its direct benefit is small, but it provides an essential foundation for the development of trust-based mechanisms, as well as possibly being useful for mechanisms that can detect some types of deceptive messages. One such value-added mechanism is ADSP, which does link the DKIM identifier to the author From: field, with the primary goal of finding messages from sources not authorized to use the domain name in the From: field. However these additional mechanisms are value-added enhancements and are outside the scope of the core DKIM signing mechanism. DKIM, itself, solely specifies how to sign a message and then how to verify a signed message, and it defines the modest meaning of the signature.
To the extent that signers and verifiers wish to have a signature based on semantics that are different from DKIM, they need a different signing specification. The core innovations of DKIM are its use of the DNS to store public keys (associated with the domain name) and its placing signature information in a header field, rather than inline with the data, as done by OpenPGP and S/MIME. This means that receivers not supporting the signature do not have to deal with it, since header fields that are not recognized are ignored. In an effort to encourage creation of such parallel signature mechanisms, DKIM’s core technology has been extracted into a specification “library” called DOSETA (DOmain SEcurity TAgging).
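As a rough illustration of the DNS side, the DKIM specification has a verifier fetch the public key from a TXT record whose name is built from the signature's selector (the s= tag, not discussed above) and its signing domain, under the `_domainkey` label. The sketch below only builds that name; it performs no DNS query, and the function name and sample values are made up for the example.

```python
def dkim_key_record_name(selector: str, domain: str) -> str:
    """Build the DNS name where the signer's public key would be published.

    selector and domain correspond to the s= and d= tags of a
    DKIM-Signature field; the values used below are invented.
    """
    name = f"{selector}._domainkey.{domain}"
    # DNS names are case-insensitive; normalize and drop any trailing dot.
    return name.rstrip(".").lower()


print(dkim_key_record_name("sel2010", "Example.NET"))
# sel2010._domainkey.example.net
```

Because the key lives under the signing domain, control of that domain's DNS is what ties the signature to the domain owner.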
An IETF working group is revising the DKIM signing specification, for the next stage of standardization. The TrendLabs blog criticizes the working group for not adding a mechanism to counteract a problem that is entirely outside the scope of DKIM’s purview:
Rather than validating DKIM’s input and not relying upon specialized handling of DKIM results, some members deemed it a protocol layer violation to examine elements that may result in highly deceptive messages when accepted on the basis of DKIM signatures.
The blog’s description of the facts, its premise about the requirements, and its apparent understanding of DKIM’s functionality all suffer from basic flaws. The DKIM specification mandates that input to DKIM must be valid according to RFC5322. In requiring this, it is placing a burden on the containing system to ensure that a message is well-formed. It is not DKIM’s job to do the basic message validation; it’s the job of the requesting software.
Again, from the TrendLabs blog:
The details are simple and the original goals were good. DKIM was intended to authenticate domain relationships with an email message bound at a minimum to that of the From header field.
This statement appears to be similar to DKIM’s actual goal, cited above, but it contains an essential error: DKIM does not validate the author From: header field, nor any other identifier in the message, except the one used for signing. This encourages users of DKIM to focus on its role to identify handlers of email, not merely authors. For example, it is perfectly legitimate for a piece of software running a mailing list to sign messages it redistributes. In this case, the signing domain and the domain in the From: field are almost always going to be different. This is perfectly legal with DKIM.
The only “binding” that DKIM has with the From: field is the described one of data integrity, and again, that has nothing to do with validity. The binding that DKIM actually does is between an independent identifier, carried in a DKIM header field, and a hash of (part of) the message.
The blog continues:
The relationship was to provide a basis for message acceptance but failed to offer the intended protections whenever a message contains invalid or fake elements still considered to offer a valid signature. While it would not be a protocol violation to declare such messages with invalid or fake elements to not have a valid signature, there are some who think otherwise.
However DKIM does not make statements of specific “correctness” about messages. It merely attaches a reliable and accurate domain name that can be used for developing a reputation for the signer, with the verifier then applying the reputation to the stream of similarly-signed messages. Concerning the basic message invalidity described in the blog, the actual requirement is to make that declaration before DKIM ever sees the message!
Specifically, the blog’s analysis of the attack sequence is:
Here is how DKIM can be exploited:
...
2. Send yourself a message of a sensitive nature.
3. Prepend any dummy From header field ignored by DKIM to mislead recipients
with regard to the message’s origin.
4. Exploiting DKIM’s replay insensitivity, a malefactor can then resend the
message as a mailing list
That is, start with a valid, signed message and then add a second, bogus From: field and redistribute the bogus message to potential victims. The goal is to have the bogus From: field be displayed rather than the valid one, perhaps alongside some indication that DKIM validation was successful, giving the bogus From: field credibility. However, the email standard, RFC5322, requires one, and only one, From: field. A message with multiple From: fields is not valid, and remember that DKIM requires its input to conform to RFC5322.
So, the exploit that is described quite possibly is an interesting one, but it really has nothing at all to do with DKIM. A message with no DKIM signature can still suffer the basic exploit and handling software needs to look for the exploit whether DKIM is present or not. DKIM is merely one component in a necessary suite of protection mechanisms.
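The check in question is small and needs no knowledge of DKIM at all. A minimal sketch, assuming the handling software has the raw message as a string; the function names and the sample message are hypothetical, and a real filter would check far more than this one rule.

```python
from email import message_from_string


def count_from_fields(raw_message: str) -> int:
    """Count From: header fields (field names match case-insensitively)."""
    msg = message_from_string(raw_message)
    return len(msg.get_all("From", []))


def has_exactly_one_from(raw_message: str) -> bool:
    """True when the message meets the one-and-only-one From: requirement."""
    return count_from_fields(raw_message) == 1


# Invented example of the attack: a bogus From: prepended to a message.
FORGED = (
    "From: Trusted Bank <security@bank.example>\n"
    "From: Attacker <attacker@example.net>\n"
    "Subject: urgent\n"
    "\n"
    "Please confirm your password.\n"
)

print(count_from_fields(FORGED))     # 2
print(has_exactly_one_from(FORGED))  # False
```

Run before any signature handling, a test like this protects signed and unsigned mail alike.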
Confusing the different mechanisms is likely to undermine their utility. For example, if this attack were handled by DKIM, rather than elsewhere in the system, there would be no coverage of the exploit for messages lacking a DKIM signature. The TrendLabs blog therefore seems to call for handling the attack in a way that would reduce protection rather than increase it!
Attempting to emphasize relative costs, the blog asserts:
The overhead related to DKIM’s cryptographically based authentication dwarfs the effort of NOT ignoring the presence of multiple From header fields.
What the blog misses is that the problems and complexity created by having the wrong part of a system implement a particular mechanism dwarf the effort to do the implementation. It’s easy to put in a small bit of software to look for something and respond to it. It’s harder to find the right place to put that software into the overall system, so that it works correctly and provides the proper protection.
An odd factual confusion in the blog is:
While SMTP still permits RFC822-compliant messages, DKIM was based on RFC5322, which stipulates the legal number of specific header fields.
Yet RFC822, which was the first standard for Internet mail format, also requires only and exactly one From: field.
The blog also makes some recommendations that DKIM contain controls in the use of Internationalized Domain Names, again seeming to reflect the desire to have DKIM ensure perfect messages.
Ultimately, the view expressed in TrendLabs’ blog is a bit like looking for lost keys under the lamppost. It is so much more convenient to see things there that it is easy to miss the fact that it’s not the right place to look.
The blog ends with an uncharacteristic bit of hyperbole from a commercial company:
The decision to ignore additional From header fields prepended to an email and yet still return a valid signature result (the only output of DKIM) makes this an EVIL protocol.
Not too many protocols get the honor of such an assessment. Now, if only they had meant the reference to be to the Evil Bit...
Bad Doug! Bad!
My rebuttal to the misguided Trend Micro post is here.
— Barry Leiba, DKIM working group chair
I agree, the Trend post got a lot of things wrong. And FUD about authentication is the last thing the email world needs.
Further commentary here.
-MSK (DKIMbis co-author, OpenDKIM implementer)
Dave,
You are a good guy, but your statements about the DKIM specification are misleading and wrong.
In your post you stated “The DKIM specification mandates that input to DKIM must be valid according to RFC5322.” Unfortunately, the DKIM specification fails to impose this requirement. DKIM only recommends that a message comply with RFC5322, and fails to make this an essential requirement by using the requisite MUST language! In addition, it is also wrong to expect after three decades of relying on the lax RFC822 specification you authored in 1982, that now SMTP will suddenly impose stricter message format requirements. Do you really think this will happen within your lifetime?
For DKIM results to be safely used as a basis for acceptance it is absolutely imperative to ensure any pre-pended From header field, the one that will be displayed, causes a DKIM signature to be invalid.
Secondly, my Trend Micro blog only stated that DKIM authenticates the binding of a domain with that of the From header field. When there are multiple From header fields, where the one being displayed may be the one ignored by DKIM, is where this goes wrong. Any assurances about the ignored From header could be incorrectly obtained from an expected relationship with the signing domain.
A high value domain will have no control over what DKIM might ignore by the receiver. Since DKIM signatures can not be trusted as being bound with the visible From header field, why should anyone depend upon this seriously flawed protocol? Comments about how anti-spam filters will likely catch what DKIM misses is disheartening. It is also shameful to expend the tremendous overhead related with securely hashing the message and checking signatures be so easily undone by a pre-pended From header.
You have repeated your philosophy that when a security protocol checks an input it uses to confirm a relationship, which at a minimum covers the From header field, that ensuring there is only one such header field before reporting the relationship as valid is a protocol layer violation. Really?
Expecting spam filters to second-guess whether DKIM made a check IS a protocol layer violation, but somehow everyone seems willing to overlook this tangled mess. There will be victims where malefactors discover gaps in these undefined requirements. If DKIM can not be trusted to confirm a binding with the From header field, it can’t be trusted.
From the DKIM draft that Doug is objecting to: https://datatracker.ietf.org/doc/draft-ietf-dkim-rfc4871bis/?include_text=1
> 3.8. Input Requirements
>
> DKIM's design is predicated on valid input. Therefore, signers and verifiers SHOULD take reasonable steps to ensure that the messages they are processing are valid according to [RFC5322], [RFC2045], and any other relevant message format standards.
The difference between the SHOULD that is used and the stronger alternative of MUST is that it permits a verifier to allow an exception if and only if they are certain they know what they are doing. In practical terms, for most implementers, a SHOULD is the same as a MUST. Working groups specify SHOULD when they can envision scenarios that legitimately allow exceptions.
A predicate for reading normative language in IETF specifications is understanding the normative terms that are used. This is not very difficult. However Doug is tenaciously failing that... requirement.
It is worth noting at this point, after Doug's extensive pursuit of this topic for quite a few months, and now in a number of venues, that he remains the only person seriously concerned about it, with respect to DKIM. Since the IETF works on rough consensus, that's a clear lack of support for Doug's concern.
d/
Last I checked, Doug, DKIM makes no claims to confirm the relationship to which you are referring. Are you sure you’re talking about the same specification we are?
Yes, this is a layering violation by definition. Each layer of an architecture has a clearly allocated responsibility, and in email the function of confirming a properly-formatted message does not lie with DKIM.
Nobody is arguing that this is not an important check to make, but it is foolish in a number of ways to put the right checks in the wrong places.
Finally, I’ve identified in past posts a few common software packages that are vulnerable to multi-From attacks, none of which know anything about DKIM. If the problem is already an old one, how can one legitimately blame DKIM for it?
This just seems like disinformation to me.
Murray,
With both you and Dave continuing to make statements about protocol layering violations when checking for multiple From header fields, this does not clarify which protocol IS responsible for ensuring a message with multiple From header fields is not reported as having a valid DKIM signature. SMTP is not making these checks.
Perhaps you are of the mind that returning a valid DKIM signature result for a message with multiple From header fields is okay? By not defining this issue in the DKIM specification using a MUST suggests there just might be valid reasons to return a valid signature result for messages having multiple From header fields. Is there?
How can DKIM be used as a basis for acceptance when signing domains may have never seen the displayed pre-pended headers? It is not difficult for a malefactor to send themselves their own crafted message, and to then resend it with pre-pended headers that appear to be from some mailing list and authored by some high value domain. Recipients are unlikely to see the mailing-list headers, but are likely to see pre-pended From header fields.
BTW, in response to Dave, the resolution for this issue was authored by three individuals. Others also expressed concern, but were dissuaded by Dave and yourself the futility of pursuing this issue.
See:
http://trac.tools.ietf.org/wg/dkim/trac/ticket/24
Hi, Doug. Nobody had to persuade me that it is not a good idea to go down this rathole. Your entire argument starts from the false assumption that a signed spam message would have a good reputation that an added sometimes-visible From: line could exploit. Not to belabor the obvious, but if this arcane attack ever became popular, the response is for spam filters to discard any message with multiple From: headers since we've never seen a good message with more than one. No DKIM or other signatures required.
John, As you know, acceptance is not always based upon "good" reputations. Often it is based upon large and "reasonably good" reputations. When a malefactor constructs a message where the body indicates it is for their intended victim, and signed with a large domain's signature, reputations of the DKIM signature must be ignored because it offers no assurance pre-pended From header fields were excluded by a valid result. Because DKIM has this flaw, little benefit can be obtained checking signing domain reputations. If anything, such checks may cause greater harm. You are the one always advocating silent discards, but this reduces emails integrity. This flaw in DKIM also affects Authentication-Results header since it can't signal whether multiple From headers were excluded either. This flaw also defeats RFC5617 since ADSP failed to consider this threat. ADSP depends upon valid DKIM signature results. The application of ADSP may prove deceptive because which From header represents the Author Address becomes undefined, nor did ADSP warn about checking for more than one Author Address! Not fixing DKIM is a worse problem than any perceived rat-hole issue you may have. Without the fix, DKIM results have NO value. Without this fix, DKIM results MUST NOT be trusted. Now email must develop signaling and protocol layers likely to cause more protracted and unpleasant discussions where the fix would have been far less disruptive. http://trac.tools.ietf.org/wg/dkim/trac/ticket/24
What Murray said
Doug, these questions have been asked and answered time and time again. Anyone wishing to do so can consult the record to review it. I'm truly sorry that the answers don't seem to fit your view of how email works, but it is for all intents and purposes a singular view. Yes, it's perfectly fine for DKIM to return success on a multi-From message. The reasons why this is not a problem have also been made clear. The solution you propose is wrong, as are its premises.