Why does DMARC operate on the From-address, and not the envelope sender (Return-Path)?
Solution 1:
Think of SPF and DKIM as ways to validate the mail path, and think of DMARC as an extension that also validates the message sender. Compare it to delivering a FedEx letter: it's easy to validate where the envelope was shipped from and that the courier was legitimate, but that doesn't prove that the letter inside the envelope is really from the person whose name is printed on it.
SPF and DKIM can show that your webserver is a valid SMTP server for mywebserver.com and that your Sender address is legitimate, but that's not enough for other servers to trust that you have permission to send as [email protected]. How does Gmail know that your server hasn't been hacked or otherwise used with malicious intent? Gmail's servers aren't going to blindly trust you to send mail as one of their users -- unless maybe you are hosted by them, and then you'd probably have trouble sending to Yahoo.
To address the first part of your question: yes, it's very likely that this is why Gmail is categorizing it as spam. The oldest forms of spam center around spoofing the "From" address. This is what most users see when they get a message, and it is the primary field they want to trust. When a message from a legitimate mail server is sent using a From address that doesn't belong to that mail server, it's still a red flag.
As you mentioned, DMARC operates on the From address as part of the specification. Granted, it makes it harder to write web apps that send on someone's behalf, but that's sort of the point. As to why they do it - well, that's up to the designers of the specification, but it's a trade-off. They are taking the high road and making a system that works very well if you stay within that limitation. Perhaps future mechanisms will find a way around this.
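To make the From-address check concrete: DMARC passes only when the domain in the From header aligns with a domain that SPF or DKIM has already authenticated. Here is a minimal sketch of that alignment test, not a real validator. The function names are made up for illustration, the single `strict` flag stands in for DMARC's separate SPF and DKIM alignment modes, and the "last two labels" rule is only a rough stand-in for the Public Suffix List lookup that real implementations use.

```python
def org_domain(domain: str) -> str:
    """Naive organizational domain: the last two labels.

    Assumption for illustration only. Real implementations consult the
    Public Suffix List, so this is wrong for suffixes such as co.uk.
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def aligned(from_domain: str, authenticated_domain: str, strict: bool = False) -> bool:
    """Strict alignment needs an exact match; relaxed compares organizational domains."""
    a = from_domain.lower().rstrip(".")
    b = authenticated_domain.lower().rstrip(".")
    if strict:
        return a == b
    return org_domain(a) == org_domain(b)


def dmarc_pass(from_domain, spf_domain=None, dkim_domain=None, strict=False) -> bool:
    """spf_domain / dkim_domain are domains that already PASSED SPF / DKIM, else None.

    One aligned, authenticated identifier is enough for a pass.
    """
    return any(
        d is not None and aligned(from_domain, d, strict)
        for d in (spf_domain, dkim_domain)
    )


# The web app authenticates perfectly as mywebserver.com, but the From
# header claims gmail.com, so nothing aligns.
print(dmarc_pass("gmail.com", spf_domain="mywebserver.com", dkim_domain="mywebserver.com"))

# Same server, From address on its own domain: relaxed alignment succeeds.
print(dmarc_pass("mywebserver.com", spf_domain="bounce.mywebserver.com"))
```

The first call is the situation in the question: SPF and DKIM both pass for the web server's domain, yet the result is a failure because neither matches the domain shown in From.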
The unfortunate solution is to only use addresses that you have control of. To address your third question, sign your messages with your domain name, and mention in the body that it was sent on behalf of [email protected]. Otherwise you will have to request that your recipients add the address to their whitelist. It's not much fun for a legitimate web app developer, but it will protect the sanctity of the recipient's inbox. You might have luck using the Reply-To header with the web user's email address.
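As a rough sketch of that approach using Python's standard `email` package: the From address is one the app controls, the web user's address goes into Reply-To, and the body says on whose behalf the message was sent. The helper name and all addresses below are placeholders I made up, and whether a given spam filter is satisfied by this is not something the code can guarantee.

```python
from email.message import EmailMessage


def build_on_behalf_message(app_address, user_name, user_address,
                            recipient, subject, body):
    """Build a message sent by the app on behalf of a web user.

    From uses an address on the app's own domain (so it can align with the
    app's SPF/DKIM); the user's address only appears in Reply-To and the body.
    """
    msg = EmailMessage()
    msg["From"] = app_address
    msg["Reply-To"] = user_address
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(
        f"Sent on behalf of {user_name} <{user_address}>\n\n{body}"
    )
    return msg


# Hypothetical addresses, for illustration only.
msg = build_on_behalf_message(
    app_address="noreply@mywebserver.com",
    user_name="Alice",
    user_address="alice@mail.example",
    recipient="bob@reader.example",
    subject="Alice shared a page with you",
    body="Have a look at this.",
)
print(msg)
```

The resulting message can be handed to `smtplib.SMTP.send_message` as usual; replies go to the web user, while the identity that DMARC evaluates stays on your own domain.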
There is a discussion of this limitation on this DMARC thread.
In the meantime, you can try to make sure that your server isn't blacklisted on any RBLs. It could be that you can fail DMARC but still get through some spam filters if you have a good enough reputation... but I wouldn't rely on it.
Solution 2:
There are two "why" questions:
- Why does a receiving mail server perform the check in this manner?
  - because that's what section 3.1 of RFC 7489 says to do
- Why was DMARC designed that way?
  - because the people who designed it apparently didn't read section 3.6.2 of RFC 5322, or misinterpreted it, or ignored it.
That section clearly establishes that a Sender: header, when present, takes priority over a From: header, for the purposes of identifying the party responsible for sending a message:
The "Sender:" field specifies the mailbox of the agent responsible for the actual transmission of the message. For example, if a secretary were to send a message for another person, the mailbox of the secretary would appear in the "Sender:" field and the mailbox of the actual author would appear in the "From:" field. If the originator of the message can be indicated by a single mailbox and the author and transmitter are identical, the "Sender:" field SHOULD NOT be used. Otherwise, both fields SHOULD appear.
Contrast this with the rationale given in RFC 7489:
DMARC authenticates use of the RFC5322.From domain by requiring that it match (be aligned with) an Authenticated Identifier. The RFC5322.From domain was selected as the central identity of the DMARC mechanism because it is a required message header field and therefore guaranteed to be present in compliant messages, and most Mail User Agents (MUAs) represent the RFC5322.From field as the originator of the message and render some or all of this header field's content to end users.
I contend that this logic is flawed, as RFC 5322 goes on to call out this error explicitly:
Note: The transmitter information is always present. The absence of the "Sender:" field is sometimes mistakenly taken to mean that the agent responsible for transmission of the message has not been specified. This absence merely means that the transmitter is identical to the author and is therefore not redundantly placed into the "Sender:" field.
I believe that DMARC is broken by design, because
- it conflates authority to send and proof of authorship;
- it misinterprets prior RFCs, and
- in doing so it breaks any previously compliant list-serv that identified itself by adding its own Sender: header.
If a Sender: field is present, DMARC should say to authenticate that field and ignore the From: field. But that's not what it says, and therefore I consider it to be broken.
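To illustrate the difference between the two readings, here is a small sketch using Python's standard `email` package. The function names and the sample message are mine, not anything from either RFC, and taking only the first mailbox of From is a simplification (a multi-author From is not handled).

```python
from email import message_from_string
from email.utils import parseaddr


def transmitter_mailbox(msg):
    """RFC 5322 section 3.6.2 reading: Sender if present, otherwise From."""
    header = msg["Sender"] if msg["Sender"] is not None else msg["From"]
    return parseaddr(header)[1]


def dmarc_checked_mailbox(msg):
    """What RFC 7489 keys on: always the From header, Sender is ignored."""
    return parseaddr(msg["From"])[1]


def domain_of(mailbox):
    """Domain part of an addr-spec, lower-cased."""
    return mailbox.rpartition("@")[2].lower()


# Hypothetical list message: the list adds its own Sender, From is the author.
RAW = (
    "From: Author <author@author.example>\n"
    "Sender: Announce list <announce@lists.example>\n"
    "To: subscriber@reader.example\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

msg = message_from_string(RAW)
print(domain_of(transmitter_mailbox(msg)))    # lists.example
print(domain_of(dmarc_checked_mailbox(msg)))  # author.example
```

The list server can authenticate itself for lists.example, which is the domain the first function returns, but DMARC asks it to align with author.example instead.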
RFC 7489 continues:
Thus, this field is the one used by end users to identify the source of the message and therefore is a prime target for abuse.
This is simply wrong (in the context of justifying ignoring the Sender: header). At the time that DMARC was designed, common email clients would routinely display a combination of the information from Sender: and From: fields, something like From name-for-mailing-list@server on behalf of [email protected]. So it was always clear to the user who was responsible for sending the message they were looking at.
Suggestions that Reply-To: is an adequate replacement are also flawed because that header is widely misinterpreted as "additional recipient" rather than "replacement recipient", and replacing the original sender's Reply-To: would impair the functionality for those users.