Extracting the date of the latest message from an mbox file

How would I go about extracting the "Date:" header for the latest received message in an mbox file?

Note that it's not simply a matter of grepping for the last occurrence of "^Date:", since that could just as well be the date from a quoted reply rather than from the latest message received.

So, probably, some proper parsing would have to be involved.

grepmail seems to be good at grinding through mboxes intelligently; however, I can't seem to find a way to achieve this seemingly trivial task with it.

Any input?

Thanks.

E: Okay, I'm officially thick. ls -l mbox would probably do. So there.

Still, I'd be very interested in a more creative approach.


Solution 1:

Since you need something that understands the actual mbox format, the canonical mail client, mail, or its customary and more capable replacement, mailx, comes to mind.

mailx -f /path/to/mbox -H

Since new messages are appended to the file, that should list your messages in order of reception, so the last line of the listing corresponds to the most recent message.

Solution 2:

As a starting point you could do something like this to find the From line at the start of the last mail.

tac "$MAIL" | grep -m1 '^From '

A line starting with "From " (note the trailing space) marks the start of a mail within the mbox file. It also contains the time at which the mail was received, which is usually more reliable than any other timestamp found within the mail headers.
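To pull just the timestamp out of that line, one option is to drop the first two fields. This is only a minimal sketch: it assumes GNU tac and grep, and a conventional "From sender timestamp" layout in which the sender contains no spaces. The function name last_envelope_time is made up for illustration.

```shell
# Hypothetical helper: print the timestamp from the last "From " envelope line.
# Assumes the layout "From <sender> <timestamp>" with a space-free sender.
last_envelope_time() {
    # Reverse the file, take the first envelope line found (i.e. the last one
    # in the mbox), then keep everything from the third space-separated field on.
    tac "$1" | grep -m1 '^From ' | cut -d' ' -f3-
}
```

It would be called as last_envelope_time "$MAIL".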

If you specifically want a Date header and not the From line, you could do something like this:

tac "$MAIL" | awk '/^Date: / {print} ; /^From / {exit}' | tail -n 1

This prints the first line starting with Date: in the last mail. However, if the headers of the last mail contain no Date header, the command can still match a line in the body instead, so you would need some way to confirm that the match really comes from the headers.
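One way to make sure only real headers are matched is to read the file forwards and keep track of whether the current line is inside a header block. This is a rough sketch rather than a full mbox parser: it assumes that body lines starting with "From " have been escaped by whatever wrote the mbox, and that the header is spelled exactly "Date: ". The function name last_date_header is hypothetical.

```shell
# Hypothetical helper: print the Date: header of the last message in an mbox.
# Prints nothing and returns non-zero if the last message has no Date header.
last_date_header() {
    awk '
        # An envelope line starts a new message: enter its headers, forget the old date.
        /^From / { in_headers = 1; date = ""; next }
        # The first blank line after the envelope ends the header block.
        in_headers && /^$/ { in_headers = 0; next }
        # Remember the first Date: line seen inside the header block only.
        in_headers && /^Date: / && date == "" { date = $0 }
        END { if (date == "") exit 1; print date }
    ' "$1"
}
```

Called as last_date_header "$MAIL", this ignores any Date: line that appears in a message body, since such lines only occur after the blank line that ends the headers.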