Robert Storey and Jeff Kinz were discussing procmail scripts to "handle" HTML mail.
Robert wrote:
To send all html mail to trash, filter the heading:
Content-Type: multipart/alternative;
Or if you only want to send html mail on the Fedora list to trash, use a logical AND:
To: For users of Fedora Core releases <fedora-list@redhat.com>
AND
Content-Type: multipart/alternative;
That should take care of it. For good.
Jeff replied:
Say, Robert - I think this filter is too loose. This method will also trash text-based emails that are PGP signed. That's not what you want.
Will it? PGP signed e-mail should be multipart/mixed or multipart/signed (depending on how much is signed).
On the other hand, it *won't* get simple HTML-only mail: Content-Type: text/html is perfectly valid.
But you are on the right track.
Perhaps something that tracks more closely on html, like this line from my procmailrc file (this line is in a recipe which looks only at email headers and it is case insensitive):
* ^content-type:.*html
That shouldn't get *any* Fedora List e-mail...
The mailman mailing list software that Red Hat uses, as you know, sticks a list signature at the end of every e-mail. It's trivial to do this for plain text, very difficult to do for arbitrarily complex HTML, and impossible for signed e-mail that uses MIME to store the signature.
So if you have a MIME e-mail that isn't just text/plain, mailman will stick in another MIME part with the signature in plain text, and set the content-type for the entire message to multipart/mixed. And when procmail is examining headers, it doesn't include MIME message part headers.
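To make the header/body distinction concrete, here is a minimal sketch of two recipes. It is not taken from anyone's actual procmailrc, and the mailbox names html-top-level and html-in-a-part are made up for illustration.

```procmail
# Header-only test (procmail's default): this sees only the top-level
# headers. On list mail that Mailman has wrapped, those say
# multipart/mixed, so an HTML part inside the message is not matched here.
:0:
* ^Content-Type:.*text/html
html-top-level

# Body test: "B ??" makes the regexp search the message body, which is
# where the headers of the individual MIME parts live.
:0:
* B ?? ^Content-Type:.*text/html
html-in-a-part
```

The first recipe would only catch mail whose own top-level Content-Type is text/html; the second looks inside the message, which is what list mail needs.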
I have this in my .procmailrc: it is designed to catch *list* mail that has an HTML part *without* a corresponding text/plain part. So far, it hasn't been fooled:
:0 fhw
* ^content-type: multipart/mixed
* B ?? ^content-type: text/html
* B ?? !^content-type: multipart/alternative
* B ?? !^content-disposition: attachment
| formail -A "X-Label: html-only"
# sort out Red Hat's footers. Come to that, it should pick up on any text/html
# that doesn't have an equivalent text/ something else.
# Now we have to wait for something to fool the regexps... oh well.
Mutt can do scoring on the X-Label I'm putting in the header. And since I use mutt, I'm more interested in flagging e-mails without the main part of the e-mail in text.
(The "content-disposition" line will mean the mail doesn't get flagged if it includes an HTML file as an attachment: you may well want to leave that line out. It seems *very* difficult to make sure that the content-disposition relates to the HTML message part, and I'd rather be cautious).
Take out the multipart/alternative bit if you want to get *all* mail with HTML content.
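As a sketch of that variant, this is the same recipe with the multipart/alternative condition dropped. The label text "html-part" is only an assumption for illustration, chosen because "html-only" would no longer be accurate; use whatever your mail reader scores on.

```procmail
# Flag list mail that contains any text/html part, whether or not a
# text/plain alternative is present. Conditions are written in the same
# style as the recipe above.
:0 fhw
* ^content-type: multipart/mixed
* B ?? ^content-type: text/html
* B ?? !^content-disposition: attachment
| formail -A "X-Label: html-part"
```

Dropping the content-disposition line as well would also flag messages that merely carry an HTML file as an attachment.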
James.