Converting Mailman "Gzip'd Text" archive files to proper mbox files

Mailman archives are often available only in the pretty useless "Gzip'd Text" format, which you cannot simply download and view locally (with threading) in an MUA such as mutt. But that is exactly what I want to do from time to time (e.g. to read the discussions of the past few weeks on mailing lists I have just subscribed to).

After some searching I found one way to do it, which I stripped down to my needs:

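 $ cat mailman2mbox
 #!/usr/bin/perl
 use strict;
 use warnings;
 while (<STDIN>) {
   # Undo the address obfuscation in "From " and "From:" lines: "user at host" -> "user@host"
   s/^(From:? .*) (at|en) /$1\@/;
   # Rewrite "Date: Mon Aug  3 10:15:00 2009" as "Date: Mon, 3 Aug 2009 10:15:00 +0000" (assumes UTC)
   s/^Date: ([A-Z][a-z][a-z]) +([A-Z][a-z][a-z]) +([0-9]+) +([0-9:]+) +([0-9]+)/Date: $1, $3 $2 $5 $4 +0000/;
   print;
 }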
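
To see what the two substitutions do, here is a rough sed counterpart run on a few sample lines. This is only a sketch: the function name demangle and the address jdoe at example.org are made up for illustration, the sample lines are simply in the shape the patterns expect, and like the Perl script it assumes the timestamps are UTC.

```shell
#!/bin/sh
# Hypothetical sed counterpart of the two substitutions in mailman2mbox.
demangle() {
  sed -E \
    -e 's/^(From:? .*) (at|en) /\1@/' \
    -e 's/^Date: ([A-Z][a-z]{2}) +([A-Z][a-z]{2}) +([0-9]+) +([0-9:]+) +([0-9]+)/Date: \1, \3 \2 \5 \4 +0000/'
}

# Made-up sample lines: an mbox separator, a From: header and a Date: header.
printf '%s\n' \
  'From jdoe at example.org  Mon Aug  3 10:15:00 2009' \
  'From: jdoe at example.org (John Doe)' \
  'Date: Mon Aug  3 10:15:00 2009' |
  demangle
```

The first two lines should come out with jdoe@example.org in place of the obfuscated address, and the third as "Date: Mon, 3 Aug 2009 10:15:00 +0000". All other lines pass through unchanged.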

Example run on some random mail archive:

 $ wget http://participatoryculture.org/pipermail/develop/2009-August.txt.gz
 $ gunzip 2009-August.txt.gz
 $ ./mailman2mbox < 2009-August.txt > 2009-August.mbox

You can then view the mbox as usual in mutt:

 $ mutt -f 2009-August.mbox
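
Before opening the file, a quick sanity check is to count the "From " separator lines, which should roughly match the number of messages. A minimal sketch, assuming a hypothetical helper named count_messages and a tiny made-up mbox for the demo; body lines that happen to start with "From " would inflate the count.

```shell
#!/bin/sh
# count_messages is a hypothetical helper: it counts mbox "From " separator lines.
count_messages() {
  grep -c '^From ' "$1"
}

# Self-contained demo on a made-up mbox containing two messages.
demo=$(mktemp)
printf '%s\n' \
  'From alice@example.org  Mon Aug  3 10:15:00 2009' \
  'Subject: first' '' 'Hello' '' \
  'From bob@example.org  Tue Aug  4 09:00:00 2009' \
  'Subject: second' '' 'Hi' > "$demo"
count_messages "$demo"
rm -f "$demo"
```

On a real archive you would call it as count_messages 2009-August.mbox and compare the number with what mutt shows.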

Suggestions for a simpler method are very welcome. Maybe some mbox-related Debian package already ships a script that does this?

Comments

I've been trying to do this too

Apparently, there's a poorly-documented feature of Mailman that gets you the complete list history in mbox format. See here. This is nice because email addresses and dates and headers and attachments are unmangled.

I used this on the Flashrom mailing list and got the complete history from this address, http://www.flashrom.org/mailman/private/flashrom.mbox/flashrom.mbox, for example. For that list, at least, you have to be a subscriber and logged in via the web interface in order to download the mbox file. In my opinion, that's a reasonable anti-spam measure, whereas mangling everything that looks like an email address is not.
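
Judging from that one URL, the address seems to follow the pattern BASE/private/LIST.mbox/LIST.mbox. A small sketch that builds it, with the caveat that mbox_url is a hypothetical helper and the layout is inferred from the Flashrom example only, so other installations may differ:

```shell
#!/bin/sh
# mbox_url is a hypothetical helper. The path layout is inferred from the
# single Flashrom URL above and may differ on other Mailman installations.
mbox_url() {
  base=${1%/}   # Mailman base URL, with one trailing slash dropped if present
  list=$2       # list name
  printf '%s/private/%s.mbox/%s.mbox\n' "$base" "$list" "$list"
}

mbox_url http://www.flashrom.org/mailman flashrom
```

For the Flashrom list this prints the address given above. Fetching it still requires being logged in as a subscriber.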

mbox

Great, thanks a lot! Didn't know about this indeed.

Using formail?

formail, which is part of the procmail package, should meet your needs, I think.

Hope this helps.

formail

Thanks, couldn't find any such option in formail's manpage though. Maybe I missed it.