MailMan archive migration tools
INTRODUCTION
This is a set of scripts to assist in migrating archives from a MailMan mailing list into another system. It was originally written to export to Google Groups via SMTP delivery but it should work with any SMTP-routable list management system.
ABOUT
MailMan's primary archival mechanism is mbox files, but it appears a bug [1] was introduced at some point which results in improper formatting of those files. The problems include:
Occasional omission of the required extra newline before the "From" message separater line, leading to run-together messages
Occurrences of "From" at the beginning of lines in the body are not escaped, possibly causing messages to be split erroneously.
There are some recipes and scripts out there to solve some of these problems (e.g. [2]) but they seem either overly complex or incomplete.
In addition, Google Groups seems to thread messages incorrectly when an archive is blasted through all at once. The default migration script introduces a two-second delay between messages to help alleviate this.
IMPLEMENTATION
The current solution consists of one perl script to clean up the message separators (filter.pl), one shell script to pipe the resulting output through formail (migrate_smtp.sh), and another shell script to introduce the delay and call sendmail (send.sh). The perl script does read the whole mbox file into memory at once (due to laziness -- this should be fixed) so big archives could be a problem.
There are also a few other scripts to aid in debugging and testing the process.
USAGE
migrate_smtp.sh is the front-end script that does everything. Pass it the path to the mbox archive file and the address of the new mailing list:
./migrate_smtp.sh oldlist.mbox newlist@example.com
Note that it will pause for two seconds between messages. Edit send.sh if you want to modify that, but beware that Google Groups seems to require the delay in order to prevent it from getting confused about message thread order.
You can run filter.pl on the mbox file first and compare the output to the original file to preview the changes before actually sending any mail:
perl filter.pl oldlist.mbox | diff -u oldlist.mbox -
NOTE: It would probably be wise to disable message delivery for the receiving list before migrating the archive, or perform the migration before any subscribers are added to the receiving list. Otherwise all subscribers of the receiving list will receive a copy of all archived messages, which could amount to a substantial amount of mail for an old or large list.