ffainelli / mailman-archive-migration

Tools to assist in exporting MailMan archives, e.g. to Google Groups

MailMan archive migration tools

INTRODUCTION

This is a set of scripts to assist in migrating archives from a MailMan mailing list into another system. It was originally written to export to Google Groups via SMTP delivery but it should work with any SMTP-routable list management system.

ABOUT

MailMan's primary archival mechanism is mbox files, but it appears a bug [1] was introduced at some point which results in improper formatting of those files. The problems include:

  • Occasional omission of the required extra newline before the "From" message separator line, leading to run-together messages.

  • Occurrences of "From" at the beginning of lines in the body are not escaped, possibly causing messages to be split erroneously.
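
Both repairs can be sketched in a few lines of awk wrapped in a shell function. This is only an illustration, not the logic of filter.pl: the name fix_mbox and the separator pattern (an address followed by a ctime-style date such as "Mon Apr  7 12:34:56 2003") are assumptions, and a body line that happens to match that pattern would still be treated as a separator.

```shell
# Hypothetical helper, not part of this repository.
# Assumption: a genuine separator looks like "From addr  Dow Mon D HH:MM:SS YYYY".
fix_mbox() {
    awk '
        # Heuristic separator test: "From ", an address, then a date ending in a 4-digit year.
        function is_sep(line) {
            return line ~ /^From [^ ]+ +[A-Z][a-z][a-z] [A-Z][a-z][a-z] +[0-9]+ [0-9:]+ [0-9][0-9][0-9][0-9]$/
        }
        {
            if (is_sep($0)) {
                # Restore the missing blank line before a separator (never at the file start).
                if (NR > 1 && prev != "") print ""
            } else if ($0 ~ /^From /) {
                # Escape a body line that merely starts with "From ".
                $0 = ">" $0
            }
            print
            prev = $0
        }
    ' "$@"
}
```

It reads either the named files or standard input, so something like `fix_mbox oldlist.mbox | diff -u oldlist.mbox -` would show what it changes.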

There are some recipes and scripts out there to solve some of these problems (e.g. [2]) but they seem either overly complex or incomplete.

In addition, Google Groups seems to thread messages incorrectly when an archive is blasted through all at once. The default migration script introduces a two-second delay between messages to help alleviate this.

IMPLEMENTATION

The current solution consists of one Perl script to clean up the message separators (filter.pl), one shell script to pipe the resulting output through formail (migrate_smtp.sh), and another shell script to introduce the delay and call sendmail (send.sh). The Perl script reads the whole mbox file into memory at once (due to laziness -- this should be fixed), so big archives could be a problem.
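
As a rough sketch of how these three pieces could fit together (the actual scripts in this repository may well differ), the following defines a per-message sender and a front end as shell functions. The names send_one and migrate, and the MAILER and DELAY variables, are hypothetical and exist only for this illustration; `formail -s` and `sendmail -i` are the standard options for splitting a mailbox and for reading a message from standard input.

```shell
# Hypothetical sketch; send.sh and migrate_smtp.sh may be written differently.
# MAILER and DELAY are illustrative names, not settings from this repository.

# Deliver one message (read from stdin) to the address in $1, then pause.
send_one() {
    ${MAILER:-sendmail -i} "$1"
    sleep "${DELAY:-2}"
}

# Clean the mbox in $1, then let formail split it and start the sender
# once per message, passing the destination address in $2.
migrate() {
    perl filter.pl "$1" | formail -s ./send.sh "$2"
}
```

In this arrangement send.sh would contain little more than the body of send_one, because formail starts a new program for every message it splits off.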

There are also a few other scripts to aid in debugging and testing the process.

USAGE

migrate_smtp.sh is the front-end script that does everything. Pass it the path to the mbox archive file and the address of the new mailing list:

    ./migrate_smtp.sh oldlist.mbox newlist@example.com

Note that it will pause for two seconds between messages. Edit send.sh if you want to modify that, but beware that Google Groups seems to require the delay in order to prevent it from getting confused about message thread order.

You can run filter.pl on the mbox file first and compare the output to the original file to preview the changes before actually sending any mail:

    perl filter.pl oldlist.mbox | diff -u oldlist.mbox -
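
A quick complement to the diff is to compare how many lines start with "From " before and after filtering, since every such line is read as a message boundary. The helper name count_from_lines is hypothetical, and the sketch assumes that escaped body lines no longer start with "From ".

```shell
# Hypothetical helper: count lines that an mbox reader would treat as
# message separators (lines starting with "From ").
count_from_lines() {
    grep -c '^From ' "$@"
}
```

For example, `count_from_lines oldlist.mbox` and `perl filter.pl oldlist.mbox | count_from_lines` can be compared; a lower count after filtering would suggest that body lines were previously being mistaken for separators.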

NOTE: It would probably be wise to disable message delivery for the receiving list before migrating the archive, or to perform the migration before any subscribers are added to the receiving list. Otherwise all subscribers of the receiving list will receive a copy of every archived message, which could be a substantial volume of mail for an old or large list.

FOOTNOTES

[1] http://mail.python.org/pipermail/mailman-developers/2003-April/015017.html

[2] http://wiki.list.org/display/DOC/Processing+old+mbox+archives+with+procmail-formail
