Importing Archives Into Gmane
Mailing list archives can be imported into Gmane.
Archives to be imported can be in one of two formats: either a tar
file of a one-message-per-file directory, where the files have names
that increase numerically, or a Unix mbox file. No other formats are
acceptable. A Unix mbox file is preferred.
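If your archive is a one-message-per-file directory, it can be turned
into the preferred mbox format before you send it. The following is a
minimal sketch using Python's standard mailbox module. The function
name directory_to_mbox is made up for illustration, and the sketch
assumes that each file name consists only of digits (1, 2, 3, ...);
adjust the filter if your files are named differently.

```python
import mailbox
from pathlib import Path


def directory_to_mbox(src_dir, mbox_path):
    """Append every numerically named file in src_dir to an mbox file.

    Hypothetical helper: assumes each file holds exactly one raw message
    and that file names consist only of digits.
    Returns the number of messages written.
    """
    files = [p for p in Path(src_dir).iterdir()
             if p.is_file() and p.name.isdigit()]
    # Sort by numeric value so that "2" comes before "10".
    files.sort(key=lambda p: int(p.name))

    box = mailbox.mbox(mbox_path)  # the file is created if it does not exist
    try:
        for path in files:
            # Raw bytes are accepted; mailbox adds the "From " separator line.
            box.add(path.read_bytes())
        box.flush()
    finally:
        box.close()
    return len(files)
```

Calling directory_to_mbox("archive-dir", "list.mbox") would write the
messages in numeric file order, so the order in which they were posted
is preserved in the mbox.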
If you wish to have an archive of a mailing list you administer
imported into Gmane, send an email
to Lars with the URL of the mailing list archive and the name of the
group it should be imported into. Duplicates are ignored when doing an
import, so a complete archive of the list is OK -- no pre-filtering of
messages is necessary.
The list admin/owner should OK this before the archive is imported.
If you're the list admin, please say so in the email where you
request the import. If not, please get in touch with the list admin
first and get approval before you request the import. The list
admin often has access to an mbox format mail archive for the list, so
get the URL for the archive at the same time.
For the technically inclined, here's how a mailing list archive
import is done. It's not always as straightforward as it may seem.
- If there are no articles already in the group, the archive is
simply imported.
- If there are already articles in the group, things get a bit more
complicated, since Gmane tries to keep at least a loose correlation
between the order of the article numbers and the sequence in which the
messages were posted.
- Let's say there are already articles 1-1000 in the group, and there
are 2000 (unstored) articles in the archive.
- Reception of new articles for the group is temporarily disabled.
- The archive is imported into the group, ignoring any articles that
have already been stored in the group. The articles from the archive
get article numbers 1001-3000.
- Articles 1-1000 are renamed to 3001-4000.
- Using a hacked-up version of the INN prunehistory command, the
storage tokens for these moved articles are altered.
- The overview file for the group is regenerated.
- Any articles that arrived during this operation are then processed
and injected into the group.
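The renumbering arithmetic in the steps above can be sketched as a
small simulation. This is not the actual import code, only an
illustration: the function name plan_import is made up, and the sketch
assumes that duplicates are recognised by their Message-ID, which the
description above does not state.

```python
def plan_import(existing, archive):
    """Simulate the article numbering after an archive import.

    existing: dict mapping article number -> Message-ID already in the group
    archive:  list of Message-IDs in archive order
    Returns (new_group, renumbered), where new_group maps the new article
    numbers to Message-IDs and renumbered maps old numbers to new numbers.

    Assumption for illustration: duplicates are detected by Message-ID.
    """
    known = set(existing.values())
    new_group = {}
    next_number = max(existing, default=0) + 1

    # Import the archive after the current articles, skipping anything
    # that is already stored (or repeated within the archive).
    for message_id in archive:
        if message_id in known:
            continue
        known.add(message_id)
        new_group[next_number] = message_id
        next_number += 1

    # Move the previously stored articles behind the imported ones,
    # keeping their relative order.
    renumbered = {}
    for old_number in sorted(existing):
        new_group[next_number] = existing[old_number]
        renumbered[old_number] = next_number
        next_number += 1

    return new_group, renumbered
```

With articles 1-1000 in the group and 2000 unstored articles in the
archive, the imported articles land on 1001-3000, and renumbered maps
1 to 3001 and 1000 to 4000, matching the example above. A mapping like
renumbered is also the kind of information needed to keep old article
numbers resolvable.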
This means that if you had read articles in the group before the
import, they'll suddenly become unread again, since they're
assigned new article numbers. This is inconvenient, but it's a
one-time inconvenience. Having the articles permanently out of
sequence would be a permanent inconvenience.
The web interface to the articles will still respect the old
article numbering, as well as the new. http://article.gmane.org/gmane.test/44
and http://article.gmane.org/gmane.test/348
both refer to the same article after one of these renumberings.