This project provides scripts for retrieving Postgres mailing list archives, populating a Postgres database with them, and rendering the archives with Next.js.
- Run `git clone --recurse-submodules lanterndata/pg`
- Use `./scripts/scrape.sh` to download the mailing list archives and populate a Postgres database. Note that `DATABASE_URL` should be set in the environment.
- Use `pgsql-lists-offline` to download the mailing list archives to a local `data` directory. Note the dependencies on `parallel` and `curl`. See more below.
- Use `yarn scrape` to process the archives and populate a Postgres database. Note that `DATABASE_URL` should be set in the environment.
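
The steps above rely on `DATABASE_URL` being set and on `parallel` and `curl` being installed. A minimal pre-flight sketch is shown below. The `check_prereqs` function is hypothetical and not part of this repository; it only reports what is missing and does not run any scraping itself.

```shell
# Hypothetical helper, not part of this repository's scripts.
# Reports missing prerequisites on stderr and returns 1 if any is missing.
check_prereqs() {
  local missing=0
  local tool

  # The scrape steps read the connection string from DATABASE_URL.
  if [ -z "${DATABASE_URL:-}" ]; then
    echo "DATABASE_URL is not set" >&2
    missing=1
  fi

  # Every argument is treated as a command that must be on PATH.
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing dependency: $tool" >&2
      missing=1
    fi
  done

  return "$missing"
}
```

One possible use is `check_prereqs parallel curl && ./scripts/scrape.sh`, so that the scrape only starts when everything it needs is in place.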

To create a new migration:

    yarn dbmate new <migration-name>

With `DATABASE_URL` set in the environment, apply pending migrations:

    yarn dbmate up

The `body_dense_vector` column in `messages` was added and populated using the Lantern dashboard, and is not in the migration files.
Source: https://github.com/wsdookadr/pgsql-lists-offline
Downloads mailing list archives for PostgreSQL, PostGIS and pgRouting.
The mailing list archives are downloaded in mbox format and can be read with any mbox-aware email client (for example, mutt).
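
Because mbox is plain text in which each message starts with a `From ` separator line, a downloaded archive can also be inspected without an email client. The sketch below gives an approximate message count. The `count_mbox_messages` name is hypothetical, and the count can be off for mbox variants that do not escape body lines beginning with `From `.

```shell
# Hypothetical helper: approximate the number of messages in an mbox file
# by counting the "From " separator lines that begin each message.
count_mbox_messages() {
  grep -c '^From ' "$1"
}
```

Calling `count_mbox_messages path/to/archive` prints a single number; the path is a placeholder for whichever file was downloaded.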
To install the dependencies:

    sudo apt-get install parallel curl

To list the available mailing lists:

    ./pgsql-lists-offline.sh -l

To download a list (for example pgsql-general):

    ./pgsql-lists-offline.sh -g pgsql-general

To download a list for a particular month:

    ./pgsql-lists-offline.sh -g pgsql-general -m 202403
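
To fetch several consecutive months, the per-month command can be repeated over a range of `YYYYMM` values. The sketch below is a dry run: it only prints the commands and does not download anything. Both function names are hypothetical, and only the `-g` and `-m` options shown above are assumed.

```shell
# Hypothetical helpers, not part of pgsql-lists-offline.

# Print every month from START to END (inclusive) in YYYYMM form.
months_between() {
  local year=$(( $1 / 100 ))
  local month=$(( $1 % 100 ))
  local end=$2

  while [ $(( year * 100 + month )) -le "$end" ]; do
    printf '%04d%02d\n' "$year" "$month"
    month=$(( month + 1 ))
    # Roll over from December to January of the next year.
    if [ "$month" -gt 12 ]; then
      month=1
      year=$(( year + 1 ))
    fi
  done
}

# Print (but do not run) one download command per month in the range.
print_download_commands() {
  local list=$1
  local m
  for m in $(months_between "$2" "$3"); do
    echo "./pgsql-lists-offline.sh -g $list -m $m"
  done
}
```

For example, `print_download_commands pgsql-general 202311 202402` prints four commands, one for each month from November 2023 through February 2024. The output can be reviewed first and then piped to `sh` to run the downloads.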