cdbeland/moss
Folders and files
| Name | Name | Last commit date | ||
|---|---|---|---|---|
Repository files navigation
See https://en.wikipedia.org/wiki/Wikipedia:Typo_Team/moss HARDWARE REQUIREMENTS * It takes at least ~6GB of RAM just for the spell-check dictionary to load * Parallelized Python code will automatically spawn one child process per CPU core (including hyperthreaded "cores") UBUNTU DEPENDENCIES sudo apt-get install git pyflakes3 pylint python3-pep8 python3-dev g++ protobuf-c-compiler protobuf-compiler mariadb-server FEDORA DEPENDENCIES sudo dnf install git python3-flake8 protobuf-c-compiler pylint python3-devel protobuf-devel g++ mariadb-server GIT SETUP (optional) git config --global color.ui true git config --global push.default simple CLONE AND SET UP: cd ~ # Or wherever you'd like the clone to be parented git clone https://github.com/cdbeland/moss.git cd moss/ ./reset_environment.sh sudo mkdir -p /var/local/moss/bulk-wikipedia/ sudo chown -R $USER /var/local/moss sudo service mariadb restart cat first-time.sql | perl -pe "s/beland/$USER/g" | sudo mysql sudo cp moss-mysqld.cnf /etc/my.cnf.d/ sudo service mariadb restart DOWNLOAD DATA: # Do not do this if Wikipedia or Wiktionary dumps are in progress at: # https://dumps.wikimedia.org/backup-index-bydb.html ./update_downloads_parallel.sh # Wait for download to complete; you can watch progress with: # tail -f /var/local/moss/bulk-wikipedia/*log RUN SPELL CHECK AND FRIENDS: ./run_moss.sh TO SPELL CHECK ONLY ARTICLES WITH TITLES STARTING WITH "X": ./run_moss.sh --spell-check-only X Omit X to run spell check only and skip other reports. You can also specify any letter or number, or "BEFORE_A" and "AFTER_Z".