cdbeland/moss


See https://en.wikipedia.org/wiki/Wikipedia:Typo_Team/moss

HARDWARE REQUIREMENTS
* Just loading the spell-check dictionary takes about 6 GB of RAM or more
* Parallelized Python code will automatically spawn one child process
  per CPU core (including hyperthreaded "cores")
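
The two requirements above can be checked before a run. This is only a rough sketch, not part of moss: the function names total_ram_gib and ram_ok are made up for illustration, and it assumes a Linux machine with /proc/meminfo and the coreutils nproc command. The 6 GiB threshold covers the dictionary alone, so treat a pass as a minimum.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of the moss repository.
# Assumes Linux (/proc/meminfo) and GNU coreutils (nproc).

# Print total physical RAM in whole GiB (rounded down), read from a
# meminfo-style file. Defaults to /proc/meminfo.
total_ram_gib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Succeed if the RAM in $1 (GiB) meets the minimum in $2 (default 6,
# taken from the dictionary figure above).
ram_ok() {
    [ "$1" -ge "${2:-6}" ]
}

ram=$(total_ram_gib)
cores=$(nproc)
echo "RAM: ${ram} GiB; expect roughly ${cores} child processes"
if ! ram_ok "$ram"; then
    echo "Warning: under 6 GiB of RAM; the dictionary may not load" >&2
fi
```

Because the RAM figure is rounded down, a machine sold as "6 GB" may report 5 here and trigger the warning.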

UBUNTU DEPENDENCIES
sudo apt-get install git pyflakes3 pylint python3-pep8 python3-dev g++ protobuf-c-compiler protobuf-compiler mariadb-server

FEDORA DEPENDENCIES
sudo dnf install git python3-flake8 protobuf-c-compiler pylint python3-devel protobuf-devel g++ mariadb-server

GIT SETUP (optional)
git config --global color.ui true
git config --global push.default simple

CLONE AND SET UP:
cd ~
# Or whichever directory should contain the clone
git clone https://github.com/cdbeland/moss.git
cd moss/
./reset_environment.sh
sudo mkdir -p /var/local/moss/bulk-wikipedia/
sudo chown -R $USER /var/local/moss
sudo service mariadb restart
cat first-time.sql | perl -pe "s/beland/$USER/g" | sudo mysql
sudo cp moss-mysqld.cnf /etc/my.cnf.d/
sudo service mariadb restart

DOWNLOAD DATA:
# Do not do this if Wikipedia or Wiktionary dumps are in progress at:
#  https://dumps.wikimedia.org/backup-index-bydb.html
./update_downloads_parallel.sh

# Wait for download to complete; you can watch progress with:
# tail -f /var/local/moss/bulk-wikipedia/*log

RUN SPELL CHECK AND FRIENDS:
./run_moss.sh

TO SPELL CHECK ONLY ARTICLES WITH TITLES STARTING WITH "X":
./run_moss.sh --spell-check-only X

Without X, this runs the spell check alone and skips the other reports.
X can be any letter or number, or "BEFORE_A" or "AFTER_Z".
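
To spell check one title group at a time, the option can be wrapped in a loop. This is a sketch only: moss_shards and the RUN variable are hypothetical and not part of moss, and whether these 28 arguments together cover every title (digits are left out here) is an assumption.

```shell
#!/bin/sh
# Hypothetical wrapper; not part of the moss repository.

# Print one argument per line: BEFORE_A, A through Z, AFTER_Z.
moss_shards() {
    echo BEFORE_A
    for letter in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z; do
        echo "$letter"
    done
    echo AFTER_Z
}

# Dry run by default: print each command. Set RUN=1 to execute them.
for shard in $(moss_shards); do
    if [ "${RUN:-0}" = "1" ]; then
        ./run_moss.sh --spell-check-only "$shard"
    else
        echo "./run_moss.sh --spell-check-only $shard"
    fi
done
```

Run it from the moss/ directory. The default dry run prints the commands so they can be reviewed first.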

About

Searching for misspellings, bad grammar, and violations of the Manual of Style in Wikipedia
