A Python program to automatically download PDF documents from the Brazilian Chamber of Deputies (Câmara dos Deputados) website.
python -m venv venv
On macOS/Linux:
source venv/bin/activate
On Windows:
venv\Scripts\activate
pip install -r requirements.txt
python main.py
# Download 5 years back with 30 threads
python main.py 5 30
# Download 1 year back with 20 threads
python main.py 1 20
- First argument: Number of years back to download (default: 2)
- Second argument: Number of concurrent threads (default: 40)
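The two positional arguments and their defaults could be handled roughly as in the sketch below. This is only an illustration, not the actual code in main.py: the names `parse_cli_args` and `run_concurrently` are hypothetical, and using `ThreadPoolExecutor` for the worker threads is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# Defaults as documented above.
DEFAULT_YEARS_BACK = 2
DEFAULT_THREADS = 40


def parse_cli_args(argv):
    """Return (years_back, threads) from the positional arguments.

    `argv` is the argument list without the script name, e.g. sys.argv[1:].
    Missing arguments fall back to the documented defaults.
    """
    years_back = int(argv[0]) if len(argv) > 0 else DEFAULT_YEARS_BACK
    threads = int(argv[1]) if len(argv) > 1 else DEFAULT_THREADS
    return years_back, threads


def run_concurrently(task, items, threads):
    """Apply `task` to every item using at most `threads` worker threads.

    Results are returned in the same order as `items`.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(task, items))
```

With this sketch, `parse_cli_args(["5", "30"])` corresponds to `python main.py 5 30`, and an empty list corresponds to running `python main.py` with no arguments.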
- Downloads folder: ./downloads/YEAR/MONTH_NAME/
- Progress file: download_progress.json (tracks completed downloads)
- Log file: camara_downloader.log
downloads/
├── 2023/
│   ├── 01_Janeiro/
│   │   └── DCD0020230101000490000.PDF
│   └── 02_Fevereiro/
└── 2024/
    └── 01_Janeiro/
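As an illustration of the layout above, a target folder of the form YEAR/MONTH_NAME could be built as in the following sketch. The function name `month_folder` is hypothetical, and the exact spelling the program uses for month names other than Janeiro and Fevereiro (for example, whether Março keeps its cedilla) is an assumption.

```python
from datetime import date
from pathlib import Path

# Portuguese month names; only "Janeiro" and "Fevereiro" are confirmed
# by the example tree, the rest are assumed to follow the same pattern.
MONTH_NAMES_PT = [
    "Janeiro", "Fevereiro", "Março", "Abril", "Maio", "Junho",
    "Julho", "Agosto", "Setembro", "Outubro", "Novembro", "Dezembro",
]


def month_folder(base_dir, day):
    """Return base_dir/YEAR/MM_MonthName for the given date."""
    month_dir = f"{day.month:02d}_{MONTH_NAMES_PT[day.month - 1]}"
    return Path(base_dir) / str(day.year) / month_dir


# Example: the folder for a document dated 1 January 2023.
target = month_folder("downloads", date(2023, 1, 1))
target_as_text = target.as_posix()  # "downloads/2023/01_Janeiro"
```

Prefixing the folder with the zero-padded month number keeps the directories in calendar order when listed alphabetically.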
The program automatically resumes from where it left off if interrupted.
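Resuming of this kind is typically implemented by recording each finished download and skipping recorded items on the next run. The sketch below shows one way to do that with a JSON file; it is not the program's actual implementation. The helper names are hypothetical, and the file format (a flat JSON list of file names) is an assumption about download_progress.json.

```python
import json
from pathlib import Path


def load_completed(progress_path):
    """Return the set of already-downloaded items (empty on first run)."""
    path = Path(progress_path)
    if not path.exists():
        return set()
    with path.open(encoding="utf-8") as fh:
        return set(json.load(fh))


def mark_completed(progress_path, completed, item_id):
    """Record one finished download and persist the whole set.

    The data is written to a temporary file first and then renamed over
    the progress file, so an interruption mid-write does not leave a
    truncated progress file behind.
    """
    completed.add(item_id)
    tmp_path = Path(str(progress_path) + ".tmp")
    with tmp_path.open("w", encoding="utf-8") as fh:
        json.dump(sorted(completed), fh)
    tmp_path.replace(progress_path)


def pending(all_ids, completed):
    """Return the items that still need to be downloaded, in order."""
    return [item for item in all_ids if item not in completed]
```

On startup, such a program would call `load_completed`, pass the result to `pending` to filter the work list, and call `mark_completed` after each successful download. If several threads share the same set, the calls to `mark_completed` would also need a lock.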