This project contains the text extracted from images using Optical Character Recognition (OCR). The extracted text is saved in a text file with the same name as the input image.
To run the OCR processing, you need the following Python packages:
- pillow
- pytesseract
- opencv-python
- numpy
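As a quick sanity check before running anything, the sketch below reports which of the packages listed above cannot be found in the current environment. The helper name `missing_packages` and the `REQUIRED` mapping are illustrative and not part of the project. The mapping reflects the fact that some pip package names differ from their import names (pillow is imported as `PIL`, opencv-python as `cv2`).

```python
from importlib.util import find_spec

# pip package name -> module name used in `import` statements
REQUIRED = {
    "pillow": "PIL",
    "pytesseract": "pytesseract",
    "opencv-python": "cv2",
    "numpy": "numpy",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None  # looks the module up without importing it
    ]


print("missing:", ", ".join(missing_packages()) or "none")
```
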
You can install these packages using the following command:
pip install -r requirements.txt

The project is organized as follows:
- main.py: The main script to process images and extract text.
- requirements.txt: List of required Python packages.
- output/: Directory containing the extracted text files.
- target/: Directory containing the input images to be processed.

To process your images:
- Ensure you have Tesseract OCR installed on your system. You can download it from here.
- Place the images you want to process in the target/ directory.
- Run the main.py script:
  python main.py
- The extracted text will be saved in the output/ directory with the same name as the input image but with a .txt extension.
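
The workflow described above (read each image in target/, run OCR, write a matching .txt file to output/) can be sketched roughly as follows. This is not the project's actual main.py, whose source is not shown here. It is a minimal illustration that assumes Pillow and pytesseract are used directly. The function names, the `ocr` parameter, and the set of accepted file extensions are all hypothetical.

```python
from pathlib import Path

# Assumed set of image extensions; the real script may accept others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def tesseract_ocr(image_path):
    """Default OCR backend: Pillow opens the image, pytesseract reads it."""
    # Imported lazily so the path helpers below work without these packages.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def output_path_for(image_path, output_dir):
    """Same base name as the input image, but with a .txt extension."""
    return Path(output_dir) / (Path(image_path).stem + ".txt")


def process_directory(target_dir="target", output_dir="output", ocr=tesseract_ocr):
    """Run OCR on every image in target_dir and return the files written."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for image_path in sorted(Path(target_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that does not look like an image
        dest = output_path_for(image_path, out)
        dest.write_text(ocr(image_path), encoding="utf-8")
        written.append(dest)
    return written
```

Under these assumptions, calling `process_directory()` with no arguments would turn target/photo.png into output/photo.txt. Passing the OCR step in as a parameter is only a convenience of this sketch: it lets the file handling be exercised with a stand-in function on a machine where Tesseract is not installed.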
This project is licensed under the MIT License.