This Python project extracts key information from scanned PDF invoices using OCR (Optical Character Recognition) with Tesseract and image conversion tools. It reads PDF files from an input folder, ...
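The flow described above (read PDFs from an input folder, convert pages to images, run OCR) could be sketched roughly as follows. This is only an illustration, not the project's actual code: the function names `list_pdfs`, `ocr_pdf`, and `ocr_folder` and the `dpi=300` default are assumptions, and `pdf2image` is assumed to be the image conversion tool, with `pytesseract` as the Tesseract wrapper.

```python
from pathlib import Path


def list_pdfs(input_dir):
    """Return the PDF files in input_dir, sorted by path (hypothetical helper)."""
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def ocr_pdf(pdf_path, dpi=300):
    """Render each page of a PDF to an image and OCR it with Tesseract."""
    # Imported lazily so the module loads even without the OCR stack installed.
    # pdf2image needs Poppler; pytesseract needs the Tesseract binary.
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(str(pdf_path), dpi=dpi)  # list of PIL images
    return "\n".join(pytesseract.image_to_string(page) for page in pages)


def ocr_folder(input_dir):
    """Map each PDF file name in input_dir to its OCR text."""
    return {pdf.name: ocr_pdf(pdf) for pdf in list_pdfs(input_dir)}
```

Calling `ocr_folder("input")` (the folder name is a placeholder) would return one text blob per PDF. A higher `dpi` generally helps Tesseract on small print at the cost of slower conversion.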
This project is a Python-based OCR tool that extracts text from images and PDF documents using pytesseract and pdf2image. I built it to pull out useful information (like names, titles, ...
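Once OCR has produced plain text, pulling out fields such as names and titles might look like the sketch below. It assumes the text contains lines shaped like `Label: value`, which may not match the real documents. The function name `extract_labeled_fields` and the labels used are hypothetical.

```python
import re


def extract_labeled_fields(text, labels):
    """Return {label: value} for lines shaped like 'Label: value'.

    Assumes one field per line; labels not found are simply omitted.
    """
    fields = {}
    for label in labels:
        # Match the label at the start of a line, ignoring case and
        # tolerating stray spaces or tabs that OCR often introduces.
        pattern = rf"^[ \t]*{re.escape(label)}[ \t]*:[ \t]*(.+?)[ \t]*$"
        match = re.search(pattern, text, flags=re.IGNORECASE | re.MULTILINE)
        if match:
            fields[label] = match.group(1)
    return fields
```

For example, `extract_labeled_fields(ocr_text, ["Name", "Title"])` would return only the labels that actually appear in `ocr_text`. Real scans are often noisier than this pattern allows, so fuzzy matching or layout-aware parsing may be needed in practice.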