Audit Vouching Code - V 4.1 – Code and Steps to Install

Step 1: Installing the prerequisites

1. Install Python 3.9 64-bit (if you already have another version, install this one alongside it and install the pip packages manually into this version)
- I insist on Python 3.9 because Windows OCR works only on Python 3.9 and not on later versions
- Ensure that you tick “Add Python to PATH” during installation, or else things don't work.
2. Install Tesseract OCR
- https://github.com/tesseract-ocr/tesseract/releases/download/5.5.0/tesseract-ocr-w64-setup-5.5.0.20241111.exe
- Ensure that it is installed into “C:\Program Files\Tesseract-OCR”
3. Install Poppler
- Link – https://blog.withkarthik.com/wp-content/uploads/2025/03/poppler-24.08.0.zip
- Extract the contents into “C:\Program Files\”
  - Ensure that the resulting path looks like the one shown in the screenshot below

4. Open the command prompt, run the following command, and wait a few minutes for the required packages to install:

pip install pillow pandas clipboard opencv-python numpy pdf2image asyncio pytesseract pymupdf psutil winrt
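
If something does not work later, it helps to first confirm that the prerequisites are actually in place. Below is a minimal sketch of such a check; it is not part of the Audit Vouching Code itself. The function names are hypothetical, and the import names (for example PIL for pillow, cv2 for opencv-python, fitz for pymupdf) are my assumption of how each pip package is imported. Run it with the same Python 3.9 that you installed the packages into.

```python
import importlib.util
import struct
import sys
from pathlib import Path

# Import names for the packages in the pip command above. Some differ from
# the pip name: pillow -> PIL, opencv-python -> cv2, pymupdf -> fitz.
# asyncio is left out because it already ships with Python itself.
REQUIRED_MODULES = ["PIL", "pandas", "clipboard", "cv2", "numpy",
                    "pdf2image", "pytesseract", "fitz", "psutil", "winrt"]

# Folder named in the Tesseract step above.
TESSERACT_DIR = Path(r"C:\Program Files\Tesseract-OCR")


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def is_python39_64bit():
    """True when the running interpreter is 64-bit Python 3.9."""
    bits = struct.calcsize("P") * 8  # pointer size in bits
    return sys.version_info[:2] == (3, 9) and bits == 64


def report():
    """Print a short summary of what is in place and what is missing."""
    print("Python 3.9 64-bit:", is_python39_64bit())
    print("Tesseract found:", (TESSERACT_DIR / "tesseract.exe").is_file())
    print("Missing packages:", missing_modules(REQUIRED_MODULES) or "none")


if __name__ == "__main__":
    report()
```

Any package listed as missing can be installed again with pip for that specific Python version.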

Step 2: Getting the code running

Create a new folder wherever you want to run this code (ensure that the drive has free space of at least twice the size of the files you are importing)
Download the ZIP file below and extract all its contents into that folder

https://blog.withkarthik.com/wp-content/uploads/2025/06/Audit-Vouching-Code.zip
Run the “Audit Compiled Codecpython-39.pyc”
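
The free-space note in the first step above can be checked before importing a large batch. This is only a rough sketch and not something the tool does for you; the function names and the factor parameter are hypothetical, with the default of 2 taken from the "twice the space" rule.

```python
import shutil
from pathlib import Path


def total_size(files):
    """Sum the sizes, in bytes, of the given files."""
    return sum(Path(f).stat().st_size for f in files)


def has_enough_space(work_folder, files, factor=2):
    """True when the drive holding work_folder has at least `factor`
    times the combined size of `files` available as free space."""
    free = shutil.disk_usage(work_folder).free
    return free >= factor * total_size(files)
```

Call has_enough_space with the folder you created and the list of files you plan to import; if it returns False, free up space or pick another drive.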

Functionalities Explained

1. Import the required files

2. Select the files to import

3. The remaining steps depend on what you need. The UI is mostly self-explanatory, so you can explore it and pick the functions you need

Templates – GSTR 7A and PF Payment Challan

I will share the regex in a separate post

Other Important Aspects:

Currently, for some reason, the import folder option does not work and makes the program freeze, so please use the import files option instead
Select the Native PDF engine for normal computer-generated PDFs.
For scanned PDFs, use Windows OCR or Tesseract as the case may be. Both have their own advantages and disadvantages, so I usually use a combination of the two
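
If you are not sure which engine a file needs, one way to guess is to look at how much text can be read directly from the PDF. The sketch below is my own illustration and not how the tool decides internally. The function names and the min_chars threshold are hypothetical; the text extraction uses PyMuPDF, which is among the packages installed in Step 1.

```python
def needs_ocr(page_texts, min_chars=20):
    """Guess whether a PDF is scanned from the text extracted per page.

    page_texts: one string per page, as returned by a text extractor.
    min_chars: hypothetical threshold; a page with fewer non-space
    characters than this is treated as having no usable text layer.
    """
    if not page_texts:
        return True
    empty = sum(1 for t in page_texts if len("".join(t.split())) < min_chars)
    # OCR is suggested when more than half of the pages have no text.
    return empty > len(page_texts) / 2


def extract_page_texts(pdf_path):
    """Read the text layer of every page using PyMuPDF."""
    import fitz  # PyMuPDF; imported here so needs_ocr works without it

    with fitz.open(pdf_path) as doc:
        return [page.get_text() for page in doc]
```

When needs_ocr(extract_page_texts(path)) returns False, the Native PDF engine is likely enough; otherwise try Windows OCR or Tesseract.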

<Note>
1. Ensure that the files are of a similar nature, because the program vouches every file that is imported
2. Depending on the functionality, check whether the file is scanned or a normal computer-generated invoice. Regex matching works only on computer-generated files
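
To see why regex matching is reliable only on computer-generated text, here is a small sketch. The two patterns are hypothetical and written purely for illustration; they are not the templates shipped with the tool. The sample strings are made up, and the second one imitates the kind of character mix-ups that OCR on a scan can produce.

```python
import re

# Hypothetical patterns for illustration only.
INVOICE_NO = re.compile(
    r"Invoice\s*(?:No\.?|Number)\s*[:\-]?\s*([A-Z0-9/\-]+)", re.IGNORECASE)
AMOUNT = re.compile(
    r"Total\s*(?:Amount)?\s*[:\-]?\s*(?:Rs\.?|INR)?\s*([\d,]+\.\d{2})",
    re.IGNORECASE)


def extract_fields(text):
    """Pull the invoice number and total amount out of plain text."""
    inv = INVOICE_NO.search(text)
    amt = AMOUNT.search(text)
    return {
        "invoice_no": inv.group(1) if inv else None,
        "amount": float(amt.group(1).replace(",", "")) if amt else None,
    }


clean = "Invoice No: INV/2025/001\nTotal Amount: Rs. 12,500.00"
garbled = "lnvoice N0: INV/2O25/OO1\nTota1 Arnount: Rs. I2,5OO.OO"

print(extract_fields(clean))    # {'invoice_no': 'INV/2025/001', 'amount': 12500.0}
print(extract_fields(garbled))  # {'invoice_no': None, 'amount': None}
```

The clean text matches exactly, while a few misread characters are enough to make both patterns fail.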