Do you spend a lot of time on invoice entry? How can OCR and intelligent document extraction help you?

A common scene in finance departments: tables overflowing with paper, corridors filled with boxes of documents, and hardly any space to walk. Some of the staff are busy reading the papers and keying supplier invoices into the accounts payable module. They look bored and tired of doing this every day. Is this a common scene in your department as well? Can we do this better?

A better way to manage this would be to use Optical Character Recognition (OCR) and Intelligent Document Extraction with a scanner.

What is optical character recognition (OCR)?

OCR turns an image that contains text into digital text. An image is made up of pixels, so you cannot select or copy the text in it. Processing an image with OCR puts a digital layer of text on top of the image. The most common output format is a “searchable PDF”. (A non-searchable PDF is just an image.) However, the text in a searchable PDF still cannot be used directly, because it has no structure. For example, you need to know which text is the invoice number, which text is the date, and so on. This is where intelligent document extraction comes in.

What is intelligent document extraction?

“Intelligent document extraction” means extracting the text and classifying each piece as invoice number, invoice date, supplier name, supplier address, invoice total, etc. The most difficult part of intelligent extraction is that each supplier’s documents (e.g. invoices) come in a different format.
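To make the idea concrete, here is a minimal sketch in Python of turning raw OCR text into labelled fields. It uses simple regular expressions, not the trained models that commercial products rely on, and it only works for the one layout shown. The sample text, the field names, and the `extract_fields` function are all made up for illustration.

```python
import re

# Hypothetical OCR output for one invoice layout (made up for illustration).
SAMPLE_OCR_TEXT = """ACME Supplies Pte Ltd
Invoice No: INV-2041
Invoice Date: 2023-05-17
Total: 1,250.00
"""

# One pattern per field. This is a toy stand-in for a trained model:
# it only understands the exact labels used in the sample above.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice No:\s*(\S+)"),
    "invoice_date": re.compile(r"Invoice Date:\s*(\d{4}-\d{2}-\d{2})"),
    "invoice_total": re.compile(r"Total:\s*([\d,]+\.\d{2})"),
}


def extract_fields(ocr_text: str) -> dict:
    """Return a dict of field name -> extracted text (None if not found)."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1) if match else None
    return fields


print(extract_fields(SAMPLE_OCR_TEXT))
```

A supplier who writes “Inv #” instead of “Invoice No:” would get `None` for that field, which illustrates why fixed rules struggle when every supplier uses a different format.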

Even when the formats are similar, invoices contain tables. These tables can run from a few rows to many rows and continue onto a second or third page. To extract data from tables, Intelligent Document Extraction uses natural language processing to understand their content.
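One practical consequence of multi-page tables is that the rows found on each page have to be joined back into a single table before the data is usable. The sketch below, in Python, assumes the extraction step has already returned one list of rows per page; it does not show the language processing itself. The data and the function names `merge_line_items` and `total_matches` are hypothetical.

```python
from decimal import Decimal

# Hypothetical extraction result: one list of (description, amount) rows per page.
pages = [
    [("Paper A4", Decimal("40.00")), ("Toner", Decimal("120.50"))],  # page 1
    [("Stapler", Decimal("15.00"))],  # page 2
]
stated_total = Decimal("175.50")  # invoice total as printed on the invoice


def merge_line_items(pages):
    """Join the rows from every page into one table, keeping page order."""
    return [row for page in pages for row in page]


def total_matches(rows, stated_total):
    """Check that the line amounts add up to the total printed on the invoice."""
    return sum((amount for _, amount in rows), Decimal("0")) == stated_total


rows = merge_line_items(pages)
print(len(rows), total_matches(rows, stated_total))
```

Comparing the sum of the rows with the printed total is a simple way to notice when a row was missed at a page break. `Decimal` is used instead of floating point so that currency amounts add up exactly.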

Why do you need to do document due diligence?

Would you process a supplier invoice with no letterhead? In Asia, this is the most basic due diligence. We also sometimes check that there is a company stamp and that the signature looks real. To do this automatically, however, you need image recognition trained to recognize different letterheads, stamps, and signatures. Make sure such image-recognition capabilities are available in the software before you purchase it.

So, again, what are the benefits?

  • Better employee motivation: Staff are more motivated when they do interesting work instead of the same data entry every day. This also helps you retain staff.
  • Better month-end closing: Because you can process supplier invoices faster, you can close the books at month-end faster.
  • Lower cost: Most OCR and intelligent document extraction services cost only a few cents per scanned copy. This is much cheaper than human labor.
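The cost point can be checked against your own numbers with a little arithmetic. The Python sketch below compares manual entry with a per-copy fee. Every figure in it (invoice volume, minutes per invoice, hourly wage, fee per copy) is a placeholder assumption, not data from any vendor or survey, so replace them with your own.

```python
def monthly_cost_manual(invoices: int, minutes_per_invoice: float, hourly_wage: float) -> float:
    """Labor cost of keying in the invoices by hand."""
    return invoices * minutes_per_invoice / 60 * hourly_wage


def monthly_cost_automated(invoices: int, cost_per_copy: float) -> float:
    """Processing fee when each invoice is charged per scanned copy."""
    return invoices * cost_per_copy


# Placeholder figures for illustration only; substitute your own.
invoices = 2000
manual = monthly_cost_manual(invoices, minutes_per_invoice=3, hourly_wage=15.0)
automated = monthly_cost_automated(invoices, cost_per_copy=0.05)

print(f"manual: {manual:.2f}  automated: {automated:.2f}")
```

The comparison covers only data entry time and the per-copy fee; it is meant as a starting point for your own estimate, not a full business case.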

Written by: Christopher Lim

