Appian vs. Google Cloud for Image Classification and OCR

Certified Associate Developer

So far, I have only been working on training cases or debugging existing projects, so please bear with me if this is a newbie issue.

I have a training data set of various documents (JPG and PDF) of different kinds, which I intend to classify by certain features such as prevailing color or repetitive layout (e.g. invoice type 1, invoice type 2, not an invoice). In a second step, I would like to OCR certain predefined areas of each document and extract, for example, the address of the company sending the invoice and the date.
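To make the "prevailing color" feature concrete, here is a minimal sketch in Python. It assumes the page has already been decoded into RGB tuples by some imaging library; the function name `prevailing_color` and the bucket size `step` are illustrative choices of mine, not part of Appian or Google Cloud.

```python
from collections import Counter
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]


def prevailing_color(pixels: Iterable[RGB], step: int = 32) -> RGB:
    """Return the most common color after coarse quantization.

    Each channel is divided into buckets of width `step`, so that slightly
    different shades (scan noise, JPG artifacts) count as the same color.
    """
    buckets = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    (qr, qg, qb), _count = buckets.most_common(1)[0]
    half = step // 2
    # Report the center of the winning bucket as the representative color.
    return (qr * step + half, qg * step + half, qb * step + half)
```

A value like this could serve as one input feature among several; on its own it would probably not separate invoice types that share a color scheme.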

The architecture I am envisioning is the following:

  1. In Appian, I have a UI where I can upload new files, which Appian saves locally in a directory with filenames like “document_12345689.pdf”. Metadata such as upload timestamp, user, and original file name should be persisted in an (Appian) database.
  2. The file is then submitted to Google Cloud, which should perform the classification step and deliver the label back to Appian to be saved in the database.
  3. The page should be auto-cropped, i.e. black or white margins are removed. This can be done anywhere, but most probably with Google Cloud as well. The resulting area is also persisted in Appian.
  4. If the document is, for example, an invoice, OCR should be performed (most probably by Google) on certain regions of the document, e.g. a bounding box spanning from the middle of the page to the right margin in the upper 10% of the cropped page. The OCR results should again be delivered back to Appian and persisted.
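Steps 3 and 4 can be sketched in plain Python to pin down what "auto-crop" and "upper right region" mean. This is only an illustration under assumptions: the page is given as a grid of grayscale values (0 = black, 255 = white), the thresholds `dark` and `light` are guesses, and the names `auto_crop_box` and `ocr_region` are hypothetical, not an Appian or Google Cloud API.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def _is_margin(values: Sequence[int], dark: int, light: int) -> bool:
    """A line counts as margin if it is entirely near-black or entirely near-white."""
    return all(v <= dark for v in values) or all(v >= light for v in values)


def auto_crop_box(pixels: List[List[int]], dark: int = 16, light: int = 239) -> Box:
    """Peel uniform black or white rows and columns off the edges, one at a time."""
    top, bottom = 0, len(pixels)
    left, right = 0, len(pixels[0])
    changed = True
    while changed and top < bottom and left < right:
        changed = True
        if _is_margin(pixels[top][left:right], dark, light):
            top += 1
        elif _is_margin(pixels[bottom - 1][left:right], dark, light):
            bottom -= 1
        elif _is_margin([row[left] for row in pixels[top:bottom]], dark, light):
            left += 1
        elif _is_margin([row[right - 1] for row in pixels[top:bottom]], dark, light):
            right -= 1
        else:
            changed = False
    # For a completely blank page the returned box is empty.
    return (left, top, right, bottom)


def ocr_region(crop: Box) -> Box:
    """Box from the middle of the cropped page to its right edge, upper 10% of its height."""
    left, top, right, bottom = crop
    width, height = right - left, bottom - top
    return (left + width // 2, top, right, top + max(1, height // 10))
```

Both boxes are plain coordinate tuples, so they could be stored in the database next to the classification label and passed to whichever service performs the OCR.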

Does this approach make sense? Are there things which can be done by Appian out of the box? Is there some instructional video or training material on a similar issue?
