Hi All,
I have a requirement to read the content of PDFs that contain both tables and paragraphs (structured and unstructured content) and store it in database tables without any manual intervention.
I have used Appian's built-in document extraction feature, but it does not read the documents accurately: some fields are extracted, while others are not.
https://docs.appian.com/suite/help/23.2/evaluate-doc-extraction.html
We would like to stick with Appian because it keeps the documents in Appian rather than using Google.
Any suggestions would be helpful.
Thanks
How many training cycles did you do?
What is meant by a training cycle? Can you please explain? So far I have passed the same PDF through the workflow 10 times, and it is still unable to read the fields.
In my blog post I describe how training works: appian.rocks/.../
Hi Stefan, I cannot see the option. Is it a license issue?
In the Build view, click NEW > AI Skill.
This needs to be enabled by Appian.
We are currently on 23.1. I guess this feature is available in the next version, 23.2?
Yes
Hi Stefan,
We have upgraded to 23.2 and provided the training documents in the AI skill, but "ExtractedData" still shows nothing in the extraction output. When the process reaches the reconcile step, the View form shows that only 30% of the document has been scanned, and half of the tables are missed.
What I have seen is that the first time, we have to intervene manually in the Reconcile task to map the unreadable fields. After that, if the same set of documents is passed through again, it can read a few key-value pairs, but it still struggles with reading tables.
Our business requirement is that customers send us a PDF via email to a process model, which reads the PDF (containing both tables and non-tabular content) and stores the data in the DB without any manual intervention from the customer.
I guess this new AI training is meant for simple documents such as invoices, which have straightforward key-value pairs.
Our document is a complex report with a complex table whose header names contain special characters like
Is there any other way to read the PDF?
As I already tried to explain, the training for extraction happens ONLY in the reconciliation step. You will need to repeat that at least a few times with various documents, so that the machine learning model understands what you want it to do.
Then, you add a data validation step after the extraction to decide whether the extraction was good or needs a manual reconciliation.
And if you have the feeling that the OOTB extraction is not powerful enough, I suggest contacting Appian to discuss your specific use case.
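
To make the data validation step mentioned above more concrete, here is a minimal sketch of the decision logic. It is written in Python purely for illustration and is not Appian code or an Appian API; in Appian, something similar would most likely live in an expression rule that feeds a gateway after the extraction. The field names in REQUIRED_FIELDS, the ValidationResult type, and the validate_extraction function are all hypothetical, and the checks (required fields present, a minimum number of table rows) are only assumptions about what "good" might mean for your report.

```python
from dataclasses import dataclass, field

# Hypothetical field names; replace them with the fields your report contains.
REQUIRED_FIELDS = ("report_id", "report_date", "total_amount")


@dataclass
class ValidationResult:
    needs_reconciliation: bool
    problems: list[str] = field(default_factory=list)


def validate_extraction(
    extracted: dict,
    table_rows: list[dict],
    min_table_rows: int = 1,
) -> ValidationResult:
    """Decide whether an extraction can be stored automatically.

    `extracted` holds the key-value pairs and `table_rows` the table rows
    returned by the extraction step (both shapes are assumptions).
    """
    problems = []

    # Every required field must be present and non-blank.
    for name in REQUIRED_FIELDS:
        value = extracted.get(name)
        if value is None or str(value).strip() == "":
            problems.append(f"missing field: {name}")

    # A report with tables should yield at least a minimum number of rows.
    if len(table_rows) < min_table_rows:
        problems.append(
            f"expected at least {min_table_rows} table row(s), got {len(table_rows)}"
        )

    # Any problem routes the document to manual reconciliation.
    return ValidationResult(needs_reconciliation=bool(problems), problems=problems)
```

If needs_reconciliation is false, the process can write the data straight to the DB; otherwise it routes the document to a reconciliation task, which is also where the model receives further training.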