Using IDP to extract data from Driving License & Passport

Rahul Shrivastava over 2 years ago

Hello all,
We have a requirement to extract data from driving license and passport images. The catch is that the user may have photographed the document with a phone and sent it to us, so we convert the images to PDF. However, we are not able to extract data from these PDFs because they are semi-structured or unstructured documents.
Please let us know whether any kind of extraction is possible for such documents.

Thanks in advance.

sarathkumar13
Certified Senior Developer
over 2 years ago

What issue are you facing?
Did you train IDP on the forms (in this case, the images)?

Rahul Shrivastava over 2 years ago in reply to sarathkumar13

We are trying to extract data from the PDFs, but because the documents come in different formats, the extraction is neither consistent nor accurate. It also gives different results when we test the same document repeatedly.

sarathkumar13
Certified Senior Developer
over 2 years ago in reply to Rahul Shrivastava

I don't understand what you mean by different formats. Are you referring to different versions of the document?

The fields are the same in all documents, right? If so, try training on the document up to about 10 times. If it still cannot identify the fields, you probably need to check with Appian.

Rahul Shrivastava over 2 years ago in reply to sarathkumar13

Licences and passports come in different formats all over the world, with different field names and sometimes no field names at all. Our use case is broad and has to cover these variations.

sarathkumar13
Certified Senior Developer
over 2 years ago in reply to Rahul Shrivastava

OK. Then it is a little difficult to extract the information, since there are multiple versions of the documents.

You probably need to check with Appian for this use case.

Boddupalli Bhargav
Certified Senior Developer
over 2 years ago in reply to Rahul Shrivastava

You can try integrating with a third-party service such as AWS Textract Analyze ID. This model is trained to extract entities from ID documents with better accuracy. Click here for a reference utility. Appian's OTB extraction may be less accurate here because the inputs are images converted to documents.
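
For anyone who wants to see what the Textract route might look like outside Appian, here is a minimal Python sketch using the boto3 `analyze_id` call. It is only an illustration under assumptions, not the reference utility mentioned above: the function names `flatten_analyze_id_response` and `analyze_id_image`, the 80.0 confidence cut-off, and the `us-east-1` region are all hypothetical choices. Textract accepts JPEG and PNG bytes directly, so the sketch sends the phone photo itself rather than a converted PDF. Please check the AWS documentation for which ID types and countries Analyze ID currently supports before relying on it for worldwide documents.

```python
from typing import Any, Dict, List


def flatten_analyze_id_response(
    response: Dict[str, Any], min_confidence: float = 0.0
) -> List[Dict[str, str]]:
    """Turn an AnalyzeID response into one {field_name: value} dict per document.

    Fields with an empty value, or with a value confidence below
    min_confidence (hypothetical threshold, 0-100 scale), are skipped.
    """
    documents: List[Dict[str, str]] = []
    for doc in response.get("IdentityDocuments", []):
        fields: Dict[str, str] = {}
        for field in doc.get("IdentityDocumentFields", []):
            name = field.get("Type", {}).get("Text", "")
            detection = field.get("ValueDetection", {})
            value = detection.get("Text", "")
            confidence = detection.get("Confidence", 0.0)
            if name and value and confidence >= min_confidence:
                fields[name] = value
        documents.append(fields)
    return documents


def analyze_id_image(image_path: str, region_name: str = "us-east-1") -> List[Dict[str, str]]:
    """Send one ID image (JPEG/PNG) to Textract AnalyzeID and flatten the result.

    Assumes AWS credentials are already configured in the environment.
    """
    import boto3  # imported here so the parsing helper works without boto3 installed

    client = boto3.client("textract", region_name=region_name)
    with open(image_path, "rb") as image_file:
        image_bytes = image_file.read()
    response = client.analyze_id(DocumentPages=[{"Bytes": image_bytes}])
    # 80.0 is an assumed cut-off; tune it against your own documents.
    return flatten_analyze_id_response(response, min_confidence=80.0)
```

Calling `analyze_id_image("licence.jpg")` (a hypothetical file name) would return a list with one dictionary per detected document. The flattened dictionaries could then be returned to Appian through an integration, and low-confidence fields routed to a manual review step.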