I am trying to use an AI Skill for document extraction. I have 10 different PDF documents whose key : value pairs differ. For example, one document has "Name : XXXXX", another has "Employee : XXXXX", and another has no "Name" key at all and just says "Patient name :". Can we use all of them to train the model, or do the documents need to have the same key-value pairs when extracting the data? Or should we classify them as different document types and then process each type separately? Please provide your input.
Unstructured Document (Gen AI): The AI understands context and semantics without training. It automatically recognizes that "Name," "Employee," and "Patient name" refer to similar concepts. It handles document variations, extracts data from unseen formats immediately, and adapts to new document types without retraining. It uses tokens.

Structured and Semi-Structured (ML): ML requires extensive training data for each document type and field variation. You must classify the documents first, train a separate model for each type, and retrain when new formats appear. It excels at high-volume, standardized documents with consistent formats but struggles with variations. It does not use tokens.

For your case, Gen AI is the better fit, because your documents vary in format and use different key-value pairs.
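To make the "different labels, same concept" point concrete, here is a minimal Python sketch. It is not how the AI Skill works internally and it does not call any Appian API. It only illustrates, in plain code, what it means to map "Name", "Employee", and "Patient name" onto one field. The alias table, the field name `person_name`, and the function `extract_fields` are all hypothetical names made up for this example.

```python
import re

# Hypothetical alias table (an assumption for illustration only):
# every label seen across the documents maps to one canonical field name.
FIELD_ALIASES = {
    "name": "person_name",
    "employee": "person_name",
    "patient name": "person_name",
}

# Matches one "Label : Value" line; the value part may be empty.
PAIR_PATTERN = re.compile(r"^\s*([^:\n]+?)\s*:\s*(.*?)\s*$")


def extract_fields(text):
    """Return canonical field names mapped to the values found in text."""
    fields = {}
    for line in text.splitlines():
        match = PAIR_PATTERN.match(line)
        if not match:
            continue  # not a "Label : Value" line
        label, value = match.group(1), match.group(2)
        canonical = FIELD_ALIASES.get(label.lower())
        # Skip labels we do not know and labels with no value filled in.
        if canonical is not None and value:
            fields[canonical] = value
    return fields
```

With a hand-written table like this, every new label variation means another manual entry, which is roughly the maintenance burden the ML option carries through retraining. The Gen AI option is meant to infer that mapping from context instead.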