What does the below warning mean in AI skills document extraction?
Files with pages with too many tokens
These files contained pages with too many tokens (token limit: 512) for the model to process. Labeled entities on these pages may not be included in training if they fall beyond the token limit.
Hi Appian Boy, this warning appears whenever a document contains pages with dense text. The behavior is as follows: if a page contains more than 512 tokens (~500 words), then any field you have labeled that comes after the 512th token will not be included when training the model. If some fields in your model have particularly low recall while others are much higher, then the low-recall fields are likely affected by this warning.
We are evaluating options for raising this limit so that customers do not encounter it.
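To illustrate the behavior described above, here is a rough sketch for checking which labeled fields on a page would fall past the limit. It is not how the product counts tokens: the thread does not say which tokenizer the model uses, so this example assumes one token per whitespace-separated word, and the function name `labels_beyond_limit` and the `labels` structure (field name mapped to the character offset where the labeled value starts) are hypothetical.

```python
import re

TOKEN_LIMIT = 512  # limit quoted in the warning message


def labels_beyond_limit(page_text, labels, token_limit=TOKEN_LIMIT):
    """Return the names of labels that start after the token limit.

    labels: dict mapping field name -> character offset of the labeled value.
    ASSUMPTION: tokens are approximated as whitespace-separated words;
    the real tokenizer may split text differently.
    """
    # Character offset at which each approximate token begins.
    token_starts = [m.start() for m in re.finditer(r"\S+", page_text)]
    if len(token_starts) <= token_limit:
        return []  # the page fits within the limit, nothing is cut off
    # Offset of the first token past the limit (index 512 is the 513th token).
    cutoff = token_starts[token_limit]
    return sorted(name for name, offset in labels.items() if offset >= cutoff)


# Synthetic 600-word page with one label near the top and one near the bottom.
page = " ".join(f"word{i}" for i in range(600))
labels = {
    "invoice_number": page.index("word10 "),
    "total_amount": page.index("word550"),
}
print(labels_beyond_limit(page, labels))  # ['total_amount']
```

If the fields this flags are the same ones showing low recall, that supports the explanation above. Because the word count is only an approximation, treat fields near the cutoff as uncertain.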