Issues with AI Document Center 3.1: Reconciliation Limit, Highlight Feature, Extraction Performance

Certified Associate Developer

1. The extracted document contains a table with over 300 rows, but the reconciliation screen displays only 100 fields because the related record type query is limited to retrieving 100 items. Do you have any suggestions on how we can reconcile all the rows?

2. The "Highlight Value in Document" feature on the reconciliation screen is not working properly. I’m using AI Document Center version 3.1. Is there any known reason why this issue occurs?

3. Sometimes, when a document has a large number of rows, the extraction model does not extract all the data. For example, if the document has 600 rows, it only extracts about 350 rows. Are there any limitations on the document extraction model?

4. The extraction process takes a considerable amount of time. Are there any best practices or configuration changes to improve extraction performance?

Environment Details:

  • AI Document Center Version: 3.1
  • LLM: Claude Sonnet 4.5 (Reasoning)


    Certified Lead Developer

    I'm exploring the Document Center with real-world documents now, so I might be able to give better Appian-specific insights in a few weeks. For now, here are a few suggestions based on my experience with 'raw' gen-AI extraction methods over the last six months:

    #3 - It can help to have a secondary extraction field to verify that your critical business information has been extracted correctly. This is akin to airplanes having redundant sensors - it's expensive, but sometimes necessary. In your case, could you set up a field that determines how many rows need to be extracted, and then use some logic to adjust your extraction prompt? Even then, understand that the prompts might still produce invalid results some percentage of the time.
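
    To make the idea concrete, here's a minimal sketch of that redundancy check outside Appian, against a generic LLM extraction call. The two callables, extract_rows and extract_row_count, are hypothetical wrappers around your primary extraction prompt and a secondary "how many rows does this table have?" prompt:

    ```python
    from typing import Callable, List, Optional

    def extract_with_row_check(
        document_text: str,
        extract_rows: Callable[[str], List[dict]],           # hypothetical: primary extraction prompt
        extract_row_count: Callable[[str], Optional[int]],   # hypothetical: secondary "row count" prompt
        max_attempts: int = 3,
    ) -> List[dict]:
        """Retry extraction until the secondary row count matches the number of extracted rows."""
        rows: List[dict] = []
        expected: Optional[int] = None
        for _ in range(max_attempts):
            expected = extract_row_count(document_text)
            rows = extract_rows(document_text)
            if expected is not None and len(rows) == expected:
                return rows
            # Counts disagree: retry (a fuller version might adjust the prompt or chunk the document here).
        raise ValueError(
            f"Extraction incomplete after {max_attempts} attempts: "
            f"got {len(rows)} rows, expected {expected}"
        )
    ```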

    #4 - Claude Sonnet 4.5 will take a monumental amount of time compared to Claude Haiku, but the results should be better (in theory). This is not an Appian-specific problem, either. Generally speaking, gen-AI extraction should never be presented to users as "real time". The 'thinking' models in particular are slow, but they don't 'feel' slow on OpenAI / Gemini / Claude's websites because those sites stream something for the user to read while the model runs; the actual results still take a long time to arrive.

    Also keep in mind that if your document has 100 fields to extract, you may get better performance (at much greater expense) by running multiple extraction prompts in parallel (1 API call per field) rather than attempting to get all fields at once - I'm not sure what Appian uses under the covers for a multi-page table, though.
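
    Again, just to illustrate the parallel-call idea (not how Document Center works internally), here's a rough Python sketch where extract_field is a hypothetical wrapper around a single-field extraction prompt - one API call per field, fanned out with a thread pool and merged back into one result:

    ```python
    from concurrent.futures import ThreadPoolExecutor
    from typing import Callable, Dict, List

    def extract_fields_in_parallel(
        document_text: str,
        field_names: List[str],
        extract_field: Callable[[str, str], str],  # hypothetical: one extraction prompt / API call per field
        max_workers: int = 10,
    ) -> Dict[str, str]:
        """Run one extraction prompt per field concurrently and merge the results."""
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {
                name: pool.submit(extract_field, document_text, name)
                for name in field_names
            }
            return {name: future.result() for name, future in futures.items()}
    ```

    Whether that is actually faster depends on your rate limits and per-call cost, so treat it as an experiment rather than a guaranteed win.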