AI Doc Center: sourceDocumentId equals documentId causing reconciliation failures after document deletion

Certified Associate Developer

Hi all,

I’m using AI Document Center (AIDC) extraction models in a process that saves extracted data to the database, and I’ve run into inconsistent behavior between two models.

Scenario

      1. A user uploads a document into my application.
      2. I save that document into an application folder.
      3. I send the document to an AIDC extraction model.
      4. After extraction, I delete the uploaded document from my application (using standard document deletion so it’s removed from Appian).

Observed behavior

For Extraction Model A, the Extraction Instance record has:

  • sourceDocumentId = 46852
  • documentId = 46856

I delete the document my app knows about (46852). I can still open the Extraction Instance and reconcile without errors, which suggests AIDC is using documentId (46856), not sourceDocumentId.


For Extraction Model B, the Extraction Instance record has:

  • sourceDocumentId = 46908
  • documentId = 46908 (same value)

When I delete the document from my application (46908), I can no longer open the Extraction Instance. I get an expression error indicating the document is not available or has been deleted. This matches the general behavior that if a referenced document is deleted, any process or interface that tries to load it will fail.

Questions

  1. What is the intended meaning and usage of sourceDocumentId vs documentId on the AIA Extraction Instance record?
  2. Under what conditions would sourceDocumentId and documentId be different (Model A) vs the same (Model B)?
  3. Is there any supported way to ensure that the document used by AIDC for reconciliation is a different document ID than the one my application deletes (like in Model A), so that deleting my “application copy” doesn’t break reconciliation?
  4. Is the equality of sourceDocumentId and documentId for Model B expected behavior, a configuration issue, or a bug?
  5. Given that the public docs don’t describe these fields, what is the recommended pattern for:
    • Safely deleting or cleaning up documents used in extraction, and
    • Avoiding reconciliation failures when documents are removed from the application?


  • 0
    Certified Lead Developer

    sourceDocumentId is the document you pass to AIDC (your upload).
    documentId is the document AIDC uses for the extraction instance and reconciliation; it may be the same document or a derivative.
    The two differ when AIDC derives or stores a separate copy, and match when it uses your original directly, which is expected behavior.
    There is no way to force the separation through configuration.
    Don't delete the document behind documentId until reconciliation is complete; handle retention and cleanup afterward.
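
    The cleanup rule above can be sketched roughly as follows. This is illustrative Python, not Appian expression code. The two ID fields come from the Extraction Instance record described in this thread; the `ExtractionInstance` class, the `reconciled` flag, and `safe_to_delete` are hypothetical names, and how you detect that reconciliation has finished is an assumption you would need to map to your own data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionInstance:
    # The two ID fields mirror the Extraction Instance record in this thread.
    source_document_id: int
    document_id: int
    # Hypothetical flag: True once reconciliation has been completed.
    reconciled: bool


def safe_to_delete(doc_id: int, instance: ExtractionInstance) -> bool:
    """Return True if deleting doc_id should not break reconciliation."""
    if instance.reconciled:
        # Reconciliation is done, so the document is no longer needed for it.
        return True
    # Before reconciliation, only the document behind documentId is at risk.
    return doc_id != instance.document_id


# Values taken from the original post.
model_a = ExtractionInstance(46852, 46856, reconciled=False)
model_b = ExtractionInstance(46908, 46908, reconciled=False)

print(safe_to_delete(46852, model_a))  # True: AIDC works from 46856
print(safe_to_delete(46908, model_b))  # False: AIDC works from 46908 itself
```

    With a check like this, the Model B upload would be kept until reconciliation finishes, and the Model A upload could be removed right away.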

  • 0
    Certified Associate Developer
    in reply to Shubham Aware

    For Extraction Model A, the Extraction Instance record has:

    • sourceDocumentId = 46852
    • documentId = 46856

    I delete the document my app knows about (46852). I can still open the Extraction Instance and reconcile without errors, which suggests AIDC is using documentId (46856), not sourceDocumentId.

    Here, Model A creates its own document in AIDC, so we can delete the uploaded document from our application. However, why does Model B not create a document in AIDC like Model A, and instead use the source document directly?