How to extract hyperlink annotations (URLs) from a PDF in Appian?

Certified Lead Developer

Hi everyone,

I’m working with procurement documents (PDFs) that contain hyperlinks (e.g. links to official tender platforms). When I use the Get PDF Text smart service (from PDF Tools), the visible text comes through, but the embedded hyperlinks (annotations/URI actions) are not returned.

I also tested with AI Skills (Document Extraction), but it seems to behave the same way, as if Appian sends only the extracted text to the AI model rather than the full PDF object. As a result, the links are lost there as well.

Has anyone found a way in Appian to extract these hidden links directly from PDFs?

  • Is there any existing plug-in or connected system that supports reading hyperlink annotations?

  • Or would the only option be to build a custom plug-in with something like Apache PDFBox?
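
To make concrete what "reading hyperlink annotations" means at the file level, here is a minimal, dependency-free Java sketch. It is not an Appian API and not PDFBox: it only scans the raw PDF bytes for `/URI (...)` entries, which is how URI actions are written in a link annotation's action dictionary. The class and method names (`PdfUriScan`, `extractUris`) are hypothetical. The approach assumes the annotation dictionaries are stored uncompressed and use literal strings; PDFs that pack objects into compressed object streams, use hex strings, or contain unescaped nested parentheses or octal escapes would need a real parser such as PDFBox inside a custom plug-in.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: heuristic scan for URI actions in raw PDF bytes.
final class PdfUriScan {

    // Matches "/URI (....)" where the literal string may contain
    // backslash-escaped characters such as "\(" or "\)".
    private static final Pattern URI_ENTRY =
            Pattern.compile("/URI\\s*\\(((?:\\\\.|[^\\\\)])*)\\)", Pattern.DOTALL);

    static List<String> extractUris(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data survives the decode.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        List<String> uris = new ArrayList<>();
        Matcher m = URI_ENTRY.matcher(raw);
        while (m.find()) {
            // Drop the backslash from simple escapes like "\(" (octal escapes are not handled).
            uris.add(m.group(1).replaceAll("\\\\(.)", "$1"));
        }
        return uris;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java PdfUriScan <file.pdf>");
            return;
        }
        for (String uri : extractUris(Files.readAllBytes(Paths.get(args[0])))) {
            System.out.println(uri);
        }
    }
}
```

If this scan returns nothing for a document that visibly contains links, the annotations are most likely stored in compressed object streams, and a proper PDF library would be needed to read them.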

Thanks in advance for your insights!


    Certified Lead Developer
    in reply to Stefan Helzle

    That’s what I was suspecting as well. Thanks for confirming it!
    I’ve started working on a custom plug-in to handle PDF annotations/URI actions directly (since the standard functions only return visible text), but I wanted to double-check with the community before going too far.

    It’s a bit of a pity that the document itself isn’t passed to the AI Skills. If it were, extracting the hyperlinks would be straightforward (I’ve tried with ChatGPT and it returns them right away).
