<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="https://community.appian.com/cfs-file/__key/system/syndication/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>How To Extract Data from Microsoft Publisher (.PUB) Files in Appian?</title><link>https://community.appian.com/discussions/f/general/40484/how-to-extract-data-from-microsoft-publisher-pub-files-in-appian</link><description>Has anyone worked with Microsoft Publisher (.PUB) files in Appian? 
 Our requirement is to allow users to upload .pub files and extract specific data from them for further processing. AI Capabilities are not enabled. Is there any Appian-native approach</description><dc:language>en-US</dc:language><generator>Telligent Community 12</generator><item><title>RE: How To Extract Data from Microsoft Publisher (.PUB) Files in Appian?</title><link>https://community.appian.com/thread/154872?ContentTypeID=1</link><pubDate>Tue, 09 Jun 2026 06:17:08 GMT</pubDate><guid isPermaLink="false">d3a83456-d57b-489c-a84c-4e8267bb592a:709ca902-056d-4e15-9a0a-203ec4af37e6</guid><dc:creator>ravitejap370818</dc:creator><description>&lt;p&gt;Thanks for the suggestions.&lt;br /&gt;&lt;br /&gt; We explored the Python-based approach and were able to extract the data; however, it requires hosting, so we are evaluating alternatives.&lt;/p&gt;
&lt;div&gt;From what I understand, the DOCX/HTML approach would still require a conversion step, because Appian does not natively handle PUB files.&lt;/div&gt;
&lt;p&gt;We are currently exploring Appian-based options (such as plugins) to handle the extraction as much as possible within the platform.&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: How To Extract Data from Microsoft Publisher (.PUB) Files in Appian?</title><link>https://community.appian.com/thread/154862?ContentTypeID=1</link><pubDate>Tue, 09 Jun 2026 03:09:14 GMT</pubDate><guid isPermaLink="false">d3a83456-d57b-489c-a84c-4e8267bb592a:43a8d7dc-754c-44af-ba22-ccccfe24d0cd</guid><dc:creator>mehwishy973927</dc:creator><description>&lt;p data-start="976" data-end="1039"&gt;Since Appian doesn&amp;#39;t natively understand &lt;code data-start="1017" data-end="1023"&gt;.PUB&lt;/code&gt; files, you can:&lt;/p&gt;
&lt;ol data-start="1041" data-end="1178"&gt;
&lt;li data-section-id="u3rbr8" data-start="1041" data-end="1073"&gt;Upload the &lt;code data-start="1051" data-end="1057"&gt;.PUB&lt;/code&gt; file to Appian.&lt;/li&gt;
&lt;li data-section-id="13q21j3" data-start="1074" data-end="1122"&gt;Send it to a custom API (Java, .NET, or Python).&lt;/li&gt;
&lt;li data-section-id="ja3owb" data-start="1123" data-end="1152"&gt;The API extracts the text/content.&lt;/li&gt;
&lt;li data-section-id="a6rnq7" data-start="1153" data-end="1178"&gt;The API returns JSON to Appian.&lt;/li&gt;
&lt;/ol&gt;
&lt;p data-start="1180" data-end="1197"&gt;Example response:&lt;/p&gt;
&lt;p data-start="1180" data-end="1197"&gt;&lt;pre class="ui-code" data-mode="text"&gt;{
  &amp;quot;customerName&amp;quot;: &amp;quot;John Smith&amp;quot;,
  &amp;quot;invoiceNumber&amp;quot;: &amp;quot;INV-1234&amp;quot;,
  &amp;quot;amount&amp;quot;: &amp;quot;$500&amp;quot;
}&lt;/pre&gt;&lt;/p&gt;
&lt;p data-start="1298" data-end="1329"&gt;Appian then processes the JSON.&lt;/p&gt;
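&lt;p&gt;As a rough illustration of the extract-and-return-JSON steps above, here is a minimal Python sketch. It assumes the text has already been pulled out of the .PUB file by some converter, and that the document prints values as "Label: value" lines. The label names, the function names, and the field mapping are all hypothetical and only mirror the example response above; adjust them to your own template.&lt;/p&gt;
&lt;pre class="ui-code" data-mode="python"&gt;
```python
import json
import re

# Hypothetical labels: adjust to whatever your Publisher template actually prints.
FIELD_LABELS = {
    "customerName": "Customer Name",
    "invoiceNumber": "Invoice Number",
    "amount": "Amount",
}


def extract_fields(text):
    """Return a dict of labelled values found in already-extracted plain text."""
    result = {}
    for key, label in FIELD_LABELS.items():
        # Match "Label: value" up to the end of that line, ignoring case.
        pattern = rf"^\s*{re.escape(label)}\s*:\s*(.+?)\s*$"
        match = re.search(pattern, text, re.IGNORECASE | re.MULTILINE)
        # Missing fields become None so the caller can validate them.
        result[key] = match.group(1) if match else None
    return result


def to_json_response(text):
    """Serialise the extracted fields as the JSON body returned to Appian."""
    return json.dumps(extract_fields(text), indent=2)


if __name__ == "__main__":
    sample = "Customer Name: John Smith\nInvoice Number: INV-1234\nAmount: $500\n"
    print(to_json_response(sample))
```
&lt;/pre&gt;
&lt;p&gt;Returning None for a missing field, rather than raising an error, lets the Appian side decide how to handle incomplete documents.&lt;/p&gt;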
&lt;p data-start="1298" data-end="1329"&gt;&lt;br /&gt;Or try converting to a structured text format rather than using OCR:&lt;/p&gt;
&lt;pre class="ui-code" data-mode="text"&gt;PUB &amp;rarr; DOCX &amp;rarr; Extract Text &amp;rarr; Appian&lt;/pre&gt;
&lt;p data-start="817" data-end="819"&gt;or&lt;/p&gt;
&lt;pre class="ui-code" data-mode="text"&gt;PUB &amp;rarr; HTML &amp;rarr; Parse Content &amp;rarr; Appian&lt;/pre&gt;
&lt;p data-start="870" data-end="928"&gt;This can preserve more structure than a PDF conversion in some cases.&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: How To Extract Data from Microsoft Publisher (.PUB) Files in Appian?</title><link>https://community.appian.com/thread/154844?ContentTypeID=1</link><pubDate>Mon, 08 Jun 2026 06:06:24 GMT</pubDate><guid isPermaLink="false">d3a83456-d57b-489c-a84c-4e8267bb592a:50867d23-5f2d-4c6f-bddf-aed740377b44</guid><dc:creator>Stefan Helzle</dc:creator><description>&lt;p&gt;I have never heard of such a use case. If you can convert the files to PDF first, you can do the extraction with out-of-the-box (OOTB) functions.&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item></channel></rss>