Appian Community
Hi, We're trying to fetch data from an html page where content is
skumar
over 9 years ago
Hi,
We're trying to fetch data from an HTML page whose content is distributed across multiple tabs. We're using the "Send HTTP Request" smart service for this purpose. The content we receive as output of this smart service contains data from the first tab only.
Is there a better way to get the entire page content (from all tabs) and then extract data from it using XPath?
Thanks in advance,
Sandeep Kumar
OriginalPostID-192551
Stefan Helzle
A Score Level 3
over 9 years ago
It depends on how the website is built. If switching between tabs is done using JavaScript and the tabs cannot be reached through URL parameters, it will get difficult.
skumar
over 9 years ago
Thanks @stefanh791
Attached is the response body that we received as the "Send HTTP Request" smart service output for the URL:
ted.europa.eu/udl
This page has two tabs: Original Language and Data.
But we're getting the details of the first tab only.
response_body.html
Stefan Helzle
A Score Level 3
over 9 years ago
I see. The tabs are rendered dynamically, and only the content for the selected tab is delivered. So to get both, you would have to make two calls, one per tab.
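As a rough illustration of the "one call per tab" idea outside Appian, here is a minimal Python sketch. It assumes the tab can be selected through a query parameter; the parameter name `tabId`, the tab values, and the example URL are hypothetical and would have to be replaced with whatever the real site uses (if the site offers no such parameter, this approach does not apply).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def tab_url(base_url, tab, param="tabId"):
    """Return base_url with a tab-selecting query parameter set.

    The parameter name "tabId" is a placeholder, not the real site's API.
    """
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))
    query[param] = tab
    return urlunsplit(parts._replace(query=urlencode(query)))


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_all_tabs(base_url, tabs, fetch=http_get):
    """Issue one request per tab and return the bodies keyed by tab value."""
    return {tab: fetch(tab_url(base_url, tab)) for tab in tabs}
```

For example, `fetch_all_tabs("https://example.com/notice?id=1", ["0", "1"])` would issue two requests and return both bodies. The `fetch` argument can be swapped for a stub when trying the URL handling without network access.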
skumar
over 9 years ago
We're using the following code to extract the title of the active tab, i.e. the tab whose content is displayed in the response body.
xpathsnippet(pv!responseBodyText, "//*[@id='tabs']//ul/li[@class='activated']/div/a/text()")
But unfortunately it throws an error saying "Error parsing xml..."
Stefan Helzle
A Score Level 3
over 9 years ago
The problem is that HTML does not have to be valid XML to be rendered by a browser, so many websites serve HTML that is not well-formed XML and therefore cannot be parsed by an XML parser or queried with XPath.
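To illustrate the difference, here is a small Python sketch using only the standard library. The `SAMPLE` markup is made up to resemble the page's tab structure; it is not the actual response body. A strict XML parser rejects it because of the unclosed `<br>`, while a tolerant HTML parser can still pull out the active tab's title. Note that the class check here matches `activated` as one of several class tokens, which is slightly looser than the exact match in the XPath above.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Made-up markup: <br> is never closed, so this is not well-formed XML.
SAMPLE = (
    "<html><body><div id='tabs'><ul>"
    "<li class='activated'><div><a href='#'>Original Language</a></div></li>"
    "<li><div><a href='#'>Data</a></div></li>"
    "</ul><br></div></body></html>"
)


def parses_as_xml(text):
    """Return True if text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class ActiveTabTitle(HTMLParser):
    """Collect the link text inside <li class="activated">."""

    def __init__(self):
        super().__init__()
        self.in_active_li = False
        self.in_link = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            classes = (dict(attrs).get("class") or "").split()
            self.in_active_li = "activated" in classes
        elif tag == "a" and self.in_active_li:
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_active_li = False
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.title += data


def active_tab_title(text):
    """Return the title of the activated tab, or an empty string."""
    parser = ActiveTabTitle()
    parser.feed(text)
    parser.close()
    return parser.title.strip()
```

Here `parses_as_xml(SAMPLE)` is false, which corresponds to the "Error parsing xml..." failure, whereas `active_tab_title(SAMPLE)` still yields the title of the activated tab.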
Stefan Helzle
A Score Level 3
over 9 years ago
I would not recommend scraping content from a website using Appian.
skumar
over 9 years ago
Thanks for your suggestion, Stefan.