DRAFT KB-XXXX - The extract() function is unable to handle HTML line breaks

Symptoms

The extract() function does not correctly handle the HTML tags <p>, </p>, <br>, and <div>. It interprets these tags as line breaks instead of returning the characters between the delimiters.

For example:
extract("<b>Text</b>", "<", ">") works correctly. It returns "b" and "/b".
extract("<p>Text</p>", "<", ">") returns nothing. It should return "p" and "/p".
extract("<div>Text</div>", "<", ">") returns "/div", but not "div".
extract("<<p>>Text</p>", "<", ">") returns a line break instead of the text "<p>".

Cause

This is a known issue that has been reported to the Appian Product Team. The reference number is AN-96094.

Action

Use the substitute() function to replace specific HTML tags. To remove a tag, replace it with an empty string.

For example:
substitute(substitute("<p>Text</p>", "<p>", ""), "</p>", "") removes both the <p> and </p> tags and returns "Text".

Alternatively, to strip several HTML tags in one expression, use the reduce() function to apply substitute() once for each tag in a list, as shown below:
reduce(
  fn!substitute(_, _, ""),
  ri!text,
  {"<p>", "</p>", "<br>", "<div>", "</div>"}
)
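
To try the reduce() expression above without creating a rule input, a literal string can stand in for ri!text. The following is a minimal sketch, not a tested rule: the sample string is made up for illustration, and the tag list is the one from the example above.

```
reduce(
  fn!substitute(_, _, ""),
  /* Hypothetical sample input, standing in for ri!text */
  "<div><p>Text</p><br></div>",
  /* Each entry is removed in turn, starting from the left */
  {"<p>", "</p>", "<br>", "<div>", "</div>"}
)
```

reduce() calls substitute() once per list entry and passes the result of each call on as the text for the next, so the tags are removed one at a time and the sample should reduce to "Text". Only text that exactly matches a list entry is removed, so tags with attributes (for example, <div class="x">) or self-closing forms such as <br/> would need their own entries.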

Affected Versions

This article applies to Appian 16.3 and later.

Last Reviewed: January 2018