DRAFT KB-XXXX - The extract() function is unable to handle HTML line breaks

Symptoms

The extract() function does not handle the HTML tags that produce line breaks: <p>, <br>, and <div> (as well as </p>). It interprets these tags as line breaks instead of returning the characters between the delimiters.

For example:
extract("<b>Text</b>", "<", ">") works correctly and returns "b" and "/b"
extract("<p>Text</p>", "<", ">") returns nothing, but it should return "p" and "/p"
extract("<div>Text</div>", "<", ">") returns "/div" but not "div"
extract("<<p>>Text</p>", "<", ">") returns a line break rather than the text "<p>"

Cause

This issue has been reported to the Appian product team. The reference number for this issue is AN-96094.

Action

Use the substitute() function to replace specific HTML tags. To remove a tag, substitute an empty string for it. For example:
substitute(substitute("<p>Text</p>","<p>",""), "</p>", "")
removes both the <p> and </p> tags.

Alternatively, use the reduce() function to strip several unwanted HTML tags in a single expression, where ri!text is the rule input that holds the HTML:
reduce(
  fn!substitute(_, _, ""),
  ri!text,
  {"<p>", "</p>", "<br>", "<div>", "</div>"}
)
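
As a worked illustration, the same pattern can be applied to a literal string instead of a rule input. The sample HTML below is made up for this example. The sketch also assumes that substitute() matches each tag exactly as written, including case, so variants such as <BR>, <br/>, or a <p> tag with attributes would need their own entries in the list:

```
/* Hypothetical sample input; each tag in the list is removed in turn. */
reduce(
  fn!substitute(_, _, ""),
  "<div><p>Line one<br>Line two</p></div>",
  {"<p>", "</p>", "<br>", "<div>", "</div>"}
)
```

Under those assumptions, this should evaluate to "Line oneLine two". Because every tag is replaced with an empty string, the text on either side of the <br> is joined with no separator. If the words need to stay apart, one option is to remove <br> in a separate substitute() call that inserts a space.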

Affected Versions

16.3+