I have an expression rule that abbreviates a string. Any thoughts on how to improve it?
/* Replace all special characters with spaces */
/* Replace extra spaces with a single space */
/* Wrap numbers in spaces */
/* Split words and number sets into an array */
/* For each item in the array, grab the number or the first letter of the word */
/* Join what is left, upper-case the string, and grab the left 50 characters */
with(
  local!nospecialcharacters: regexreplaceall("[^\w\s]", ri!in, " "),
  local!extraspaceswithsinglespace: regexreplaceall("\s+", local!nospecialcharacters, " "),
  local!numberwrapper: if(
    regexmatch("\d+", local!extraspaceswithsinglespace),
    regexinsertmatchmarkers("\d+", local!extraspaceswithsinglespace, " ", " ", false),
    local!extraspaceswithsinglespace
  ),
  local!segments: split(
    regexreplaceall("\s+", local!numberwrapper, "|"),
    "|"
  ),
  local!keepers: a!forEach(
    local!segments,
    if(
      isnull(fv!item),
      null,
      if(
        not(isnull(tointeger(fv!item))),
        fv!item,
        if(
          fv!item = " ",
          null,
          if(
            regexmatch("^[a-zA-Z]+$", fv!item),
            left(fv!item, 1),
            null
          )
        )
      )
    )
  ),
  left(upper(joinarray(local!keepers, "")), 50)
)
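For anyone who wants to sanity-check expected outputs outside Appian, here is a rough Python mirror of the steps in the rule above. It is only a sketch: the function name `abbreviate` and the `max_len` parameter are made up for illustration, and Python's `re` module may differ from the Appian regex functions on edge cases (for example very large numbers or underscores).

```python
import re


def abbreviate(text: str, max_len: int = 50) -> str:
    """Approximate Python mirror of the Appian rule (hypothetical helper)."""
    # Replace every character that is not a word character or whitespace with a space.
    cleaned = re.sub(r"[^\w\s]", " ", text, flags=re.ASCII)
    # Collapse runs of whitespace into a single space.
    cleaned = re.sub(r"\s+", " ", cleaned, flags=re.ASCII)
    # Wrap each run of digits in spaces so it becomes its own segment.
    cleaned = re.sub(r"\d+", lambda m: f" {m.group(0)} ", cleaned, flags=re.ASCII)

    keepers = []
    for segment in cleaned.split():
        if re.fullmatch(r"\d+", segment, flags=re.ASCII):
            # Number sets are kept whole.
            keepers.append(segment)
        elif re.fullmatch(r"[a-zA-Z]+", segment):
            # Words contribute only their first letter.
            keepers.append(segment[0])
        # Anything else (for example a segment containing "_") is dropped.

    return "".join(keepers).upper()[:max_len]


print(abbreviate("Product 123 Vendor 098"))  # P123V098
```

Comparing a handful of inputs between this sketch and the Appian rule is a quick way to spot cases where the two disagree.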
As a thought experiment, it's pretty cool. Compared to a similar special-character function I had to deal with in the past, it's pretty speedy (though we also had to preserve special characters via lower-ASCII equivalents). Our target was 4,000 characters in under 100 ms. As for potential uses, it's a bit flawed, since on average it wipes away more than 80% of the input. What's it for?
I'm curious to know how an equivalent reduce() function performs for local!keepers.
I assume it's sourced from this plugin.
Hello Rick,
Now answering your question.
When I first saw your post, I thought of something simple like this. It assumes the first letter of each word is upper case and the rest are lower case; if that holds for you, a single regex may be all you need. =D
with(
  local!regex: "[^A-Z0-9]",
  regexreplaceall(local!regex, ri!in, "")
)
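To make the assumption behind this first option concrete, here is the same regex applied in Python. This is just an illustrative sketch (the name `abbreviate_caps` is made up), but the pattern is the one used in the expression above.

```python
import re


def abbreviate_caps(text: str) -> str:
    """Keep only upper-case ASCII letters and digits (hypothetical helper)."""
    # Everything that is not A-Z or 0-9 is removed in a single pass.
    return re.sub(r"[^A-Z0-9]", "", text)


print(abbreviate_caps("Product 123 Vendor 098"))  # P123V098
print(abbreviate_caps("product 123 vendor 098"))  # 123098
```

The second call shows the limitation: if the words are not capitalized, their initials are lost, and a word with inner capitals would contribute more than one letter.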
If you cannot guarantee that, this may work for you instead:
(Here I split the text on spaces, upper-case the first letter of each word and lower-case the rest, and then apply the same regex as in the first option.)
with(
  local!regex: "[^A-Z0-9]",
  local!text: "Product 123 Ve3ndor 2ASD3",
  local!separator: "",
  joinarray(
    a!forEach(
      items: split(local!text, " "),
      expression: regexreplaceall(
        local!regex,
        /* first character upper-cased, remainder lower-cased */
        upper(left(fv!item, 1)) & lower(mid(fv!item, 2, len(fv!item))),
        ""
      )
    ),
    local!separator
  )
)
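Here is a Python sketch of that second option as described (split on spaces, capitalize the first character, lower-case the rest, then strip everything that is not A-Z or 0-9). The name `abbreviate_words` is hypothetical, and the sketch follows the described intent rather than any particular Appian indexing behavior.

```python
import re


def abbreviate_words(text: str) -> str:
    """Normalize each word's casing, then keep capitals and digits (hypothetical helper)."""
    parts = []
    for word in text.split(" "):
        # Upper-case the first character and lower-case the remainder.
        normalized = word[:1].upper() + word[1:].lower()
        # Strip everything except upper-case ASCII letters and digits.
        parts.append(re.sub(r"[^A-Z0-9]", "", normalized))
    return "".join(parts)


print(abbreviate_words("Product 123 Ve3ndor 2ASD3"))  # P123V323
```

Note that this handles digits inside a word differently from the original rule: "Ve3ndor" becomes "V3" here, whereas wrapping numbers in spaces first would also keep the letter that follows the digit.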
Please let me know what you think. This removes the special characters and everything else in one pass. Given your example, Product 123 Vendor 098, the first option works perfectly.
Jose Perez
Hi Rick, try the code below. I tested it with a few variations and it seems to cover them all. Let me know if you see any difference, or adjust it if you find a scenario I missed.
Performance-wise, I used the string "This is a 1 re45al big te8xt to be abbre3viat88ed and can 77 have perf67ormance i !!ssue" as input. I think the regex version takes 10+ ms in its best run, while the snippet below runs in under 1 ms. As a further performance improvement, I would suggest using apply() rather than a!forEach(). We have noticed roughly a 5x performance degradation with a!forEach() when looping over 100+ items. Since we are not looping extensively in this case, the difference might not be easy to notice. For the above input, if you enable the commented-out a!forEach() instead of apply(), you will see the run time go to 2+ ms.
load(
  /* place spaces around all numeric values in the string */
  local!formattedString: reduce(
    fn!substitute,
    lower(ri!in),
    merge(
      enumerate(10),
      {" 0 ", " 1 ", " 2 ", " 3 ", " 4 ", " 5 ", " 6 ", " 7 ", " 8 ", " 9 "}
    )
  ),
  /* break the above string into an array, separated by spaces */
  local!words: split(
    cleanwith(
      local!formattedString,
      /* your allowed character list */
      "abcdefghijklmnopqrstuvwxyz0123456789 "
    ),
    char(32)
  ),
  /* abbreviate */
  upper(
    joinarray(
      /*a!forEach(*/
      /*local!words,*/
      /*charat(fv!item,1)*/
      /*)*/
      apply(charat(_, 1), local!words)
    )
  )
)
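For comparison, here is a rough Python equivalent of the regex-free approach above: lower-case the input, put spaces around each digit, keep only the allowed characters, split on spaces, and take the first character of each piece. The names `ALLOWED` and `abbreviate_no_regex` are made up for this sketch, and it assumes cleanwith() simply drops any character that is not in the allowed list.

```python
# Characters that survive the clean-up step (same list as in the expression above).
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789 ")


def abbreviate_no_regex(text: str) -> str:
    """Regex-free abbreviation sketch (hypothetical helper)."""
    lowered = text.lower()
    # Surround every digit with spaces so each digit becomes its own "word".
    for digit in "0123456789":
        lowered = lowered.replace(digit, f" {digit} ")
    # Drop every character that is not in the allowed list.
    cleaned = "".join(ch for ch in lowered if ch in ALLOWED)
    # Take the first character of each space-separated piece; empty pieces add nothing.
    return "".join(word[:1] for word in cleaned.split(" ")).upper()


print(abbreviate_no_regex("Product 123 Vendor 098"))  # P123V098
```

One behavioral difference worth checking against your requirements: special characters are removed here rather than replaced with spaces, so "foo-bar" abbreviates to "F", while the original rule would treat it as two words.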