Abbreviate

I have an expression rule that abbreviates a string. Any thoughts on how to improve it?

 

/*Replace all special characters with spaces*/
/*Replace extra spaces with single space*/
/*Wrap numbers in spaces*/
/*Split words and number sets into array*/
/*ForEach item in array grab number or first letter of word*/
/*Join what is left, upper case string and grab left 50 characters*/
with(
  local!nospecialcharacters: regexreplaceall(
    "[^\w\s]",
    ri!in,
    " "
  ),
  local!extraspaceswithsinglespace: regexreplaceall(
    "\s+",
    local!nospecialcharacters,
    " "
  ),
  local!numberwrapper: if(
    regexmatch(
      "\d+",
      local!extraspaceswithsinglespace
    ),
    regexinsertmatchmarkers(
      "\d+",
      local!extraspaceswithsinglespace,
      " ",
      " ",
      false
    ),
    local!extraspaceswithsinglespace
  ),
  local!segments: split(
    regexreplaceall(
      "\s+",
      local!numberwrapper,
      "|"
    ),
    "|"
  ),
  local!keepers: a!forEach(
    local!segments,
    if(
      isnull(
        fv!item
      ),
      null,
      if(
        not(
          isnull(
            tointeger(
              fv!item
            )
          )
        ),
        fv!item,
        if(
          fv!item = " ",
          null,
          if(
            regexmatch("^[a-zA-Z]+$",
              fv!item
            ),
            left(
              fv!item,
              1
            ),
            null
          )
        )
      )
    )
  ),
  left(
    upper(
      joinarray(
        local!keepers,
        ""
      )
    ),
    50
  )
)


  • As a thought experiment, it's pretty cool. Compared to a similar special-character function I had to deal with in the past, it's pretty speedy (though we also had to preserve special characters via lower-ASCII equivalents). Our target was 4,000 characters in under 100 ms. As for the potential uses I could see for this type of thing, it's a bit flawed, since on average it wipes away more than 80% of its data in the output. What's it for?

    I'm curious to know how an equivalent reduce() function performs for local!keepers.

  • Hi Jesse, I use this in our interfaces while users enter a product name. It helps them come up with an abbreviation by auto-populating the abbreviation textbox; they can still update it, but this gives them something to work with. Their request was to keep the first letter of each word and the entire number. I have not noticed any performance issues in the SAIL interfaces so far.

    Example: Product 190 - Version 2 = P190V2
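
The requirement described above (first letter of each word, plus every complete number) can be pinned down outside Appian with a small reference sketch. The Python below is only an illustration for producing expected values to test the expression rule against. The function name `abbreviate` and the `max_len` parameter are made up here, and treating underscores as separators is an assumption of this sketch rather than something the original rule does.

```python
import re


def abbreviate(text: str, max_len: int = 50) -> str:
    """First letter of each word plus every full run of digits, upper-cased."""
    # Assumption: anything that is not an ASCII letter or digit is a separator.
    cleaned = re.sub(r"[^A-Za-z0-9]+", " ", text)
    # A token is a run of digits or a run of letters, so "te8xt" gives te, 8, xt.
    tokens = re.findall(r"[0-9]+|[A-Za-z]+", cleaned)
    # Keep whole numbers, but only the first letter of each word.
    kept = [tok if tok.isdigit() else tok[0] for tok in tokens]
    return "".join(kept).upper()[:max_len]


print(abbreviate("Product 190 - Version 2"))  # P190V2
```

Running a handful of real product names through a sketch like this gives a list of input and output pairs that any rewrite of the rule should still satisfy.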
  • Is regexreplaceall a new function? I don't see that in the documentation but maybe I am looking in the wrong place.
  • Thanks! If you are correct, it is worth noting that the plugin has not been updated since 2011 and its last supported version is 6.6.
  • coltonb is correct about the plugin. It is still working with our 17.3 install.
  • Hello Harrison,

    Just to add to your comment: regex is standard Java functionality that was released in early versions of Java. It is a powerful tool and, as you said, it's old and stable. I like regex because it is so powerful.

    Jose
  • Hello Rick, 

    Now answering your question.

    When I first saw your post, I thought of something simple like this, assuming the first letter of each word in the product name is uppercase and the rest are lowercase. If that holds for you, it can be as simple as this regex. =D

     

    with(
      local!regex: "[^A-Z0-9]",
      regexreplaceall(local!regex,ri!in,"")
    )
    
    

    In case you cannot ensure this, maybe this will work for you:

    (Here I split the text, uppercase the first letter of each word and lowercase the rest, and then apply the same regex as in the first option.)

    with(
      local!regex: "[^A-Z0-9]",
      local!text: "Product 123 Ve3ndor 2ASD3",
      local!separator:"",
      
      joinarray(
        a!forEach(
          items: split(
            local!text,
            " "
          ),
          expression: regexreplaceall(
            local!regex,
            upper(
              left(fv!item, 1)
            ) & lower(
              mid(
                fv!item,
                2,
                len(
                  fv!item
                )
              )
            ),
            ""
          )
        ),
        local!separator
      )
    )

     

    Please let me know what you think. You can remove all the special characters and everything else at once. Given your example, Product 123 Vendor 098, the first option works perfectly.

     

    Jose Perez
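
To see where the two options above agree and where they differ, here is a rough Python sketch of the same two ideas. It is not Appian code, and the function names `strip_to_caps_and_digits` and `capitalize_then_strip` are hypothetical; only the `[^A-Z0-9]` pattern comes from the post.

```python
import re

PATTERN = r"[^A-Z0-9]"  # drop everything except uppercase letters and digits


def strip_to_caps_and_digits(text: str) -> str:
    # First option: rely on the input's own capitalization.
    return re.sub(PATTERN, "", text)


def capitalize_then_strip(text: str) -> str:
    # Second option: force "Xxxx" casing per word, then apply the same pattern.
    words = text.split(" ")
    return "".join(
        re.sub(PATTERN, "", word[:1].upper() + word[1:].lower())
        for word in words
    )


print(strip_to_caps_and_digits("Product 123 Vendor 098"))  # P123V098
print(strip_to_caps_and_digits("product 123 vendor 098"))  # 123098
print(capitalize_then_strip("product 123 vendor 098"))     # P123V098
```

The first option loses the letters entirely when a word starts in lowercase, which is the case the second option is meant to cover.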

  • Hi Rick,

    Try the code below.
    I tested it with a few variations and it seems to cover them all. Let me know if you see any differences, or try changing it if you find a missed scenario.

    Performance-wise, I used the string "This is a 1 re45al big te8xt to be abbre3viat88ed     and can 77 have perf67ormance i !!ssue", and I think the regex version takes 10+ ms in its best run, while the code snippet I used runs in <1 ms.

    Also, on performance, I would suggest using apply rather than forEach. We have noticed a fivefold performance degradation when looping over 100+ items. Since we are not looping extensively in this case, it might not be easily noticeable. For the above input, if you switch to the forEach (commented out), you will see it go to 2+ ms.

     

     

    load(
      /*place spaces around all numeric values in the string*/
      local!formattedString:reduce(
        fn!substitute,
        lower(ri!in),
        merge(
          enumerate(10),
          {" 0 "," 1 "," 2 "," 3 "," 4 "," 5 "," 6 "," 7 "," 8 "," 9 "}
        )
      ),
      /*break the above string into an array, separated by spaces*/
      local!words: split(
        cleanwith(
          local!formattedString,
          /*your allowed character list  */
          "abcdefghijklmnopqrstuvwxyz0123456789 "
        ),
        char(32)
      ),
      /*abbreviate */
      upper(
        joinarray(
          /*a!forEach(*/
            /*local!words,*/
            /*charat(fv!item,1)*/
          /*)*/
          apply(
            charat(_,1),
            local!words
          )
        )    
      )
    )
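
For anyone who wants to reason about the no-plugin approach above without an Appian environment, the same three steps (pad digits with spaces, keep only allowed characters, take the first character of each piece) can be sketched in Python. This is an illustration only: `abbreviate_no_regex` and `ALLOWED` are names invented for the sketch, and Python's `split()` drops empty pieces, which may not match how Appian's `split` treats repeated spaces.

```python
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789 ")


def abbreviate_no_regex(text: str) -> str:
    s = text.lower()
    # Pad every digit with spaces so digits separate from adjacent letters.
    for digit in "0123456789":
        s = s.replace(digit, f" {digit} ")
    # Keep only allowed characters (the role cleanwith plays in the expression).
    s = "".join(ch for ch in s if ch in ALLOWED)
    # split() with no argument ignores runs of spaces and never yields "".
    return "".join(word[0] for word in s.split()).upper()


print(abbreviate_no_regex("Product 190 - Version 2"))  # P190V2
```

Because every digit becomes its own piece, a number such as 190 is rebuilt from three single-character pieces, so whole numbers still survive intact. Like the expression it mirrors, the sketch does not truncate the result to 50 characters.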

  • First, thanks to all who have responded. I have tried everything suggested. I really like regular expressions, but I can appreciate the no-plugin response by manisht209, which is what I think I will go with for now.

    I did make one change to manisht209's code. I updated line 14 from "local!formattedString," to "lower(local!formattedString)," so that capital letters are kept.

    Thanks again