Need Design Suggestions

Hi All,

I need some design suggestions for my requirement below.

I have an Excel file which has around 1000 rows and 5 columns.  The first four columns are labels and the last column holds a VALUE (sample below).

My requirement is to upload this Excel file and save it into the Appian DB, but before I save this data I want to perform some validations on the VALUE column - let's say:

1. The first 5 rows should fall into a tolerance band of 100-200

2. The next 5 rows should fall into a tolerance band of 10-35

3. The first 5 rows should sum up to 100

If the values don't pass the above validations, then I need to tell the user to re-upload the Excel file with correct values.

Col 1   Col 2   VALUE
ABC     XYZ     200
ABC     XYZ     1290
ABC     XYZ     506
ABC     XYZ     6.88
ABC     XYZ     5.18

Thanks in advance


  • I'm not personally an expert in Stored Procedures, so I can't speak to whether the method suggested by Stewart above would be easier or not.  For me, though, it would be harder than simply handling the logic within Appian.

    Here's a quick breakdown of my suggested design/process flow:

    1. The user provides the Excel file on a SAIL form via a!fileUploadField() - this will include a validation that the uploaded file's extension is exactly ".xlsx" (a rough sketch of this field follows the list).  The user will click a button or submit link labelled "validate", which will submit from the form and run through a set of script tasks.
    2. The first script task will call an expression rule which uses readExcelSheet() (found in the Excel Tools plug-in) to read the contents of the Excel file, then formats it into an array of your own custom-defined CDT type.
    3. The second script task will call a second expression rule that accepts the previously-defined CDT array as an input, and (assuming it's not blank, empty, or in some other rule-out exception state) validates the values based on the logic you've defined above.  Appian expression rules can easily handle logic like your requirements (or even quite a bit more complicated) with some practice.  The expression rule would return "true" if the values pass your validation, and "false" otherwise.  This would save into a PV, e.g. "pv!isValidExcel" (a rough sketch of such a rule also follows the list).
    4. If the validation value is "false", loop back to the form and pass the failed validation state into a Rule Input, which you use in order to display some validation text to the user.
    5. If the validation passes, here you have a design choice; you can either:
      1. loop back to the form and show the user a friendly "valid data" message, followed by an additional submit button labelled e.g. "Save", which will let them submit and save the values to the database.  The advantages of this method are visibility and flexibility: it allows the user to remove their original upload and provide a new version if they're unsatisfied with it for some other reason, and they can also be required to provide other on-form details, etc.
        • (This would be my personal preference, unless required otherwise by external factors such as customer preference etc).
      2. continue straight to the Write to DB node(s) and then exit.  This method is most useful when users expect that they will usually only need a single step to upload/save a valid file, validation errors will be rare, and there isn't other on-form data that they'd be expected to provide.  It's not as good for visibility, but it minimizes clicks (from 2 clicks down to 1, anyway).
    6. Regardless of the route taken above, the process would eventually write the validated CDT data to the database per your particular requirements.
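
    To make steps 1 and 3 above a bit more concrete, here are two rough SAIL sketches.  These are illustrative only, not drop-in code: the folder constant cons!EXCEL_UPLOAD_FOLDER, the rule input names, and the CDT's "value" field are all hypothetical placeholders, and the bands/sum are just the example validations from the original post.

      /* Sketch of step 1: upload field with an ".xlsx" extension validation.
         cons!EXCEL_UPLOAD_FOLDER and ri!excelFile are hypothetical names. */
      a!fileUploadField(
        label: "Excel Upload",
        target: cons!EXCEL_UPLOAD_FOLDER,
        maxSelections: 1,
        value: ri!excelFile,
        saveInto: ri!excelFile,
        /* fv!files is available inside validations; each uploaded file
           exposes properties such as name, extension, and size */
        validations: a!forEach(
          items: fv!files,
          expression: if(
            upper(fv!item.extension) = "XLSX",
            null,
            "Please upload a file with the .xlsx extension"
          )
        )
      )

      /* Sketch of step 3: a validation expression rule, say
         rule!validateExcelValues, taking ri!rows (the CDT array from step 2)
         where each row has a numeric "value" field (hypothetical name).
         Returns true only if all three example validations pass. */
      a!localVariables(
        local!values: todecimal(ri!rows.value),
        and(
          /* 1. the first 5 values must fall in the 100-200 band */
          a!forEach(
            items: index(local!values, enumerate(5) + 1),
            expression: and(fv!item >= 100, fv!item <= 200)
          ),
          /* 2. the next 5 values must fall in the 10-35 band */
          a!forEach(
            items: index(local!values, enumerate(5) + 6),
            expression: and(fv!item >= 10, fv!item <= 35)
          ),
          /* 3. the first 5 values must sum to 100 (as per the original post) */
          sum(index(local!values, enumerate(5) + 1)) = 100
        )
      )

    The script task in step 3 would simply save this rule's Boolean result into pv!isValidExcel, and the XOR gateway after it decides whether to loop back to the form or proceed.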

    A slightly simplified version of the process flow I'm thinking of is pictured here: [process model screenshot]

  • So this works too, but if the size (number of rows and/or number of validations that need to be applied) gets very large, then you need a blindingly fast way to achieve this, and that's when Stored Procs come into play - so I guess I was pre-empting those issues by suggesting the solution I did.

  • That makes sense.  I guess it comes down to the data size/density/etc. in the poster's "1000 row" Excel sheet.  I assume it'd need some testing out first.

  • So, just to satisfy my own curiosity - I created a test Excel sheet with just over 3600 rows of data in 3 total columns, and all in one test expression I've:

    1. read the data using readExcelSheet()
    2. looped over all resultant rows and captured the data into a CDT-style dictionary
    3. pulled out the first 5 rows of data and performed some arbitrary validation on the values contained therein

    The whole shebang executes on average in less than 250 ms, which should be acceptable in all but the most extreme circumstances.  Obviously it bears further testing with a sheet with "realistic" data populated in 5 columns x 1000 rows, but, at the very least, it bodes well for the use case / solution method.
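
    In case it's useful, here's roughly what that test expression looked like in spirit.  Fair warning: the readexcelsheet() call below and the shape of what it returns are assumptions on my part (they differ between Excel Tools plug-in versions), and the field names are purely illustrative - check the plug-in documentation before reusing any of this.

      /* Hedged sketch of the timing test.  readexcelsheet()'s parameters
         and return shape are assumptions here; verify them against your
         installed Excel Tools plug-in version. */
      a!localVariables(
        /* assumption: reads sheet 1 of the document into a list of rows,
           where each row exposes its cell values by position */
        local!sheet: readexcelsheet(ri!excelDocument, 1),
        /* step 2: loop over the rows and capture them as CDT-style
           dictionaries (field names are illustrative) */
        local!rows: a!forEach(
          items: local!sheet,
          expression: {
            col1: fv!item[1],
            col2: fv!item[2],
            value: todecimal(fv!item[3])
          }
        ),
        /* step 3: pull the first 5 rows and run an arbitrary validation */
        and(
          a!forEach(
            items: index(local!rows, enumerate(5) + 1),
            expression: fv!item.value > 0
          )
        )
      )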

