Recommended approach to process a large amount of data from an Excel sheet

Hi there,

I have a question on the design of one of the projects I am working on. We have a requirement to fetch an Excel sheet (size ~12 MB, ~55,000 rows) into Appian daily from an external source and process it. I would like to know which would be the better/recommended approach:

a) Fetch and parse the Excel file through Appian (using the Excel Tools plugin). [Per my understanding, fetching and parsing 55k rows of data daily in Appian may impact its performance.]

b) Fetch and parse the Excel file through another tool such as Informatica, and implement only the workflow part in Appian.

Thanks!


  • 0
    Certified Lead Developer
    Hello Rohit,

    Processing a large amount of data will impact the performance of the Appian environment. We use the Excel Tools plugin smart services to read the Excel file and import it into the database, then make a separate call to a stored procedure to execute the business logic and refine and further process the data.
    This is one approach in which we do not hold the data in a CDT or in the process model, so it will not consume as much AMU.

    Another approach is to handle the complete operation outside the Appian environment. We can have a Java batch program / web service / file-upload-and-process program outside Appian that processes the data directly; the processing statistics can initially be stored in a table and later fetched into Appian so business users can check the processing results.

    I would suggest keeping the implementation outside Appian, which will also be a more scalable solution for future enhancements.

    Thank You
    Bibhuti
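The staging-table idea above (load raw rows into the database first, refine with business logic afterwards) can be sketched in Python. This is a minimal illustration, not Appian or plugin code: it assumes the sheet has been exported to CSV (matching the plugin's "Load CSV File to Database" path), uses stdlib `sqlite3` as a stand-in for the real target data source, and assumes a two-column `staging` table. Inserting in batches keeps memory bounded for ~55k rows.

```python
import csv
import sqlite3
from itertools import islice

BATCH_SIZE = 5000  # insert in batches so we never hold all ~55k rows in memory


def load_csv_to_staging(csv_path: str, conn: sqlite3.Connection) -> int:
    """Stream a CSV export of the Excel sheet into a staging table, batch by batch.

    Returns the number of data rows loaded (header excluded).
    """
    # Hypothetical two-column staging table; the real schema would mirror the sheet.
    conn.execute("CREATE TABLE IF NOT EXISTS staging (col_a TEXT, col_b TEXT)")
    total = 0
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        while True:
            batch = list(islice(reader, BATCH_SIZE))
            if not batch:
                break
            conn.executemany("INSERT INTO staging VALUES (?, ?)", batch)
            total += len(batch)
    conn.commit()
    return total
```

After the load, a stored procedure (or here, plain SQL against `staging`) would do the refinement, so the workflow tool only ever sees the summarized results.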
  • 0
    A Score Level 1
    in reply to Bibhuti
    Hi Bibhuti,

    Can you please let me know the Excel Tools plugin smart services for reading an Excel file and importing it into the database, which you mentioned in your comment?

    Thanks.

    Regards,
    Balaji.R
  • 0
    Certified Lead Developer
    in reply to rp_balaji
    Hello Balaji,

    You can use the "Import Excel to Database" and "Load CSV File to Database" smart services, which come with the Excel Tools plugin. These two smart services take the Excel file as input along with the number of columns to read, the sheet number, the table name, and the data source. New Column / New Column Value are marked optional in the configuration, but they require a value; otherwise an error is thrown.

    One more thing to take care of during implementation: if the Excel sheet has blank rows or columns that still carry formatting (e.g., a border colour), the smart service will import those blank rows into the table, and for blank columns it will try to find matching columns in the table and fail execution. So make sure only the exact data to be uploaded is present in the sheet.

    Thank You.
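The blank-row caveat above can be handled with a small pre-cleaning pass before the import. This is a hedged sketch, not part of the plugin: it assumes formatted-but-empty Excel rows come through a CSV export as rows of empty strings, and simply drops any row whose cells are all blank.

```python
import csv


def drop_blank_rows(in_path: str, out_path: str) -> int:
    """Copy a CSV export of the sheet, skipping rows whose cells are all empty.

    Formatted-but-empty Excel rows typically export as rows of empty strings,
    which the import would otherwise load as blank records. Returns the number
    of rows dropped.
    """
    dropped = 0
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if any(cell.strip() for cell in row):
                writer.writerow(row)  # keep rows with at least one non-blank cell
            else:
                dropped += 1
    return dropped
```

Blank *columns* are harder to fix automatically, since they imply a schema mismatch with the target table; those are better caught by validating the header row against the expected column list before the load.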
  • 0
    A Score Level 1
    in reply to Bibhuti
    Hi Bibhuti,

    Thanks for the details. I have a question: while using the smart service you mentioned, if we are processing a 12 MB file (as mentioned in the post), won't the smart service time out because of the large amount of data?

    Regards,
    Balaji.R
  • 0
    Certified Lead Developer
    in reply to rp_balaji
    Hello Balaji,

    Yes, I guess that is possible depending on heap memory usage, and it could cause a timeout as well. That is why I suggested importing the data externally using another application or tool.

    Thank You.