Recommended approach to process large amount of data from Excelsheet

rohitw536 over 7 years ago

Hi there,

I have a question on the design of one of the projects I am working on. In one of the projects, we have a requirement of fetching an Excel sheet (size ~12 MB) having ~55000 rows into Appian daily from external source and process it. I would like to know what would be the better/recommended approach -

a) fetch and parse excel file through Appian (using Excel Tools plugin). [Per my understanding, fetching and parsing 55k rows of data daily using Appian may impact the performance of Appian.]

b) fetch and parse excel file through some other tool like Informatica and implement only workflow part of it in Appian.

Thanks!

Discussion posts and replies are publicly visible

Parents

0 Bibhuti Mohapatra
Certified Lead Developer
over 7 years ago

Hello Rohit,

Processing large amount of data will impact the performance of Appian environment. We have different Excel Tolls plugin / smartservice to read the excel and import to db and then a separate call to another stored proc to execute business logic, refine and further process the data.
This can be one of the approach where we will not hold data into the CDT or in process model. Which will not consume more AMU.

Another approach is handle complete operation outside Appian environment. We can have a java batch process program / webservice / file upload and process program outside, which will process the data directly and then we can have the statistics which can be initially stored into a table and later fetched on Appian and business users and check the processing results.

I would suggest to keep the implementation outside which will also a scalable solution for the future enhancement.

Thank You
Bibhuti
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel

Reply

0 Bibhuti Mohapatra
Certified Lead Developer
over 7 years ago

Hello Rohit,

Processing large amount of data will impact the performance of Appian environment. We have different Excel Tolls plugin / smartservice to read the excel and import to db and then a separate call to another stored proc to execute business logic, refine and further process the data.
This can be one of the approach where we will not hold data into the CDT or in process model. Which will not consume more AMU.

Another approach is handle complete operation outside Appian environment. We can have a java batch process program / webservice / file upload and process program outside, which will process the data directly and then we can have the statistics which can be initially stored into a table and later fetched on Appian and business users and check the processing results.

I would suggest to keep the implementation outside which will also a scalable solution for the future enhancement.

Thank You
Bibhuti
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel

Children

0 rp_balaji
A Score Level 1
over 7 years ago in reply to Bibhuti Mohapatra

Hi Bibhuti,

Can you please let me know the different Excel Tools plugin / smartservice to read the excel and import to db which you have mentioned in your comment.

Thanks.

Regards,
Balaji.R
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 Bibhuti Mohapatra
Certified Lead Developer
over 7 years ago in reply to rp_balaji

Hello Balaji,

You can use "Import Excel to Database", "Load CSV File to Database" smart services which comes under Excel Tools plugin. These 2 smart services takes the excel file as an input along with No. of columns to read, Sheet number, Table name, data source and New Column / New Column value(the last 2 are optional mentioned during configuration, but required a value otherwise throwing error).

One more thing which needs to be take case during implementation is that, if the excel sheet is having any blank columns or rows but formatting like border colour or anything is applied then the smart service importing those blank rows into the table and also in case of blank columns it will try to find columns in table and will fail execution. Hence take case of the exact data to be present in the sheet during upload.

Thank You.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 rp_balaji
A Score Level 1
over 7 years ago in reply to Bibhuti Mohapatra

Hi Bibhuti,

Thanks for the details and have a question, while using the smart service you mentioned if we are processing a file of 12 MB (as mentioned in the post) wont the smart service gets timed out coz of large amount of data.

Regards,
Balaji.R
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 Bibhuti Mohapatra
Certified Lead Developer
over 7 years ago in reply to rp_balaji

Hello Balaji,

Yes I guess that is possible based on the Heap memory usage and cause tmeout also. So only I had suggested to import externally using other application or tool.

Thank You.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 Bibhuti Mohapatra
Certified Lead Developer
over 7 years ago in reply to rp_balaji

Just to add-on I guess there is also one more why if it needs to be handled from Appian only. The Smart service is having one more parameter batchsize which is 1000 by default. So there can be one more solution to break the large excel to chunks and then upload using apian with a safer batch size which will not consume more heap memory.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 rp_balaji
A Score Level 1
over 7 years ago in reply to Bibhuti Mohapatra

Hi Bibhuti,

Thanks for the response. I have a query here, do we need to execute the smart service in loop by setting the batch size explicitly till the last row of the file is parsed or will the smart service takes care of the batch till EOF by itself by executing the service only once?

Regards,
Balaji.R
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 Bibhuti Mohapatra
Certified Lead Developer
over 7 years ago in reply to rp_balaji

Hello Balaji,

We need to explicitly pass the batch size to process and it will not do looping automatically instead we need to create a loop until completion. But again we need to be very careful about the memory usage.
Thank You
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 rp_balaji
A Score Level 1
over 7 years ago in reply to Bibhuti Mohapatra

Hi Bibhuti,

Yes, thanks that answers my query.

Regards,
Balaji.R
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 Bibhuti Mohapatra
Certified Lead Developer
over 6 years ago in reply to rp_balaji

Hello Balaji,

Thanks. Requesting you to mark my answers if that that really helped you. :)

Regards
Bibhuti.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel
0 rp_balaji
A Score Level 1
over 6 years ago in reply to Bibhuti Mohapatra

Hello Bibhuti,

As I am not the owner of this question, system is not allowing me to mark this as answer :).
Cancel
Vote Up 0 Vote Down

Sign in to reply

Verify Answer

Cancel