We have a process report consisting of more than 1 lakh records. We are trying to create a process model that can fetch the data from report and add it to CSV file. For that, We have used "Export Process report to CSV" smart service. Now, This smart service only fetch 10000 records at a time, Due to which we are not able to fetch whole report data at once. Can someone please guide us about the best way to achieve it? Thanks in advance
Discussion posts and replies are publicly visible
Is it possible for you to logically partition the data? You can apply a filter as part of the configuration of the smart service which might allow you to extract the full data by getting it in chunks, each of which is <10k records, and then merging the output.
I can use the date base filter but than the question is how we will decide upto which limit the loop should be executed? and how can we apply the loop on that smart service?
Unless you can examine the generated CSV file generated and count the rows then perhaps you'll have to run on instance for a know set of "slices" of the data and accept that some may just be empty (e.g. if your data can be partitioned by day and you're only interested in the last 7 days then run one instance per day and then merge the 7 data-sets generated)
You say partition the data, You mean creating different report and dividing the data in them based on some filter? If such is the case, Than it will increase the manual efforts for us. Especially when we have more than 1 lac data. However, Can we add row number inside process reports? If yes, Than this will resolve my problem. As I will loop in the report based on indices and get those data.
Can you explain a bit more why you need to export all of this data? The reason I ask is that I'm guessing that there is a use case behind this question (for example, maybe you want use process data in a third-party BI tool?). Depending on what you need to do with this data, there are other options you could consider that may be better than exporting using the process report to CSV smart service.
For instance, you could:
The other thing I would suggest considering is to use a database table as a system of record rather than your process instances. Databases are much easier to query from and can handle a much larger volume of data than process instances. (I realize this may requiring refactoring of your application to achieve this, but it would make it much easier to access this data in the future).
Thanks for your reply. As you said, This is the actual implementation we are currently doing with our workflow. Initially everything was residing inside appian processes and now we have changed the implementation wherein we are now storing everything in database table. Now, As a part of release we have to take into consideration the old data that we have currently inside appian and we need to move them to database. Hence, For moving the data from Appian to DB we are doing this stuff
Ah okay, that makes sense! I'm assuming you would only have to do this once, right? Since you're changing your future processes to directly write to the database, it would be a single migration and then you would be all set.
I think your best bet is the batching that Stewart mentioned - find the shortest time period that has less than 10k rows (e.g. a month / week / day) and then run the process a few times for each time period to get all the data.
Then you can always merge the files created to have one big data export.
Yes you are right, Its an one time task. Thanks for the Suggestion Peter, I will go with this batch processing. This seems to be the best solution so far. Appreciate your help!
© 2020 Appian. All rights reserved.