Hi All, I need your advice on below approaches for handling a scenari

Hi All,

I need your advice on below approaches for handling a scenario.

Scenario: We have Header/Detail tables. Each header record can have around 100-200 detail records. The user enters the Header ID; we then need to fetch all the detail records, process each one of them, and update the database.
Approach 1: Use MNI (Multiple Node Instances), one instance per detail record, so that all detail records are processed simultaneously.
Approach 2: In a single sub-process, fetch all detail records and run each record through some business logic. Once all detail records have been processed, the CDT that holds them is used for the detail DB update (a single update containing all records).
Which of these two approaches is better for this scenario with respect to performance and platform stability? FYI, this is not a batch process but an interactive process in which the header is keyed in by the user.




  • Question: Are the detail records based on input from the header detail or does each detail record require additional information from the user? In other words, are the details templated data or differentiated?
  • The user just keys the header record's primary key into a text box; nothing else. Based on that primary key, the 100-200 detail records are fetched from the detail table and processed.
  • @georgej I would like to comment on the approaches you described and suggest a few tweaks:

    Approach 1: If by 'simultaneously processing' you mean launching all 100-200 processes at the same time (one process per detail record, so 200 processes for 200 detail records in your example), then I would suggest you refrain from doing so. Since each detail record has to be updated at the end, there is a danger of triggering 200 database update operations at the same time, which could exhaust the connection pool. You also say that each record needs to be processed. We can't judge the complexity of that processing from this statement alone, but you can measure it by checking the node execution times and the memory occupied. Bear in mind that complex processing triggered 200 times in parallel (once per detail record) could bring performance down drastically, and the end user would notice this as severe slowness until the processes complete. If the processing of each record is quite simple and the CDT (table) holds a small amount of data, Appian should be able to handle it comfortably, but do verify this. Any increase in the complexity of the processing, or in the amount of data held by each detail record, can cause severe performance bottlenecks and force significant design changes. The time required to redesign can equal, or at times exceed, the time spent on the original implementation.
    I can suggest a few tweaks to this approach:
    1. If the process is designed to run the instances one by one, it won't cause this issue. But make sure the total time taken still meets your requirements (200 processes run one after another will obviously take a while).
    2. If possible, initiate the processes by using messaging. You can find some notes on this under 'Uneven Process Instance Distribution' at https://forum.appian.com/suite/help/16.1/Appian_Health_Check.html.
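To illustrate the connection pool concern outside of Appian, here is a minimal Python sketch of capping how many detail updates run at once instead of firing all 200 together. It is not Appian configuration: `MAX_CONCURRENT_UPDATES`, `update_detail`, and the sleep that stands in for a database call are all assumptions made for the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Assumed cap, chosen to stay below a hypothetical connection pool size.
MAX_CONCURRENT_UPDATES = 5

_lock = threading.Lock()
_active = 0        # updates currently in flight
peak_active = 0    # highest number of simultaneous updates observed


def update_detail(detail_id: int) -> int:
    """Hypothetical stand-in for processing and updating one detail record."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    try:
        time.sleep(0.001)  # simulates the database round trip
        return detail_id
    finally:
        with _lock:
            _active -= 1


# The pool never runs more than MAX_CONCURRENT_UPDATES workers at a time,
# so at most that many "connections" are in use simultaneously.
with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_UPDATES) as pool:
    done = list(pool.map(update_detail, range(200)))
```

Setting the cap to 1 gives the strict one-by-one behaviour from tweak 1; a larger cap trades some load for a shorter total run time.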

    Approach 2: I would suggest you refrain from processing all the records in one go.
    I can suggest a tweak to this approach:
    1. Make sure batching is implemented. That is, if you need to work with X records, first confirm that Appian can handle them in one pass. If it can, that's fine. Otherwise split X into two halves and try again, and continue until Appian handles the process comfortably. For instance, say you have written a complex rule that operates on any number of detail records, you have put that expression rule in a script task, and you pass 200 detail records to it. Suppose the script task takes 10 minutes because of the amount of data the expression rule has to handle. This kind of processing is discouraged in Appian. Instead, fix your batch size at 100, where each batch takes just 1 minute (please note that these numbers are only illustrative). So the only tweak I would suggest to your Approach 2 is batching. And needless to say, the batches should also run one by one, not simultaneously.
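The batching idea can be sketched in plain Python as follows. This is only an illustration of the control flow, not Appian expression code: `chunked`, `process_in_batches`, and `apply_business_logic` are hypothetical names, and the batch size of 100 is the illustrative figure from above.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def chunked(records: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size records."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def process_in_batches(
    records: Sequence[Dict],
    batch_size: int,
    process_batch: Callable[[Sequence[Dict]], List[Dict]],
) -> List[Dict]:
    """Run the batches strictly one after another and collect the results."""
    results: List[Dict] = []
    for batch in chunked(records, batch_size):
        results.extend(process_batch(batch))
    return results


def apply_business_logic(batch: Sequence[Dict]) -> List[Dict]:
    """Hypothetical per-batch rule: mark every record as processed."""
    return [{**record, "status": "PROCESSED"} for record in batch]


# 200 detail records handled as two sequential batches of 100.
details = [{"id": i, "status": "NEW"} for i in range(200)]
updated = process_in_batches(details, 100, apply_business_logic)
```

If a batch of 100 still turns out to be too slow, only the `batch_size` argument needs to change.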
  • Now, you may wonder how to confirm that Appian can handle the process and all the operations in it comfortably. The Appian Health Check answers this: forum.appian.com/.../Appian_Health_Check.html. Here are a few examples:
    Example 1: Say you have opted for Approach 2 (without any tweaks) and written 200 detail records carrying a large amount of data. This operation might be flagged as High/Medium risk under the category 'Slow data store operations' at forum.appian.com/.../Appian_Health_Check.html.
    Example 2: Say you have opted for Approach 2 (without any tweaks) and processed 200 detail records in a single shot (as you said earlier, each detail record needs some processing prior to the update). The operation might be flagged as High/Medium risk under the category 'Slow expressions' at forum.appian.com/.../Appian_Health_Check.html.

    Even if you haven't used the Health Check before, you can still arrive at a solid design by studying its concepts and the practices it suggests.

    Further, I am not sure whether you have designed processes that deal with large data sets under the Health Check's constraints. Doing so for some time will give you wider insight into, and experience with, the kind of scenario you have asked about.
  • @georgej I hope this information is of some use to you. Keep us posted on any follow-up questions or concerns; the Appian practitioners or core team here may assist you depending on the level of complexity. Let's see if we can get more valuable input from the community.
  • Agree with @sikhivahans. If you can make this use "messaging" and give the process enough time to distribute the requests, this will prevent the negative performance/memory penalties of trying to push everything at once. Here is a link for your convenience: forum.appian.com/.../Messaging_Best_Practices.html

    Pay special attention to "passing data by reference" to the sub-process and ensuring the payload itself is small.
  • Thank you Sikhivahans/Nicholas, your inputs are really valuable. Regarding messaging to even out the load across engines: I am afraid that, as I understand it, messaging may not work well if I need these processes to be synchronous. Until I have updated all 200-odd detail records, I cannot proceed further with my process. So I am not sure whether messaging, which is otherwise the best solution, suits this situation. I will try batching as you explained.
  • george - when you say that the detail records are processed, what exactly does that mean? are you simply updating a field or doing major data transformations?
  • Around 5-6 columns of each detail record will get updated after checking some logic.
  • @georgej I would suggest following the asynchronous approach, keeping in mind the memory-intensive operations we are performing. From my perspective, it's better to tell the user that activity chaining can't be offered and to run the process asynchronously, thereby giving the process enough time to complete; in parallel we can put a good design in place.

    Further, I would like to add one more input: see what the database can do for you in this use case. Some operations can be implemented more effectively in the database than in Appian because of its power and flexibility. The documentation at https://forum.appian.com/suite/help/16.1/Database_Performance_Best_Practices.html might give you an idea.
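As a rough illustration of pushing the work into the database, the sketch below updates every detail row of one header with a single set-based SQL statement instead of 200 row-by-row updates. It uses Python's built-in `sqlite3` purely for demonstration; the `detail` table, its columns, and the approval rule are assumptions, since the thread does not describe the actual schema or logic.

```python
import sqlite3


def update_details_for_header(conn: sqlite3.Connection, header_id: int) -> int:
    """Apply a (hypothetical) rule to all detail rows of one header in one statement."""
    cur = conn.execute(
        """
        UPDATE detail
           SET status = CASE WHEN quantity > 0 THEN 'APPROVED' ELSE 'REJECTED' END,
               total  = quantity * unit_price
         WHERE header_id = ?
        """,
        (header_id,),
    )
    conn.commit()
    return cur.rowcount  # number of rows the statement updated


# Demo data: 150 detail rows for header 1 and 10 rows for header 2.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE detail (id INTEGER PRIMARY KEY, header_id INTEGER, "
    "quantity INTEGER, unit_price REAL, total REAL, status TEXT)"
)
rows = [(1, i % 3, 2.0, None, "NEW") for i in range(150)]
rows += [(2, 1, 2.0, None, "NEW") for _ in range(10)]
conn.executemany(
    "INSERT INTO detail (header_id, quantity, unit_price, total, status) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)
conn.commit()

updated_count = update_details_for_header(conn, 1)
```

Whether the real checks can be expressed this way depends on how complex the per-record logic is; a stored procedure is another option when a single `UPDATE` is not enough.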