I have a lot of data that needs to be written to the database, for example an array of 24,000 CDT records. I see two approaches:
A) Write to the database in batches: pick a batch size of 200 and write 200 rows at a time, sequentially. This approach will take some time to finish writing everything.
B) Use MNI in the process model: still use a batch size, but have more than one Write to Data Store Entity smart service executing simultaneously. This increases the number of parallel writes to the database and completes the process model faster.

My question: Will parallel writes to the database using MNI affect performance? I also suspect it will open too many database connections, so is it a good idea?
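
To make approach A concrete, this is roughly what "write 200 rows at a time, sequentially" amounts to underneath. It is only an illustrative JDBC-style sketch (the table, columns, and MyCdt type are made up); in Appian the actual write would be the Write to Data Store Entity node, not hand-written code:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;
    import javax.sql.DataSource;

    public class SequentialBatchWriter {
        private static final int BATCH_SIZE = 200;

        // Hypothetical stand-in for the CDT being written.
        record MyCdt(String colA, String colB) {}

        // Approach A: one connection, rows flushed in sequential batches of 200.
        static void writeAll(DataSource ds, List<MyCdt> rows) throws SQLException {
            String sql = "INSERT INTO my_table (col_a, col_b) VALUES (?, ?)";
            try (Connection con = ds.getConnection();
                 PreparedStatement ps = con.prepareStatement(sql)) {
                int inBatch = 0;
                for (MyCdt row : rows) {
                    ps.setString(1, row.colA());
                    ps.setString(2, row.colB());
                    ps.addBatch();
                    if (++inBatch == BATCH_SIZE) {
                        ps.executeBatch();   // flush every 200 rows
                        inBatch = 0;
                    }
                }
                if (inBatch > 0) {
                    ps.executeBatch();       // flush the final partial batch
                }
            }
        }
    }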

OriginalPostID-215023


  • Just to clarify: option A uses looping, writing the data to the DB in batches. Option B uses MNI instead of looping, and it also writes 200 records at a time.
  • @chetany Hi, I would strongly recommend refraining from parallel writes. To the best of my knowledge this creates connection pool exhaustion issues, among others, which I have elaborated on in comments made some time ago at the URLs below. I am not sure whether you have already gone through those posts; if not, I suggest doing so, as I believe you might find some useful information there (a rough sketch of the connection-pool pressure follows the links):

    https://forum.appian.com/suite/tempo/entry/e-191410
    https://forum.appian.com/suite/tempo/entry/e-171106
    /search?q=OriginalPostID-189203
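
    To illustrate why parallel writes strain the pool, here is a simulation with made-up numbers only: a Semaphore stands in for a JDBC connection pool capped at 10, and 120 parallel writers stand in for MNI instances (24,000 rows in batches of 200):

      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.Semaphore;
      import java.util.concurrent.TimeUnit;

      public class PoolPressureDemo {
          public static void main(String[] args) throws InterruptedException {
              int poolSize = 10;                         // typical data-source pool cap
              int parallelWriters = 120;                 // e.g. 24,000 rows / batches of 200 via MNI
              Semaphore pool = new Semaphore(poolSize);  // stands in for the JDBC connection pool
              ExecutorService mni = Executors.newFixedThreadPool(parallelWriters);

              for (int i = 0; i < parallelWriters; i++) {
                  final int id = i;
                  mni.submit(() -> {
                      try {
                          // Each parallel write must first check out a connection.
                          if (!pool.tryAcquire(3, TimeUnit.SECONDS)) {
                              System.out.println("writer " + id + " timed out waiting for a connection");
                              return;
                          }
                          try {
                              Thread.sleep(1_000);       // pretend the batch insert takes ~1 s
                          } finally {
                              pool.release();            // return the connection to the pool
                          }
                      } catch (InterruptedException e) {
                          Thread.currentThread().interrupt();
                      }
                  });
              }
              mni.shutdown();
              mni.awaitTermination(1, TimeUnit.MINUTES);
          }
      }

    Only 10 writers make progress at any moment; the rest queue up on the pool, and the later ones time out, which is the kind of behaviour described in the posts above.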
  • @sikhivahans, Thank you. I went through those links. Your comment that 1000 records with 6 columns was identified as a health risk by Appian Health Check gives an idea of how much data can safely be written at a time.
    So I will go with the approach of sequential writes to the DB. I was only hoping that parallel writes might get the work done faster, but since they may cause issues, I would rather avoid them.
  • What's the source of the data you are trying to write? 24,000 records in a CDT at any point in time in a process is huge and will consume significant memory on the Appian engine server. Depending on how you are sourcing and feeding the data to the process that does the writing, you might want to consider the "RDBMS CSV Import Export" smart service. Although I've not used this smart service, it will make sure you are not flooding your process with a huge amount of data in the form of process variables, since the parsing and writing of the data is done in the JVM. But again, you need to be mindful of JVM memory when processing a large data set, and you should do it only during off hours.
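    Conceptually (I have not used the plug-in either, so this is only a guess at the pattern, with made-up file, table, and column names), streaming the file keeps memory flat because only one small batch is buffered at a time instead of 24,000 rows of process variables:

      import java.io.BufferedReader;
      import java.nio.file.Files;
      import java.nio.file.Path;
      import java.sql.Connection;
      import java.sql.PreparedStatement;
      import javax.sql.DataSource;

      public class CsvStreamingImport {
          private static final int BATCH_SIZE = 200;

          // Reads the CSV line by line, so memory use does not grow with file size.
          static void importCsv(DataSource ds, Path csv) throws Exception {
              String sql = "INSERT INTO my_table (col_a, col_b) VALUES (?, ?)";
              try (BufferedReader in = Files.newBufferedReader(csv);
                   Connection con = ds.getConnection();
                   PreparedStatement ps = con.prepareStatement(sql)) {
                  in.readLine();                         // skip the header row
                  String line;
                  int inBatch = 0;
                  while ((line = in.readLine()) != null) {
                      String[] cols = line.split(",", -1);
                      ps.setString(1, cols[0]);
                      ps.setString(2, cols[1]);
                      ps.addBatch();
                      if (++inBatch == BATCH_SIZE) {
                          ps.executeBatch();             // only 200 rows buffered at any time
                          inBatch = 0;
                      }
                  }
                  if (inBatch > 0) {
                      ps.executeBatch();
                  }
              }
          }
      }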
  • @prosenjitd, It is not actually 24,000; I was just explaining with an example. Currently my data is in the range of 2,000-5,000 records (it may increase depending on some factors). Also, my process model is short: it just gets data from a source, writes it to the DB, and is then terminated.
    Anyway, thank you for your suggestion of the "RDBMS CSV Import Export" smart service.
  • You must consider the growth of the data in the foreseeable future and design your solution in such a way that it handles the increase in data volume efficiently and continues to function without breaking. It is always desirable to achieve a particular piece of functionality with Appian OOTB capabilities (exactly what your solution does) as long as it is possible, and adopting this OOTB approach could be the best option in your case, but you must consider all aspects of your solution (impact of future data growth, making multiple DB calls, performance, etc.) before finalizing it. I recommend going through this link if you have not already done so - forum.appian.com/.../Transferring_Processing_Large_Data_Sets_(ETL).html.
  • For best performance, rather than taking only one of the approaches, I would recommend a hybrid approach: use MNI, but in a controlled way.
    Configure the sub-process that writes the data as MNI instances that execute one by one, pass 10 to 20 records to each instance of the DB-write process model, and configure the Write to DB node inside it as MNI that executes in parallel.

    This may be an agreeable solution for your situation (a rough sketch of the same idea in plain Java follows).
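    In plain Java terms the "controlled MNI" idea is just capping the number of concurrent writers well below the connection pool size; a sketch with made-up numbers (4 concurrent writers, each batch holding 10 to 20 records):

      import java.sql.Connection;
      import java.sql.PreparedStatement;
      import java.util.List;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.TimeUnit;
      import javax.sql.DataSource;

      public class ControlledParallelWriter {
          // The thread pool caps parallelism at 4, so the DB connection pool is never exhausted.
          static void writeAll(DataSource ds, List<List<String[]>> batches) throws InterruptedException {
              ExecutorService workers = Executors.newFixedThreadPool(4);
              for (List<String[]> batch : batches) {      // each batch holds 10-20 records
                  workers.submit(() -> writeBatch(ds, batch));
              }
              workers.shutdown();
              workers.awaitTermination(30, TimeUnit.MINUTES);
          }

          static void writeBatch(DataSource ds, List<String[]> batch) {
              String sql = "INSERT INTO my_table (col_a, col_b) VALUES (?, ?)";
              try (Connection con = ds.getConnection();
                   PreparedStatement ps = con.prepareStatement(sql)) {
                  for (String[] row : batch) {
                      ps.setString(1, row[0]);
                      ps.setString(2, row[1]);
                      ps.addBatch();
                  }
                  ps.executeBatch();
              } catch (Exception e) {
                  throw new RuntimeException("batch write failed", e);
              }
          }
      }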
  • I think creating a subprocess and then using MNI for each item in the CDT can be efficient, because Appian fetches the connection from the connection pool; it does not create a new connection every time. I hope I am making sense here.
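    Outside Appian, the reuse described above looks roughly like this (a sketch only; it assumes the HikariCP library and a made-up JDBC URL, whereas inside Appian the data source and its pool are managed for you):

      import com.zaxxer.hikari.HikariConfig;
      import com.zaxxer.hikari.HikariDataSource;
      import java.sql.Connection;

      public class PoolReuseSketch {
          public static void main(String[] args) throws Exception {
              HikariConfig config = new HikariConfig();
              config.setJdbcUrl("jdbc:mysql://db-host/appian");   // made-up URL
              config.setMaximumPoolSize(10);                      // hard cap on physical connections

              try (HikariDataSource pool = new HikariDataSource(config)) {
                  for (int i = 0; i < 1_000; i++) {
                      // close() does not tear the connection down; it returns it to the pool,
                      // so 1,000 sequential writes reuse at most 10 physical connections.
                      try (Connection con = pool.getConnection()) {
                          // ... execute one batch insert here ...
                      }
                  }
              }
          }
      }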
  • Be it RDBMS CSV Import Export or Execute Stored Procedure, if they are used to process large data sets in a single shot without batch processing, they may cause performance bottlenecks.

    As far as I know, there is a limit on the response duration of a smart service (I guess it is 4000 ms), and if this limit is exceeded we need to revisit our design; again, MNI might be one of the effective solutions. So I would suggest keeping an eye on the response duration of any plug-in before making it deal with large data sets. We had this issue earlier, and Health Check categorised the operation as 'High Risk', stating that the response duration had exceeded the limits.
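    A cheap way to keep that eye on response duration is to time each batch yourself; a sketch (the 4000 ms figure is only the guess above, not a documented limit):

      public class BatchTimingCheck {
          private static final long WARN_THRESHOLD_MS = 4_000;   // the guessed limit mentioned above

          // Wraps one batch write and flags it when it runs longer than the threshold.
          static void timedWrite(Runnable writeBatch, int batchNumber) {
              long start = System.currentTimeMillis();
              writeBatch.run();
              long elapsed = System.currentTimeMillis() - start;
              if (elapsed > WARN_THRESHOLD_MS) {
                  // Batches this slow are what Health Check flags; shrink the batch size
                  // or split the work across more iterations.
                  System.out.println("batch " + batchNumber + " took " + elapsed + " ms");
              }
          }
      }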