We need to build a management reporting system to answer various queries through our system.

We need to build a management reporting system to answer various queries about the progress of customer cases through our system. When each customer case begins, a new row will be created in the reporting table. As the case progresses, the row attributes will be updated with date & time values (and other values) to show the progress of the case and the user who completed some of the case functions. In total there will be about 40 columns of data. There will be approximately 100k rows in the table (by the time we have added data from historical and current cases).

The table will have CREATE and UPDATE operations of a frequent basis - say 10 times per minute. The READ operation will be much less frequent - say once per hour.

Are there any recommendations about using one datastore to access the table for CREATE / UPDATE and another datastore for READ?

Are there any other recommendations for achieving the Write and Read operations in the most efficient ...

OriginalPostID-208978

  Discussion posts and replies are publicly visible

  • 0
    Certified Lead Developer
    From your description the recommendation would be to have your management reports read from a database view which would handle all the effort of joins, aggregation and sorting for example. This would naturally separate the datastore from your create/edits and your reads for the reports. I can't think of any performance benefit though as all transactions will still go through the same connection pool. What style of management reports are you talking about? Are the aggregation based or going to be handling all 100k rows?
  • You do not need separate datastores.

    In case you need to aggregate or group or prefilter some data a view for reading could be a good idea. Just create the view and a matching CDT and modify the XSD to match. See Appian docs for that.

    In case you query for other fields than id, adding a index can speed up things.
  • Any reason for not using the data from Process instances?
  • 0
    Certified Lead Developer
    @navajithh data retention and performance would be major reasons I assume
  • @paule I could make a suggestion with regards to your question - Are there any other recommendations for achieving the Write and Read operations in the most efficient manner? It might be that you may know some of these already but I am trying to present a bird's eye-view.

    If it's a write operation, I would like to suggest to write the records in batches. Though you may have hundreds of thousands of records to write, set a batch size and then write those many number of records at once as specified by batch size and this batch size should be controlled by a constant. The process should be designed in such a way that only the number of records specified by batch size would be written at a time. Idea is not to engage the processor for much time and perform memory intensive operations which otherwise could create problems as specified at https://forum.appian.com/suite/help/16.1/Appian_Health_Check.html#USRX-0030 and https://forum.appian.com/suite/help/16.1/Appian_Health_Check.html#USRX-0029. I should state a example here that I have seen in our implementation - Health Check has flagged our operation of writing 1000 records (6 columns per record) as High Risk. So I would suggest checking the process execution times after setting a batch size and make sure that each operation isn't carried out for more than 2000ms (As per my knowledge, 2000ms is maximum). Finalize a batch size after running the process many times and checking the process execution times by including the worst case scenarios but the process completes before 2000ms. Whether it's a read/write operation, the operation undergoes Prepare/Execute/Transform phases from which we can also try to understand where things are going wrong if we start analysing the time being spent by these phases in various operations from the log file at https://forum.appian.com/suite/help/16.1/Logging.html#Data_Store_Performance_Logs. In case, if you have any sub-processes that needs to be run as part of the updates, try to initiate the sub-processes by making use of messaging if possible. Else there might be a chance that this may cause imbalance in Process Instance distribution among execution engines as specified at https://forum.appian.com/suite/help/16.1/Appian_Health_Check.html#DSGN-0032.

    With regards to read operation, again I would suggest reading the data in batches. Though I am not quiet sure about the reporting requirements, I could suggest you reading the data in batches. Let's say you want to surface data in a paging grid, in this case, query a few set of records (may be ten or twenty) and get the next set of records upon pagination. Remember that we can avoid the unnecessary queries by placing the query into saveInto of paging grid rather than with(). And further make sure that the interface is designed in such a way that data is obtained on demand. That is, for instance, a table might have 40 columns, but get only few columns initially and get the rest of the data by clicking a dynamic link where a query will be made in its saveInto.

    Further you may also consider the possibility of delegating the operation to database(for instance, view for read operations, stored procedure for read/write operations) if you can, as specified at https://forum.appian.com/suite/help/16.1/Database_Performance_Best_Practices.html but we may need to bear in mind again that the operation shouldn't be run for more than 2000ms as per my knowledge. I would suggest focusing on the parts that you can delegate to the database and this isn't much different when compared to suggestions made by @stefanh/@tim.clarke.

    To the best of my knowledge, as long as we delegate certain operations to the database wherever we can (off-course this needs proper analysis) and that too in a best possible way (so that the objects such as views/stored procedures are written with good quality of code) and design a process in Appian that sticks to the best practices as laid down by Health Check, there might not be any issues.