Query Rule times out due to slow db view

Query Rule times out due to slow db view (with minimal returned data):
We have a process model that uses a script node to call a query rule. The query rule looks at an MS SQL view and returns a small set of data, usually less than 10 rows with 16 columns of data. We then email this relatively small CDT of data to users. This happens in the background hourly and the UI is invisible to the users, so the length of time the process takes is somewhat irrelevant to us (it runs async and could take 1-2 minutes without anyone caring).

The view's business logic performs a left join between 2 tables and includes about 6 conditions in the WHERE clause. We currently do not have any indexes on our base tables; in development this wasn't a problem, but after deploying this particular view to production we are seeing the script node time out most of the time. In prod we are joining 2 tables of 100 thousand and 1.6 million rows (hinting at table scan issues).

The docs for 7.11 show the setting conf.data.query.timeout=10, which I believe is what we are using. Run directly against the production db, the query takes 4-25 seconds (sub-second in dev). While we hope to create indexes on the tables to speed up performance, in the short term we would like to get this running. In theory we could change the system timeout, but this could affect other apps on the server.

1) Does the conf.data.query.timeout setting apply to ALL queries in Appian, or just to query rules?
2) Would rewriting the query rule as a queryEntity allow us to bypass the 10-second limit?
3) Is there another way to keep the process model's script node from crashing while it waits for the query rule to receive results from the db?

Yes, optimizing the db to return data in <10 seconds would be best, but since we don't care how long it runs, is there a way to quickly rewrite the query rule, script node, or process model so it doesn't break and send an alert at 10 seconds? (Assuming other ways of querying are not subject to the 10-second config limit.)

OriginalPostID-221207
  • Certified Lead Developer
    @ryanh Per my understanding, a!queryEntity does not bypass the time limit. If you're looking for a way to fix the issue without optimizing the view, the documentation page on Query Rule Limits (https://forum.appian.com/suite/help/16.2/Post-Install_Configurations.html#Query_Rule_Limits) has a good suggestion: "consider using the query rules paging parameter to return less data (or return data in batches) and therefore avoid the limit"
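    For illustration only: conceptually, each page of a batched query corresponds to a limited read like the following (SQL Server 2012+ OFFSET/FETCH syntax; the exact SQL Appian generates may differ, and the column list here is an assumption):

    -- first batch of 5 rows; subsequent batches advance the OFFSET
    SELECT a.filenum_txt, a.batchnum_txt  -- plus the remaining columns
    FROM tableA AS a
    ORDER BY a.filenum_txt, a.batchnum_txt
    OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;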
  • Certified Lead Developer
    I know this is not what you are asking, but keep in mind that one bad query (especially one that runs repeatedly) can affect all activity within the db. Running top will tell you how much this query is affecting the db, and an explain plan will show you where to focus your tuning efforts.
    You may know all this; I just want to say that upping limits to let this continue could cause downstream impacts to any processes writing to or querying the same tables, as well as to other usage within the db.
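    For example, in MS SQL the numbers can be gathered along these lines (a sketch; substitute the view's actual SELECT for the placeholder comments):

    -- per-table I/O plus CPU/elapsed time for the slow query
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;
    -- ...run the view's SELECT here, then check logical reads per table...
    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;
    GO
    -- or request the estimated plan without executing the query
    SET SHOWPLAN_ALL ON;
    GO
    -- ...the view's SELECT here...
    GO
    SET SHOWPLAN_ALL OFF;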
    Good luck!
  • @ryan
    It looks like there could be a problem in your view statement, and this might cause performance issues going forward. When writing a VIEW we need to ensure there are no direct SELECT statements; if you are joining more than, say, 5 tables, try building a result set and giving it an alias inside your view statement where possible. Make sure all the tables involved in the view have proper indexing.

    Once this is properly aligned, use queryEntity if you do not need all 14 or 20 columns of your view; otherwise use a query rule.

    1) AFAIK, that 10-second configuration is for query rules, but queryEntity always has the upper hand.
    2) I have seen queryEntity used in reports that took more than 20 seconds.
    3) One more option could be to get the data in batches inside your process model (using paging parameters such as startIndex 1 to 5, then 6 to 10, and so on), append each batch to a result set, and use that.

    Hope something above helps!
  • Another possible solution: try restructuring your source information into datamarts, which generally lend themselves better to querying and reporting. Many times queries are performed directly on transactional data, which is usually optimized for insertion (e.g., activity/audit logs); unfortunately, this same structure can have detrimental effects on querying. Can the large tables that you mentioned be broken down into smaller tables that can be more easily queried?
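    As a rough sketch of the idea (all names invented for illustration), the hourly report could read from a small table that a scheduled job rebuilds from the transactional tables:

    -- reporting table shaped for this query rather than for inserts
    CREATE TABLE risk_review_mart (
        filenum_txt  VARCHAR(50) NOT NULL,
        batchnum_txt VARCHAR(50) NOT NULL,
        fromdate_dte DATE        NOT NULL
    );

    -- scheduled refresh: keep only the rows the report actually needs
    TRUNCATE TABLE risk_review_mart;
    INSERT INTO risk_review_mart (filenum_txt, batchnum_txt, fromdate_dte)
    SELECT a.filenum_txt, a.batchnum_txt, a.fromdate_dte
    FROM tableA AS a
    WHERE a.previousstatus_txt = 'Suspended';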
  • I am going to try to get with our DBAs today to see if we can get some indexes built. Up until now this hasn't been an issue, but when this code moved to production a day ago we started seeing it: the query ran fine, in under 4 seconds, early in the morning and late at night when others weren't hitting the db, but during the day it dragged on, the node timed out, and the process failed with it.

    In terms of using paging on the data: the problem is that we are only attempting to return 0-10 rows, so it isn't the quantity being returned but the length of time the query takes. Most of the performance problems appear to be caused by the very last AND line, where we check that the value is not one of 3 unique values. The date column doesn't seem to cause any appreciable slowdown. Here is a simplified version:

    SELECT /* 16 columns, elided */ ...
    FROM tableA AS a
    LEFT JOIN tableB AS b
        ON a.filenum_txt = b.filenum_txt
        AND a.batchnum_txt = b.batchnum_txt
    WHERE b.riskcallingid_int IS NULL
        AND a.riskapprovaldecision_txt = 'Pending'
        AND a.suspendedcategory_txt IN ('Risk', 'Multiple')
        AND a.fromdate_dte >= '2016-03-30'
        AND a.previousstatus_txt = 'Suspended'
        AND a.processstatus_txt <> 'Complete'
  • Certified Lead Developer
    @ryanh It might be worth considering in the long term using lookup tables for the different statuses and categories, and having table A use foreign keys instead of storing the text names. Comparing integers (especially once you add indexes) will perform significantly better than comparing text strings on such a large table.
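    As a rough T-SQL sketch of that normalization (table, column, and constraint names here are assumptions):

    -- hypothetical lookup table for the status values
    CREATE TABLE process_status (
        id   INT IDENTITY(1,1) PRIMARY KEY,
        name VARCHAR(50) NOT NULL UNIQUE
    );

    -- table A then stores a small, indexable integer instead of repeated text
    ALTER TABLE tableA ADD processstatus_id INT
        CONSTRAINT FK_tableA_process_status REFERENCES process_status (id);
    CREATE NONCLUSTERED INDEX IX_tableA_processstatus_id
        ON tableA (processstatus_id);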

    In the meantime, though, how many rows are returned if you remove the last line that seems to be causing the problems? It might be an option to return all results regardless of processstatus_txt (maybe combined with the batching approach if there are a lot of rows) and finish the filtering in Appian. It will probably be slower overall, but it might at least reduce how often your script task errors.
  • Hi Ryan, if indexes are not helping much then, depending on the frequency, see whether a materialized view could be an option, refreshed on a nightly basis; that may improve performance and load, since the referenced tables are huge.
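    For reference, a minimal sketch of the closest SQL Server equivalent, an indexed view (all names assumed). One caveat: indexed views cannot contain OUTER JOINs, so the LEFT JOIN logic would need restructuring first:

    CREATE VIEW dbo.v_risk_review WITH SCHEMABINDING AS
    SELECT a.filenum_txt, a.batchnum_txt, a.fromdate_dte
    FROM dbo.tableA AS a
    WHERE a.riskapprovaldecision_txt = 'Pending';
    GO
    -- the unique clustered index is what materializes the view
    CREATE UNIQUE CLUSTERED INDEX IX_v_risk_review
        ON dbo.v_risk_review (filenum_txt, batchnum_txt);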
  • Removing the last line, where we check for Complete, doesn't affect the results much if at all; we get fewer than 10 rows back either way.

    I created an index on each of the 2 production tables this query joins, on the filenum_txt and batchnum_txt columns (each index covers the combination of the 2 columns). During normal daytime load in production the query was running 15-30 seconds; immediately after adding the indexes to the 2 tables it dropped to <1 second. Thankfully, the 2 indexed columns are the main columns used in the majority of our queries, so the indexes should help other parts of the app as well (though since we don't normally join these tables, the speed improvements there won't be nearly as big).
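    In T-SQL terms the composite indexes look roughly like this (table and index names simplified to match the query above):

    CREATE NONCLUSTERED INDEX IX_tableA_filenum_batchnum
        ON tableA (filenum_txt, batchnum_txt);
    CREATE NONCLUSTERED INDEX IX_tableB_filenum_batchnum
        ON tableB (filenum_txt, batchnum_txt);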

    I have since added a number of other indexes in development and am testing with them. As for the other apps my coworkers build/support, I am pushing them to evaluate their larger tables that are heavily queried.

    Our monthly system health check report from Appian shows only a small number of queries that run longer than a few seconds, which is good; this will be a starting point for us.