KPI on more than 5000 records

Certified Senior Developer

For a dashboard with a series of KPIs, we have to process more than 5,000 entries, because we need to calculate the median, mode, min, and max over a time range on an actions log.

How can we achieve that using only records, without resorting to a stored procedure?


    Certified Lead Developer

    Median can be determined if you have a totalCount already. You can set the startIndex of the query to the exact middle element of your data using a!pagingInfo(startIndex: totalCount/2, batchSize: 1). This requires a separate query before your KPIs load, but it works just as well for a date/time field as for any other field.
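    A minimal sketch of that two-query approach, assuming a hypothetical Actions Log record type (recordType!ActionLog) with a createdOn field — all names here are placeholders, not your actual record type:

    ```
    a!localVariables(
      /* Query 1: fetch only the total row count (no data rows) */
      local!totalCount: a!queryRecordType(
        recordType: recordType!ActionLog,
        pagingInfo: a!pagingInfo(startIndex: 1, batchSize: 0),
        fetchTotalCount: true
      ).totalCount,
      /* Query 2: sort by the KPI field and fetch the single middle row */
      local!medianRow: a!queryRecordType(
        recordType: recordType!ActionLog,
        fields: {recordType!ActionLog.fields.createdOn},
        pagingInfo: a!pagingInfo(
          startIndex: ceiling(local!totalCount / 2),
          batchSize: 1,
          sort: a!sortInfo(
            field: recordType!ActionLog.fields.createdOn,
            ascending: true
          )
        )
      ).data,
      local!medianRow
    )
    ```

    Note this returns the middle element by position; for an even totalCount a statistician's median would average the two middle values, which would take one more query with batchSize: 2.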

    Mode is a query that groups on the column you need the mode from, then selects the data element with the highest count. This might be doable directly within the KPI element, but it's probably easier to do as a separate query. If you reverse-sort by the alias that holds the COUNT, you only need a paging size of 1. However, it is very likely that a date/time aggregation (one that includes seconds) with COUNT will be prohibitively expensive, even with the interval parameter of the grouping function. So it's worth testing query performance on your actual dataset or a representative sample of it. If you find that the query causes slowness in the data fabric (even for other apps), it may be worthwhile to add a sync-time custom record field that rounds the timestamp to the nearest day, hour, or minute, depending on your requirements. Your KPI would then determine the mode from the custom record field rather than the original.
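    The mode query could look roughly like this — again a sketch with placeholder record type and field names, and an assumed "DATE" interval to coarsen the date/time grouping:

    ```
    /* Group by the target column, count rows per group,
       sort by the count descending, and take the top row. */
    a!queryRecordType(
      recordType: recordType!ActionLog,
      fields: a!aggregationFields(
        groupings: a!grouping(
          field: recordType!ActionLog.fields.createdOn,
          alias: "actionDate",
          interval: "DATE"  /* coarsen to the day to keep the aggregation cheap */
        ),
        measures: a!measure(
          function: "COUNT",
          field: recordType!ActionLog.fields.id,
          alias: "rowCount"
        )
      ),
      pagingInfo: a!pagingInfo(
        startIndex: 1,
        batchSize: 1,
        sort: a!sortInfo(field: "rowCount", ascending: false)
      )
    ).data
    ```

    The single returned row's actionDate is the mode; rowCount tells you how often it occurred.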

    I know you didn't list this one, but it often comes up alongside the other statistics. If you need standard deviation, it is somewhat tricky to do with the Data Fabric, and I wouldn't recommend it for a date/time field. I have accomplished this in the distant past with a database view and MySQL's built-in functions. The main issue with this approach is that the standard deviation is calculated *before* Appian filters are applied - i.e. it is computed over all rows within the view on every query to the view, and only then are the filters for specific rows applied. There are some SQL-specific workarounds that include wrapping the view in a stored procedure and other shenanigans. Hopefully your requirements don't send you down that rabbit hole.
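    For reference, the view approach is roughly this - a sketch assuming a hypothetical action_log table with a numeric duration_mins column (table and column names are illustrative):

    ```
    -- STDDEV_POP runs over the rows in each group of the view itself,
    -- before any Appian record filters are applied to the view's output.
    CREATE VIEW action_log_stats AS
    SELECT
      action_type,
      STDDEV_POP(duration_mins) AS duration_stddev
    FROM action_log
    GROUP BY action_type;
    ```

    You would then sync the view as a record type and read duration_stddev like any other field, accepting that row-level Appian filters cannot narrow the population the deviation was computed over.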

