Use Database and Materialized Views in Appian

Database Views

As a general recommendation, design your applications using Appian Records with Data Sync enabled. The Data Sync feature provides the ability to relate and transform data using record relationships and custom record fields, even across multiple data sources. It also caches your source data in Appian and provides the Appian Data Fabric framework to efficiently query data. Keep in mind though that it is still recommended to follow database design best practices when using synced records to store your data. If you still need to use a database view, this play provides a basic introduction to creating views for Appian applications.

Reasons for Use

Database views hide the complexity of multiple joins. Database views optimize joins or lookups that would be slow using CDTs and query rules.
Database views are useful for casting certain database types to datatypes that Appian recognizes.
Optimize joins or lookups that would be slow using CDTs and query rules

View Basics

Refer a tutorial site such as w3 schools for details on creating views.
The following tips are generic tips for creating views. Your actual database design (such as using an index or not) will vary based on your schema and your database vendor
- Be sure to join on PK / FK relationships.
- Use meaningful table aliases when creating database views
- Add Indexes to joins using columns that are not PK or FK.
- Run an explain/execution plan on the query with the volume of data expected in production a year after go-live to ensure that the view will perform well under expected data volume.
- Incorporate testing functionality that uses the view as part of your Performance Test Plan.
- Refer to Database Schema Best Practices when designing your view and mapping to a CDT.
- To troubleshoot poorly performing views refer to the performance troubleshooting section of the Database Performance Best Practices.
- Ensure to only use a FULL JOIN when necessary, a properly used LEFT JOIN can significantly reduce the data set and therefore reduce the work the DB has to do
- When possible, database views should generally contain date range filters to reduce size of response
- Archive (or delete, if applicable) older data so that the current view is always procured/maintained and reduced to what is necessary/relevant for querying purposes

Types of Views

A regular view is virtual only and runs the query definition each time it is accessed.
Materialized views are cached and refreshed periodically based upon the query definition.
Converting a view into a materialized view is one of the options to improve the performance of a slow running view.

Materialize Views

One of the primary reasons developers choose to materialize views is performance. Thanks to the Appian Data Fabric, the ability to create complex data structures and related queries is easier and more performant than ever. That said, there are constraints related to the Appian Data Fabric that make it not a good fit for certain use cases. In the event that you need to use the database, rather than synced records and record relationships, then materializing views can improve performance.

Materializing views can be helpful when the requirements for a view result in the need for any of the following:

Large number of records and/or columns with the following calculations causing slow performance:
- GROUP BY
- UNION
- DISTINCT
Searching or sorting against calculated columns

The best way to determine if Materializing is necessary is to perform Database Volume Testing as early as possible in the development cycle. It is also recommended to review the data model guidance to ensure that your data model is scalable.

Ensure Proper Performance

When you have a potentially slowly performing view you will want to test all of the following to ensure proper performance in production:

Sort by all of the sortable columns in the grid
Test grid paging
Test filtering by all provided search filters that will be used against the view.

If you find your view is slow you will want to try all of the following before resorting to materializing the view:

Add relevant indices and viewing the explain plan to ensure performant joining
Investigate rewriting portions of the views to improve performance (google is your best friend here)
Refactor the view into multiple views or refactor the queries to the slow view into multiple simpler and faster queries from smaller views or directly from the tables
Push back on requirements that necessitate complicated views

Materializing Views

When all other avenues have been exhausted the last step is to "materialize" the view. This is in quotations because MySQL does not technically support materialized views. However, the same effect can be achieved by creating a table which stores the same data as the view and is updated each time any of the underlying tables are updated. Note that "materialized" views may not be the best solution if the data in the view is frequently written to the database or if views require real time updates. You will need to test your views after they are materialized to ensure that there is no negative performance impact.

The steps to materialize a MariaDB SQL view are as follows:

Create a new table with the same name as your view, except instead using _M_V_ instead of _V_, e.g. APP_M_V_VIEW instead of APP_V_VIEW.
- The easiest way to do this is to export the structure of the view and then copy the Create table statement from the export sql and rename the table. This will ensure your table has the same columns as the view in the same order. Make sure that the table has a unique primary key and appropriate indices.

Create a stored procedure which can be called to update the materialized view. The inputs to the stored procedure should be the identifiers of tables that are updated throughout your process that have an effect on the materialized view. If all of the inputs are 0 or empty, then the stored procedure should perform a complete refresh on the materialized view.

The stored procedure will accomplish the updates to the M_V by using INSERT INTO <M_V> (<COLUMNS>) VALUES <SELECT SQL FROM VIEW> ON DUPLICATE KEY UPDATE.
- Information on how this works in MySQL can be found here.
It is important that you do not use SELECT <X,Y,Z> FROM <VIEW> WHERE <FILTERS> and instead use the actual SQL that makes up the view which you can then apply the filters to. If you simply select from the view with filters the entire view will have to be calculated first before those filters are applied if your view does not support MERGE algorithm
- Aggregate functions or window functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
- DISTINCT
- GROUP BY
- HAVING
- LIMIT
- UNION or UNION ALL
- Subqueries in the select list
- Assignments to user variables
- References only to literal values (in this case, there is no underlying table)
If you have filters on your view which use data that can be modified by the application (only include rows where IS_ACTIVE=true for example) but you do not have the IDs of rows which should be removed from your M_V table, you will need to account for possible changes in your stored procedure.
- Example:
- ```
IF i_objectId IS NOT NULL THEN
               DELETE FROM APP_M_V_VIEW  WHERE ID_PK IN
(
    SELECT ID_PK FROM APP_TABLE where IS_ACTIVE<>1 and ID_PK=i_obejctId
);
             END IF;
```
If you have hard deletes in your applications, you will need to include that deletion in the stored procedure.
- Example:
- ```
IF i_deletedObjectId IS NOT NULL THEN
DELETE FROM APP_M_V_OBJECT WHERE ID = i_deletedObjectId;
END IF;
```

Example Stored Procedure:

APP needed to materialize its APP_V_VIEW view. This view's definiton uses data from the APP_PARENT, APP_CHILD, and APP_OBJECT tables and therefore the stored procedure APP_UpdateView has inputs for each of those objects so when a change is made to any of them it is easy to recalculate only the rows in the M_V affected by that update.
The stored procedure would be created as follows. Note that the highlighted section is where only the input data is updated. If this is not included, the stored procedure will take as long to execute as the view takes to load.

DELIMITER $$

CREATE PROCEDURE `APP_UpdateView`(IN `i_childId` INT, IN `i_parentId` INT, IN `i_objectId` INT)
NO SQL
BEGIN

IF (
(i_childId IS NULL OR i_childId = 0)
AND (i_parentId IS NULL OR i_parentId = 0)
AND (i_objectId IS NULL OR i_objectId = 0)
) THEN
TRUNCATE APP_M_V_VIEW;

END IF;

INSERT INTO
APP_M_V_VIEW (
OBJECT_ID_PK,
CHILD_ID_FK,
OBJECT_TITLE,
OBJECT_DESCRIPTION,
OBJECT_OWNER,
OBJECT_DUE_DATE,
IS_INITIATED,
CHILD_SHORT_DESCRIPTION,
CHILD_COMMITMENT,
PARENT_START_DATETIME,
PARENT_IDENTIFIER,
IS_PARENT_HIDDEN,
PARENT_TITLE
)

SELECT
`obj`.`OBJECT_ID_PK` AS `OBJECT_ID_PK`,
`obj`.`CHILD_ID_FK` AS `CHILD_ID_FK`,
`obj`.`TITLE` AS `OBJECT_TITLE`,
`obj`.`DESCRIPTION` AS `OBJECT_DESCRIPTION`,
`obj`.`OWNER` AS `OBJECT_OWNER`,
`obj`.`DUE_DATE` AS `OBJECT_DUE_DATE`,
(
CASE WHEN(`obj`.`IS_INITIATED` = 1) THEN 'Initiated' ELSE 'Not Initiated' END
) AS `IS_INITIATED`,
`ai`.`SHORT_DESCRIPTION` AS `CHILD_SHORT_DESCRIPTION`,
`ai`.`COMMITMENT` AS `CHILD_COMMITMENT`,
`parent`.`EXPECTED_START_DATETIME` AS `PARENT_START_DATETIME`,
`parent`.`OBJECT_NUMBER` AS `PARENT_IDENTIFIER`,
`parent`.`IS_HIDDEN` AS `IS_PARENT_HIDDEN`,
`parent`.`TITLE` AS `PARENT_TITLE`

FROM
`Appian`.`APP_OBJECT` `obj`
LEFT JOIN `Appian`.`APP_PARENT_CHILD` `ai`
ON `ai`.`PARENT_CHILD_ID` = `obj`.`CHILD_ID_FK`
LEFT JOIN `Appian`.`APP_PARENT` `parent`
ON `parent`.`PARENT_ID_PK` = `ai`.`PARENT_ID_FK`

WHERE`obj`.`IS_ACTIVE` = 1)
AND(`ai`.`IS_DELETED` <> 1
AND`parent`.`IS_ACTIVE` = 1
AND IF(ISNULL(i_childId) OR i_childId = 0, 1, `obj`.`CHILD_ID_FK`) = IF(ISNULL(i_childId) OR i_childId = 0, 1, i_childId)
AND IF(ISNULL(i_parentId) OR i_parentId = 0, 1, `ai`.`PARENT_ID_FK`) = IF(ISNULL(i_parentId) OR i_parentId = 0, 1, i_parentId)
AND IF(ISNULL(i_objectId) OR i_objectId = 0, 1, `obj`.`OBJECT_ID_PK`) = IF(ISNULL(i_objectId) OR i_objectId = 0, 1, i_objectId)

GROUP BY
`obj`.`OBJECT_ID_PK`

ON DUPLICATE KEY UPDATE
OBJECT_ID_PK= VALUES(OBJECT_ID_PK),
CHILD_ID_FK= VALUES(CHILD_ID_FK),
OBJECT_TITLE= VALUES(OBJECT_TITLE),
OBJECT_DESCRIPTION= VALUES(OBJECT_DESCRIPTION),
OBJECT_OWNER= VALUES(OBJECT_OWNER),
OBJECT_DUE_DATE= VALUES(OBJECT_DUE_DATE),
IS_INITIATED= VALUES(IS_INITIATED),
CHILD_SHORT_DESCRIPTION= VALUES(CHILD_SHORT_DESCRIPTION),
CHILD_COMMITMENT= VALUES(CHILD_COMMITMENT),
PARENT_START_DATETIME= VALUES(PARENT_START_DATETIME),
PARENT_IDENTIFIER= VALUES(PARENT_IDENTIFIER),
IS_PARENT_HIDDEN= VALUES(IS_PARENT_HIDDEN),
PARENT_TITLE= VALUES(PARENT_TITLE);

IF
i_childId IS NOT NULL THEN
DELETE FROM APP_M_V_VIEW WHERE OBJECT_ID_PK IN
(SELECT OBJECT_ID_PK FROM APP_OBJECT where IS_ACTIVE<>1 and CHILD_ID_FK=i_childId);
END IF;

END$$
DELIMITER ;

Execute the new stored procedure in all process models which update the underlying tables referenced in the view. After the data writes in the process model, call a sub-process that calls your stored procedure to update the M_V, passing in the necessary input IDs.
- Example subprocess for materialized view updates:
Create a scheduled process model that runs once nightly which completely refreshes the materialized view. This will be helpful for picking up any rarely updated fields like reference values or data updated manually in the database.
- Example:
If you are already using views and it would be unreasonable to replace all CDT references to use a new CDT for the M_V table, the only change required in the Appian objects is to change the table that is mapped in your view CDT XSD to the M_V table.

Topics