Deleting Data in Appian Applications

Introduction

Are you having trouble finding data to debug an issue? Are your queries to a transaction table slow? Did you get a requirement to display activity history for a user? Your Appian application’s data model design and data deletion strategy can have a major role in how these questions or concerns can be addressed. When designing the data model for your Appian application, there are different approaches to consider for data deletion use cases. The approach you choose will probably depend on the nature of the data, the number of transactions, and  the business rules associated with the use case. The most common approaches include:

  • hard deletes - actually deleting rows from the table.
  • soft deletes - marking rows as deleted using a flag column e.g. isDeleted or isActive.
  • audit log - creating a separate audit table for recording the nature of transactions taking place on the original table.

This article describes the pros and cons of  each deletion approach and provides recommendations on when to use these approaches. For the purposes of this article, the relational database data model from the Quote to Order case study is referenced for examples.

Quote to Order ERD

Figure 1: Quote to Order ERD

Hard Deletes

The term ”hard delete” is generally accepted to describe the irrevocable removal of data.  When a value or row is hard-deleted from a table, it means a DELETE operation has been executed on a row in the table. This is typically the easiest and fastest option for developers, independent of business rules. 

NOTE: It is recommended to use the Delete from Data Store Entities smart service for hard deleting data from Appian applications. Make sure the deletes are based on an indexed field (ideally the primary key field). You may run the EXPLAIN statement to see the execution plan prior to the delete. If you are deleting in bulk and  there’s HA setup at the database layer, it’s preferred to purge in batches (limit to 1000 rows per batch) and during off hours to minimize the impact (database locking) on the application . Be aware of the foreign key constraints and cascade attribute settings in the database schema and make sure your data is backed up prior to the delete in case you need to restore it.

Pros

  • Transaction times are faster.
  • Lets you design simple unique and check constraints on table columns.
  • Better data integrity and cardinality of columns in tables.
  • Design allows for creating foreign keys with cascading delete.

Cons

  • Risk of data loss due to defects in the application.
  • Lack of traceability of data for business reporting or technical debugging.

When to Use

  • A hard delete strategy is better suited for transactional tables with possibly high transaction volume and which do not need historical data for business or technical purposes.
  • In reference to the Quote to Order tables, an example of a hard delete would be deleting a draft order from the QTOT_ORDER table when a user cancels/deletes a draft order.

Soft Deletes

By contrast to a hard delete, data is typically retained according to a  soft delete strategy. Instead of deleting the data, the table row to be soft deleted  is marked with a value in a dedicated column.. This option seems less risky in terms of losing data due to hard deletes but could be risky in terms of unintended data showing up in reports or other parts of the application or compromising the performance of the table.

Pros

  • Safer approach in general from a perspective of making sure data is not lost.
  • Easy to undo delete operations or reactivate the row if needed.
  • Entire history of the table can be traced in a single table.

Cons

  • Tables can grow rapidly and can lead to unnecessary disk usage and bad performance.
  • Complicates maintenance of data integrity due to the need of complex unique indexes and check constraints.
  • Risk of defects being introduced due to designers missing to code the isActive flag into read/write database operations from the application e.g. database view cdts/tables missing the isActive flag field, CDT value used in the write to database smart service missing a value for the isActive flag, etc.
  • Design to support ad-hoc reporting is demanding, in terms of development, with the need to have database views to filter data. Significantly complex views would require materialized views.
  • Low cardinality of the isActive flag column might impact performance of data retrieval despite indexing on the table, especially when 99.99% of the rows have inactive status.

When to Use

  • Consider a soft delete approach for operational or reference data as opposed to transactional data or relational mapping tables.
  • For reference data tables instead of creating just an isActive flag to indicate a deleted entry, a START_DATE and END_DATE pair of columns provides a more granular control over retrieval of the data and thus the possibility of viewing the historical usage of your data. See QTOT_REFERENCE table in the ERD above.

Audit Log

A less invasive approach to deleting data in a table involves creating an audit log of delete operations on the table. It is very common for application designers to have an auditing strategy to record changes done by users in the application. So the audit log approach for recording deletes that happen on a table, as a result of a user’s action or as a business rule, fits right into that strategy. Detailed guidance on implementing audit tables for Appian applications can be found in the Auditing Guidance docs and the Field Level Auditing play.

Pros

  • Better data reporting capability for original data and historical reporting of audit data.
  • Separate table for audit allows the original table to perform better due to less accumulation of unnecessary data.
  • Troubleshooting data issues becomes easier given the data in audit tables.
  • Audit tables can have more detail than just a isActive/isDeleted flag e.g. old value, new value, last updated on, update operation - create, update, delete, etc. See QTOT_ACCOUNT_AUDIT table in the ERD above.

Cons

  • Longer development time.

When to Use

  • For transactional tables e.g. QTOT_ACCOUNT or QTOT_ORDER use the hard delete approach in conjunction with an audit table implementation. 
  • Applications typically have event tables to track business events in general. The audit for delete operations can be stored in such event tables. In such implementations, designers would have to transform the data from the original table structure to the event table structure or some other structure of the audit table. Using a JSON structure for the audit data eliminates the need to transform it and makes the implementation easier.

Hybrid Approach

The Quote to Order data model includes the table QTOA_HISTORICAL_ORDER as part of a data archival strategy. This type of strategy helps with overall application performance.   “Mirror” tables that match the structure of the original transaction tables are core to this approach. With these tables in place, a design can be employed in which the application switches between them to access appropriate data depending on its age. For example, in the Quote to Order data model the QTOT_ORDER table has a corresponding QTOA_HISTORICAL_ORDER table. Data from the QTOT_ORDER table is archived to the QTOA_HISTORICAL_ORDER table if it is older than six months and eventually to long term storage after it ages out further. This keeps the size of the QTOT_ORDER table reasonably low leading to better performance. 

Conclusion

A data deletion strategy should be based on the nature of the data and the business case. In summary, use a hybrid implementation approach wherein: 

  1. Hard deletes are performed on transaction tables in conjunction with audit tables tracking the delete operations. The transaction tables are archived periodically based on the age of the data.
  2. Soft deletes are performed on operational or reference tables by marking the data with columns such as START_DATE and END_DATE.