Field Level Auditing

This article discusses methods for achieving field level auditing. That is, how to keep records of changes to any field in a table, including who made the change and when. The main engineering guidance on auditing can be found here.

Table Structure

Our recommendation for storing the data on updates is to create a history table for every table on which you wish to carry out field level auditing. For example, we might have a customer table which looks a little like this:

For this table, we should construct a history table for customer, called audcustomer, structured like this:

That is, we have every field from the main table plus a new primary key. The primary key for the main table is a foreign key in the history table. Every time the main table is updated, we insert a copy of the new values into the history table. 

The fields ‘capturedon’ and ‘capturedby’ allow us to keep track of the last user to edit the row in the main table and when they did so (it represents when this row was captured). In the example above, the user John Jones originally set up the customer Malachi Smith as living in London, and he did this on October 1, 2021. (In the real world you will almost certainly use a datetime field for capturedon, rather than a date). Then, on October 7, 2021, Joanna Bloggs updated Malachi’s town to be Liverpool. 

Depending on the use case, you may need additional fields which are similar to capturedon and capturedby to be stored. For example, if you are using soft deletes, you may wish to have deletedon and deletedby. If you frequently need to access the creation properties of an object, it might make sense to have createdon and createdby in each table as well. If a data row might be replaced by another, having updatedon and updatedby might be needed to keep track of when a row is no longer in use. For more on data modeling choices, see this article.

The data in audit history tables should be displayed on a table by table basis. Attempting to display many or all audit history tables in one place risks performance problems - just as you wouldn’t try to display too many standard tables in one place.

Deletion

There are a few options for deletion, and this will depend heavily on the use case and client requirements. Using soft deletes with using a column ‘isdeleted’ or alternatively ‘deletedon’ and ‘deletedby’ has the benefit that the entire history of one table row can be traced in a single table.

Often when implementing auditing, there will be use of an events table so that not just changes to fields are recorded, but also events like approvals or receipt of documents. In such cases, deletion of rows can be stored in the same events table.

A final option, when using hard deletes and when there is no generic events table, is to add a generic deletion table. This table requires auddeleteid, tablename, deletedid, deletedby and deletedon as fields. The field ‘auddeleteid’ is the primary key, tablename records the name of the table from which the deletion occurred, as a varchar, and deletedid is the primary key of the row that has been deleted. This can then be updated using a single process model.

Updating the Tables

The recommended way to update the history table is to use database triggers. It is possible to write triggers for each table (two are needed per table, one for update and one for create) which automatically populate the trail tables after an insert on the table (the create event) and after each update on the table. The trigger should check that the data has in fact changed before writing the event to the history table. This is achieved by using ‘NEW’ and ‘OLD’, as you can see in the customer_update trigger example below. 

The following code demonstrates how to create an insert trigger on the customer table example:

CREATE TRIGGER `customer_insert`
AFTER INSERT ON `customer`
FOR EACH ROW

INSERT INTO audcustomer (
	`customerid`,
	`firstname`,
	`lastname`,
	`town`,
	`country`,
	`capturedon`,
	`capturedby`
	)
VALUES (
	NEW.`customerid`,
	NEW.`firstname`,
	NEW.`lastname`,
	NEW.`town`,
	NEW.`country`,
	NEW.`capturedon`,
	NEW.`capturedby`
	);

The code for an update trigger would be as follows:

CREATE TRIGGER `customer_update` AFTER UPDATE ON `customer`
 FOR EACH ROW INSERT INTO `audcustomer` (
	`customerid`,
	`firstname`,
	`lastname`,
	`town`,
	`country`,
	`capturedon`,
	`capturedby`
	)
SELECT
	NEW.`customerid`,
	NEW.`firstname`,
	NEW.`lastname`,
	NEW.`town`,
	NEW.`country`,
	NEW.`capturedon`,
	NEW.`capturedby`
FROM DUAL
WHERE NOT (
		            OLD.`firstname` <=> NEW.`firstname`
			AND OLD.`lastname` <=> NEW.`lastname`
			AND OLD.`town` <=> NEW.`town`
			AND OLD.`country` <=> NEW.`country`
		)

We recommend using triggers, and we do not recommend using process models. Using process models for trail changes creates many more nodes in standard processes which are not necessary. Furthermore, using process models does not guarantee the sequence of events written to the audit table. If the processes are running concurrently and executed on different exec engines, the sequence of audit events will not always be linearizable.

Performance Considerations

The implementation of audit trail makes heavy demands on your database, as every write operation also has a corresponding trigger which must run a second write. Furthermore, an audit trail table contains every copy of every row in the corresponding standard table, and hence will be many times larger than a standard database table. 

As a result of this, an audit trail should be used on an application only when it is absolutely necessary to meet your requirements. If another method will work, such as milestone tracking, we recommend avoiding a full audit trail. If an audit trail is necessary, then only trail the tables and changes that you need.  

Do not use Record Type Sync Data for the trail tables. The volume of data could hit the limit of the Sync Data capabilities and require a redesign of this part of the application in the future as data continues to grow.

When estimating the size of the database, you must consider the audit trail tables. Make an estimate of the number of updates that are likely to occur based on likely user behavior and business needs. Then estimate the size of both the standard tables and the trail tables after 1, 2, and 5 years. An archiving policy is likely to be needed to manage this sooner for an application with audit trail tables than one without. 

The size of the trail tables will mean that use of the tables should be minimized too. Only use trail tables when it is necessary to. Users should have to take specific action, such as clicking an icon or link, in order to view audit trail data and it should not be automatically displayed on an interface if a user may not need it.

Common Pitfalls

  • Define the scope – which fields in which tables need to be trailed and displayed. Otherwise maintenance becomes a drag on all development work
  • Trail tables could be substantially bigger that the main application tables, so use of them should be limited and they must be prepopulated with data for load testing
  • Eventually an archiving or deletion policy will be needed
  • Every trailed change may represent a legal or regulatory risk, as every change may be scrutinized
  • For a complex record, don’t display all trailed data from all child tables at once. This risks performance problems