KB-1678 Engine checkpoint FAQ

This article answers common questions about Appian engine checkpointing.

Table of Contents:

How can automatic checkpointing be disabled?
Automatic checkpoints are negatively impacting site performance. Is this expected behavior?
If checkpoints are scheduled manually (to avoid automatic checkpoints during business hours) and the number of manually scheduled checkpoints is much less than the number of checkpoints that would occur with automatic checkpointing, will the time to checkpoint increase greatly?
What is the recommended interval for manually scheduled checkpoints?
Will the environment be completely unusable during the (fewer) manually scheduled checkpoints?
Would adding execution and analytics engines to the environment diminish the effects of checkpoints for users?
If adding more execution and analytics engines will not minimize user impact due to engine checkpoints, what will?

How can automatic checkpointing be disabled?

Automatic checkpoints cannot be disabled. However, they can be controlled by specifying the following properties and values in custom.properties:

serviceManager.checkpoint.automatic.boundary.time=30 hours
serviceManager.checkpoint.automatic.boundary.replay=30 hours

Configuring these settings makes it possible to avoid triggering checkpoints during business hours. Manually scheduled checkpoints (via cron job or scheduled task) can then be configured to initiate checkpoints during off-hours, as explained in the Configuring Checkpoint Frequency section of the Configuring Application Checkpointing documentation. Note: If the manually configured checkpointing mechanism fails, automatic checkpoints will still run as a backup.
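As a rough illustration, the Python sketch below reads the two boundary properties shown above and checks that a manual schedule recurs before the time boundary elapses. It assumes the time boundary is measured from the last checkpoint and that values are written as "<number> hours". The function names are hypothetical and are not part of Appian.

```python
import re

# Property lines copied from this article; in practice they live in custom.properties.
CUSTOM_PROPERTIES = """\
serviceManager.checkpoint.automatic.boundary.time=30 hours
serviceManager.checkpoint.automatic.boundary.replay=30 hours
"""

PREFIX = "serviceManager.checkpoint.automatic.boundary."


def boundary_hours(properties_text):
    """Return the boundary values in hours, keyed by 'time' and 'replay'.

    Assumption: each value is written as '<number> hours', as in this article.
    """
    boundaries = {}
    for line in properties_text.splitlines():
        line = line.strip()
        if not line.startswith(PREFIX):
            continue  # skip unrelated properties, comments, and blank lines
        key, _, value = line.partition("=")
        match = re.fullmatch(r"(\d+)\s*hours?", value.strip())
        if match is None:
            raise ValueError(f"unexpected value format: {line!r}")
        boundaries[key.strip()[len(PREFIX):]] = int(match.group(1))
    return boundaries


def manual_schedule_stays_ahead(manual_interval_hours, properties_text):
    """True if manual checkpoints recur before the time boundary elapses.

    Assumption: the time boundary counts from the last checkpoint.
    """
    return manual_interval_hours < boundary_hours(properties_text)["time"]
```

Under these assumptions, a nightly manual checkpoint (every 24 hours) stays ahead of a 30-hour time boundary, so the automatic checkpoint would act only as a fallback if the nightly job failed.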

Automatic checkpoints are negatively impacting site performance. Is this expected behavior?

When automatic checkpoints for the Appian engines run during business hours, they may temporarily reduce system performance in self-managed, non-HA Appian environments. While an engine performs a checkpoint, it cannot respond to any further requests until the checkpoint is complete. Therefore, if there are no other replicas of that engine, it is temporarily unresponsive, which may cause user requests to be delayed or to time out.

If checkpoints are scheduled manually (to avoid automatic checkpoints during business hours) and the number of manually scheduled checkpoints is much less than the number of checkpoints that would occur with automatic checkpointing, will the time to checkpoint increase greatly?

Not necessarily. The time an engine takes to checkpoint is determined more by the total size of the engine than by the number of transactions that have occurred since the last checkpoint.

What is the recommended interval for manually scheduled checkpoints?

The recommended interval for manually scheduled checkpoints varies with individual business requirements. Failing to checkpoint the engines regularly introduces the risk of longer transaction replay time in the event of engine failure. More frequent checkpoints limit how long transactions may take to replay in that case. In general, it is good practice to checkpoint at least once every 12 hours.
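To compare a planned schedule against the 12-hour guideline, the largest gap between consecutive checkpoints can be computed. The Python sketch below assumes the same clock times repeat every day. The helper names are hypothetical, and the sketch does not invoke any Appian tooling.

```python
from datetime import time

GUIDELINE_HOURS = 12  # "at least once every 12 hours", from this article


def largest_gap_hours(daily_times):
    """Largest gap, in hours, between consecutive checkpoints.

    Assumption: the same list of clock times repeats every day.
    """
    if not daily_times:
        raise ValueError("schedule needs at least one checkpoint time")
    minutes = sorted(t.hour * 60 + t.minute for t in daily_times)
    # Append the first checkpoint of the next day so the overnight gap is counted.
    minutes.append(minutes[0] + 24 * 60)
    gaps = [later - earlier for earlier, later in zip(minutes, minutes[1:])]
    return max(gaps) / 60


def meets_guideline(daily_times):
    """True if no gap between checkpoints exceeds the 12-hour guideline."""
    return largest_gap_hours(daily_times) <= GUIDELINE_HOURS
```

For example, a single checkpoint at 01:00 leaves a 24-hour gap and misses the guideline, while checkpoints at 01:00 and 13:00 leave gaps of exactly 12 hours and meet it. A schedule that avoids business hours entirely, such as 06:00 and 20:00, leaves a 14-hour daytime gap, so the guideline has to be weighed against the business requirement.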

Will the environment be completely unusable during the (fewer) manually scheduled checkpoints?

The environment should not be completely unusable. Service Manager limits how many requested checkpoints run simultaneously. Only user actions that require an engine that is currently checkpointing are affected.

Would adding execution and analytics engines to the environment diminish the effects of checkpoints for users?

Adding more execution and analytics engines is unlikely to help in this case. If the failing user actions or requests gather information about all processes, additional engines will not help, because any one engine checkpointing still impacts those actions. However, if a user action requests data from a specific engine, having more engines may help slightly, simply because it is less likely that the requested data is stored in the engine that is currently checkpointing.
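The reasoning above can be made concrete with a small calculation. The Python sketch below is a simplified model, not a measurement. It assumes exactly one engine is checkpointing at the moment of the request and that data is spread evenly across engines. The function name is hypothetical.

```python
from fractions import Fraction


def chance_request_is_affected(engine_count, needs_all_engines):
    """Chance that a request touches the one engine that is checkpointing.

    Assumptions: exactly one of engine_count engines is checkpointing,
    and requested data is spread evenly across the engines.
    """
    if engine_count < 1:
        raise ValueError("engine_count must be at least 1")
    if needs_all_engines:
        # A request spanning all engines always includes the checkpointing one.
        return Fraction(1)
    # A request for one engine's data hits the checkpointing engine 1 time in N.
    return Fraction(1, engine_count)
```

Under this model, going from 3 to 6 engines lowers the chance for a single-engine request from 1/3 to 1/6, while a request that spans all engines is affected regardless of the engine count.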

If adding more execution and analytics engines will not minimize user impact due to engine checkpoints, what will?

A high availability configuration should help to minimize the impact of any engine checkpoint. In a high availability configuration, there are at least three instances of each Appian engine. As a result, if one instance of an engine is checkpointing, the remaining instances of that engine are available to serve user requests.

Affected Versions

This article applies to Appian 17.3 and later.

Last Reviewed: August 2018