KB-2039 500 Internal Server Errors displayed when accessing some processes with a single execution engine running at 100% CPU

This issue has been resolved in an Appian hotfix/new Appian version. Please apply the latest hotfix to your Appian installation or upgrade to the latest version of Appian.

Symptoms

When trying to initiate some processes in an environment, a 500 - Internal Server Error is displayed and the following errors are seen in the application server log:

ERROR com.appiancorp.process.engine.StartProcessByEventRequest - Could not start process model [draftId=<ID>, version=<version>, userContext=<user>]
com.appian.komodo.api.exceptions.SignalException: domain
ERROR com.appiancorp.common.config.ConfigObject - An error occurred while trying to initialize the config LoadExceptionHandling [0ms] [resources 16ms] com.appian.komodo.api.exceptions.SignalException: domain at 

When checking CPU usage on the server hosting the Appian engines, a single execution engine is found to be running at 100% CPU. Steps to verify this are listed below:

Linux

  1. Run the top command on the server, and then press c, shift+p. This will sort processes by CPU usage.
  2. Under the CPU column, look for an execution engine process running at 100% CPU.
    1. You can identify an engine process if it contains the string <APPIAN_HOME>/server/_bin/k/linux64/k <APPIAN_HOME>/server/_lib/adb
    2. You can identify the specific engine as an execution engine by looking at the end part of the string. For example, <APPIAN_HOME>/server/process/exec/02/gw1 would indicate that the engine is execution02.
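
As a non-interactive alternative to top, the check above can be sketched in shell. This is only an illustration: it assumes a procps-style ps (for the --sort option) and that engine command lines contain a /server/process/exec/NN/ segment as in the example above. The function names engine_from_cmdline and list_execution_engines_by_cpu are hypothetical and not part of Appian.

```shell
#!/bin/sh
# Sketch: map an engine command line to its execution engine name.
# Assumes the /server/process/exec/NN/ path segment shown in the example above.
engine_from_cmdline() {
    # Prints "executionNN" when the segment is present, otherwise prints nothing.
    printf '%s\n' "$1" |
        sed -n 's|.*/server/process/exec/\([0-9][0-9]*\)/.*|execution\1|p'
}

# List execution engine processes, highest CPU usage first.
# --sort is a procps (Linux) extension. The [/] in the pattern keeps grep
# from matching its own entry in the process list.
list_execution_engines_by_cpu() {
    ps -e -o pid= -o pcpu= -o args= --sort=-pcpu |
        grep '[/]server/process/exec/'
}
```

Each line printed by list_execution_engines_by_cpu starts with the PID and CPU percentage, followed by the full command line, which can be passed to engine_from_cmdline to name the engine.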

Windows

  1. Open up Task Manager and navigate to the "Processes" tab.
  2. Right click on the "Name" column header, and select the option to add the "Command line" column.
  3. Under the CPU column, look for an execution engine process running at 100% CPU (you will likely need to expand the "Command line" column in order to see the full string).
    1. You can identify an engine process if it contains the string <APPIAN_HOME>/server/_bin/k/linux64/k <APPIAN_HOME>/server/_lib/adb
    2. You can identify the specific engine as an execution engine by looking at the end part of the string. For example, <APPIAN_HOME>/server/process/exec/02/gw1 would indicate that the engine is execution02.

Cause

This issue occurs when one of the below engine performance logs reaches 10 megabytes in size:

  • execution_by_category_*
  • top_models_by_time_*
  • top_processes_by_time_*

These logs can be found in the <APPIAN_HOME>/logs/perflogs directory and roll over whenever an execution engine is restarted. As a result, this behavior is only expected to occur when an execution engine has been running continuously for approximately 60 days or more.
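
To check how close an environment is to this condition, the log sizes can be inspected with a short shell sketch like the one below. It assumes that "10 megabytes" means 10485760 bytes and that find supports -maxdepth (GNU and BSD find do). The function name find_large_perflogs is hypothetical.

```shell
#!/bin/sh
# Sketch: list engine performance logs at or above a size threshold.
# $1 = perflogs directory, $2 = threshold in bytes (default 10485760, an
# assumed reading of the article's "10 megabytes").
find_large_perflogs() {
    dir=$1
    threshold=${2:-10485760}
    # "-size +Nc" matches files strictly larger than N bytes, so subtract 1
    # to also report files of exactly the threshold size.
    find "$dir" -maxdepth 1 -type f \
        \( -name 'execution_by_category_*' \
           -o -name 'top_models_by_time_*' \
           -o -name 'top_processes_by_time_*' \) \
        -size +"$((threshold - 1))"c
}
```

For example, calling find_large_perflogs with the path to <APPIAN_HOME>/logs/perflogs prints any of the three log types that have reached the default threshold. Passing a lower second argument gives earlier warning.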

These logs can drive an execution engine to 100% CPU utilization because of how certain values in them are calculated. The transient tables within the execution engine that populate these values grow too large to scale any further, which causes the engine to lock while trying to process the data. This issue has been addressed via AN-146311 in the following hotfixes/versions:

Action

Apply the latest hotfix to your Appian installation or upgrade to the latest version of Appian.

Workaround

Cloud

Open a case with Appian Support and note in the case description that you are experiencing behavior in line with this article.

On-Premise

This issue can be resolved by restarting the environment. To avoid downtime, move the log files listed above to a directory other than <APPIAN_HOME>/logs/perflogs and restart only the affected engine. A single execution engine can be restarted with the following procedure:

  1. From <APPIAN_HOME>/services/bin, execute ./stop.sh (.bat) -p <PASSWORD> -s executionXX. Wait for the engine to stop.
  2. From <APPIAN_HOME>/services/bin, execute ./start.sh (.bat) -p <PASSWORD> -s executionXX.
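
On Linux, the log move that precedes the restart can be sketched as a small shell function. This is a minimal illustration, not an Appian-provided script: the function name move_perflogs and the archive directory used in the commented example are hypothetical, and the stop and start commands are the ones listed in the steps above.

```shell
#!/bin/sh
# Sketch: move the three engine performance log types out of the perflogs
# directory. $1 = perflogs directory, $2 = destination directory.
move_perflogs() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in "$src"/execution_by_category_* \
             "$src"/top_models_by_time_* \
             "$src"/top_processes_by_time_*; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        mv -- "$f" "$dest"/ || return 1
    done
    return 0
}

# Example usage (commented out; the archive path and engine number are
# placeholders chosen for illustration):
# move_perflogs "<APPIAN_HOME>/logs/perflogs" "<APPIAN_HOME>/logs/perflogs_archive"
# cd "<APPIAN_HOME>/services/bin" &&
#     ./stop.sh -p "<PASSWORD>" -s execution02 &&
#     ./start.sh -p "<PASSWORD>" -s execution02
```

Other files in the perflogs directory are left in place, since only the three log types named in the Cause section are moved.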

If the above steps fail or take an extended amount of time, follow the procedures below to forcibly kill and restart the engine:

Linux

  1. Run the top command on the server, and then press c, shift+p. This will sort processes by CPU usage.
  2. Under the CPU column, look for the execution engine process running at 100% CPU. The PID of this process will be listed in the leftmost column of the top output.
  3. You can use the following command to forcibly kill the engine: kill -9 <PID>
  4. From <APPIAN_HOME>/services/bin, execute ./start.sh -p <PASSWORD> -s executionXX to restart the engine.
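
The PID lookup in the Linux steps above can also be scripted. The sketch below assumes ps-style input lines of the form "PID %CPU COMMAND" and the /server/process/exec/NN/ path segment described in the Symptoms section. The function name engine_pid is hypothetical, and it reports only the first matching line, so confirm the PID before killing anything.

```shell
#!/bin/sh
# Sketch: print the PID of one execution engine from "PID %CPU COMMAND"
# lines read on stdin. $1 is the engine number as it appears in the path,
# for example 02 for execution02.
engine_pid() {
    # The path is assembled inside awk so that awk's own command line never
    # contains the full segment and cannot match itself in ps output.
    awk -v n="$1" '
        BEGIN { seg = "/server/process/exec/" n "/" }
        index($0, seg) { print $1; exit }
    '
}

# Example usage (commented out because kill -9 is destructive):
# pid=$(ps -e -o pid= -o pcpu= -o args= | engine_pid 02)
# [ -n "$pid" ] && kill -9 "$pid"
```

If no line matches, engine_pid prints nothing, so the guarded kill in the example is skipped.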

Windows

  1. Open up Task Manager and navigate to the "Processes" tab.
  2. Right click on the "Name" column header, and select the option to add the "PID" and "Command line" columns.
  3. Under the CPU column, look for the execution engine process running at 100% CPU. The PID of this process will be listed in the newly selected PID column.
  4. You can force kill the engine by right clicking the engine process in Task Manager and selecting the "End task" option.
  5. From <APPIAN_HOME>/services/bin, execute start.bat -p <PASSWORD> -s executionXX to restart the engine.

Affected Versions

This article applies to Appian 18.3 to 19.4.

Last Reviewed: June 2020
