Server Monitoring Checklist

Each monitoring action in this checklist requires one or more recipients or actors.

  • Each warning should have recipients who are interested in receiving the flag (such as the system administrator and project manager).
  • Each alert must have at least one actor who is responsible for taking remedial action. Additionally, each alert may have passive recipients who are interested in receiving the flag but do not take remedial action themselves.
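
The recipient and actor rules above can be checked mechanically before a monitoring setup goes live. The following is a minimal sketch, not an Appian feature: the Notification class, its field names, and the routing_problems function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Notification:
    """Routing for one monitoring action (all names here are hypothetical)."""
    name: str
    severity: str  # "warning" or "alert"
    actors: list[str] = field(default_factory=list)      # take remedial action
    recipients: list[str] = field(default_factory=list)  # are informed only


def routing_problems(notifications):
    """Return human-readable problems found in the routing table."""
    problems = []
    for n in notifications:
        if n.severity not in ("warning", "alert"):
            problems.append(f"{n.name}: unknown severity {n.severity!r}")
        elif n.severity == "alert" and not n.actors:
            # Every alert needs at least one responsible actor.
            problems.append(f"{n.name}: alert has no responsible actor")
        elif n.severity == "warning" and not n.recipients:
            # Every warning should reach at least one interested recipient.
            problems.append(f"{n.name}: warning has no recipients")
    return problems
```

An empty list from routing_problems means every alert has an actor and every warning has a recipient.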

For monitoring applications deployed to Appian, see Application Monitoring.

Engine Server Monitoring Checklist

  • 70% or more average CPU utilization over the last 15 minutes raises an alert.
  • 70% or more utilization of available physical RAM raises an alert.
  • An additional alert is raised for each 5% increase of RAM above 70%.
  • 70% or more utilization of available HDD space raises an alert.
  • An additional alert is raised for each 5% increase of HDD space above 70%.
  • A 10% increase in HDD space utilization overnight raises a warning.
  • Any rollback, write-to-disk, or transaction log file created in the last 30 minutes raises an alert.
  • Any expected process that is not running or is continuously restarting raises an alert. The expected number of processes matches the number of <engine> entries in appian-topology.xml (15 engines by default).
  • The engine monitor Java process not running raises a warning.

See Engine Monitoring Recommendations for more information.
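
The escalating RAM and HDD thresholds in the engine server checklist can be expressed as a small calculation. This is an illustrative sketch only: the function names are hypothetical, and it assumes the overnight "10% increase" is measured in percentage points of utilization, which the checklist does not state explicitly.

```python
def alerts_due(utilization_pct, base_pct=70.0, step_pct=5.0):
    """Return the thresholds (in percent) that the current utilization has reached.

    The first entry is the base alert; each later entry is one additional
    alert for a further step above the base.
    """
    if utilization_pct < base_pct:
        return []
    extra_steps = int((utilization_pct - base_pct) // step_pct)
    return [base_pct + i * step_pct for i in range(extra_steps + 1)]


def overnight_growth_warning(yesterday_pct, today_pct, limit_points=10.0):
    """True when utilization rose by limit_points or more (assumed: percentage points)."""
    return (today_pct - yesterday_pct) >= limit_points
```

The same function covers the other server types by changing the defaults, for example alerts_due(heap_pct, base_pct=70.0, step_pct=10.0) for Java Heap Space.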

Application Server Monitoring Checklist

  • 70% or more average CPU utilization over the last 15 minutes raises an alert.
  • 80% or more utilization of available physical RAM raises an alert.
  • 80% or more utilization of available HDD space raises an alert.
  • An additional alert is raised for each 5% increase of HDD space above 80%.
  • 70% or more utilization of available Java Heap Space raises an alert.
  • An additional alert is raised for each 10% increase of Java Heap Space above 70%.
  • The application server Java process not running raises an alert.
  • 10 or more ERROR messages in the application server log in the last 30 minutes raises an alert.

See Application Server Monitoring Recommendations for more information.
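
The ERROR-count rule above could be checked with a script along the following lines. This is a sketch under a stated assumption: it expects each log entry to start with a timestamp followed by the level, such as "2024-05-01 11:45:00,000 ERROR ...". That line shape is hypothetical, so adjust the pattern to match the actual application server log format.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape (hypothetical): "2024-05-01 11:45:00,000 ERROR message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:[.,]\d+)?\s+(\w+)\b"
)


def error_alert(lines, now, window=timedelta(minutes=30), limit=10):
    """True when `limit` or more ERROR entries fall in the window ending at `now`."""
    cutoff = now - window
    count = 0
    for line in lines:
        match = LINE_RE.match(line)
        if not match or match.group(2) != "ERROR":
            continue  # other level, or a continuation line such as a stack trace
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if cutoff <= stamp <= now:
            count += 1
    return count >= limit
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check easy to test against saved log files.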

Search Server Monitoring Checklist

  • 70% or more average CPU utilization over the last 15 minutes raises an alert.
  • 80% or more utilization of available physical RAM raises an alert.
  • 80% or more utilization of available HDD space raises an alert.
  • An additional alert is raised for each 5% increase of HDD space above 80%.
  • 70% or more utilization of available Java Heap Space raises an alert.
  • An additional alert is raised for each 10% increase of Java Heap Space above 70%.
  • The search server Java process not running raises an alert.

See Search Server Monitoring Recommendations for more information.
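
The CPU rule used throughout these checklists (70% or more average utilization over the last 15 minutes) implies a sliding window over periodic samples. A minimal sketch follows. The CpuWindow class is a hypothetical helper, and it assumes samples are added in chronological order by whatever tool collects them.

```python
from collections import deque
from datetime import datetime, timedelta


class CpuWindow:
    """Keeps CPU samples for a sliding time window and reports their mean."""

    def __init__(self, window=timedelta(minutes=15), threshold_pct=70.0):
        self.window = window
        self.threshold_pct = threshold_pct
        self._samples = deque()  # (timestamp, cpu_pct), oldest first

    def add(self, timestamp, cpu_pct):
        """Record a sample and drop any that are older than the window."""
        self._samples.append((timestamp, cpu_pct))
        cutoff = timestamp - self.window
        while self._samples and self._samples[0][0] < cutoff:
            self._samples.popleft()

    def average(self):
        """Mean CPU percentage of the samples currently in the window."""
        if not self._samples:
            return 0.0
        return sum(pct for _, pct in self._samples) / len(self._samples)

    def alert(self):
        """True when the windowed average is at or above the threshold."""
        return self.average() >= self.threshold_pct
```

Averaging over the window, rather than alerting on a single sample, avoids raising alerts for short CPU spikes.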

Web Server Monitoring Checklist

  • 70% or more average CPU utilization over the last 15 minutes raises an alert.
  • 80% or more utilization of available physical RAM raises an alert.
  • 80% or more utilization of available HDD space raises an alert.
  • An additional alert is raised for each 5% increase of HDD space above 80%.
  • The web server process not running raises an alert.
  • 10 or more 404 and 503 error codes in the access log in the last 30 minutes raises an alert.

See Web Server Monitoring Recommendations for more information.
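
The access log rule above can be sketched as follows. Two assumptions are made here that the checklist does not confirm: the access log uses the Common Log Format (timestamp in square brackets, then the quoted request, then the status code), and 404 and 503 responses are counted together toward the limit of 10. The function name is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed Common Log Format, for example:
# 10.0.0.1 - - [01/May/2024:11:50:00 +0000] "GET /x HTTP/1.1" 404 153
CLF_RE = re.compile(r'\[([^\]]+)\] "[^"]*" (\d{3})\b')


def http_error_alert(lines, now=None, window=timedelta(minutes=30),
                     limit=10, codes=("404", "503")):
    """True when `limit` or more matching status codes fall in the window."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - window
    count = 0
    for line in lines:
        match = CLF_RE.search(line)
        if not match or match.group(2) not in codes:
            continue
        # %b expects English month abbreviations under the default C locale.
        stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        if cutoff <= stamp <= now:
            count += 1
    return count >= limit
```

Because the parsed timestamps carry a UTC offset, any `now` value passed in must also be timezone-aware.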

Data Service Monitoring Checklist

  • 70% or more average CPU utilization over the last 15 minutes raises an alert.
  • 70% or more utilization of available physical RAM raises an alert.
  • An additional alert is raised for each 5% increase of RAM above 70%.
  • 70% or more utilization of available HDD space raises an alert.
  • An additional alert is raised for each 5% increase of HDD space above 70%.
  • The health script not returning healthy flags for the cluster and the nodes raises an alert.

External Component Monitoring

For systems external to Appian (such as databases and web services), refer to the vendor documentation for recommended monitoring procedures. We also recommend reviewing the vendor-specific monitoring recommendations for your web server and application servers.