When I start the docker containers, all the containers change to "healthy" once they are up and running, except for the data-server container. This is preventing the web-application container from starting successfully. I've attached the logs of the data-server container. There is no error thrown by the data-server immediately. It sits idle for about 15 minutes. Then the status changes to "unhealthy". Any ideas how to troubleshoot this?
In the screenshot I started the rdbms, search-server, service-manager, and data-server containers in that order.
Appian: 19.3
$ docker-compose up data-server Starting base ... done service-manager is up-to-date Starting data-server ... done Attaching to data-server data-server | Waiting for these containers to start: service-manager data-server | Currently waiting for service-manager data-server | service-manager is up! data-server | Executing start command: /usr/local/appian/ae/data-server/bin/start.sh -f data-server | 0 [main] INFO com.appian.data.TopologyFactory - Loading topology /usr/local/appian/ae/data-server/conf/appian-topology.xml data-server | 451 [main] INFO com.appian.data.server.config.CustomPropertiesConfiguration - Configuration file custom.properties was loaded successfully. (Reload check = 10000 ms) data-server | 478 [main] INFO com.appian.data.server.Watchdog - Creating directory '/usr/local/appian/ae/data-server/data' data-server | 484 [main] INFO com.appian.data.server.Watchdog - Checking permissions on directory '/usr/local/appian/ae/data-server/data' data-server | 486 [main] INFO com.appian.data.server.Watchdog - Creating directory '/usr/local/appian/ae/data-server/node' data-server | 488 [main] INFO com.appian.data.server.Watchdog - Checking permissions on directory '/usr/local/appian/ae/data-server/node' data-server | 488 [main] INFO com.appian.data.server.Watchdog - Creating directory '/usr/local/appian/ae/data-server/logs' data-server | 488 [main] INFO com.appian.data.server.Watchdog - Checking permissions on directory '/usr/local/appian/ae/data-server/logs' data-server | 490 [main] INFO com.appian.data.server.Watchdog - Creating directory '/usr/local/appian/ae/data-server/data-metrics' data-server | 490 [main] INFO com.appian.data.server.Watchdog - Checking permissions on directory '/usr/local/appian/ae/data-server/data-metrics' data-server | 495 [main] INFO com.appian.data.server.Watchdog - Initializing log4j (processName=watchdog, logsDir=/usr/local/appian/ae/data-server/logs) from /usr/local/appian/ae/data-server/conf/log4j.properties data-server | {"level":"INFO","time":"2019-10-14T22:15:11.567Z","logger":"watchdog","caller":"Watchdog.java:1007","thread":"main","msg":"Starting Data Server. Build-Version[0.148.4], Build-Date[2019-08-16], Target-Version[2019-04-21]"} data-server | {"level":"INFO","time":"2019-10-14T22:15:12.099Z","logger":"watchdog","caller":"Logging.java:53","thread":"main","msg":"Total memory: 81264640, Max memory: 119537664, Free memory: 61261104, Num processors: 2"} data-server | {"level":"INFO","time":"2019-10-14T22:15:12.101Z","logger":"watchdog","caller":"Logging.java:56","thread":"main","msg":"JVM version: 1.8.0_222, JVM arguments: -Duser.timezone=GMT -Dds.home=/usr/local/appian/ae/data-server -Xms64m -Xmx128m, JVM current dir: /usr/local/appian/ae"} data-server | {"level":"INFO","time":"2019-10-14T22:15:12.238Z","logger":"watchdog","caller":"VertxServer.java:105","thread":"vert.x-eventloop-thread-3","msg":"Listening on data-server:5400..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:12.241Z","logger":"watchdog","caller":"VertxServer.java:105","thread":"vert.x-eventloop-thread-0","msg":"Listening on data-server:6400..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.246Z","logger":"watchdog","caller":"MigrationEngine.java:122","thread":"main","msg":"Version on disk (2019-04-21) is verified. No migration required..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.255Z","logger":"watchdog","caller":"TopicManager.java:181","thread":"main","msg":"Topic ads_transaction_effects already exists"} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.255Z","logger":"watchdog","caller":"TopicManager.java:281","thread":"main","msg":"Ensuring topic "ads_transaction_effects" has the correct configuration"} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.379Z","logger":"watchdog","caller":"TopicManager.java:298","thread":"main","msg":"Topic "ads_transaction_effects" has the correct configuration"} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.379Z","logger":"watchdog","caller":"TopicManager.java:168","thread":"main","msg":"Active brokers: [engines:9092 (id: 0 rack: null)]"} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.383Z","logger":"watchdog","caller":"TopicManager.java:342","thread":"main","msg":"Topic ads_transaction_effects has desired assigned replicas: [engines:9092 (id: 0 rack: null)]"} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.445Z","logger":"watchdog","caller":"ScriptUtils.java:64","thread":"main","msg":"Killing data server component 'appender'"} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.503Z","logger":"watchdog","caller":"ScriptUtils.java:64","thread":"main","msg":"Killing data server component 'hs'"} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.508Z","logger":"watchdog","caller":"AppenderStarter.java:39","thread":"process-starter-0","msg":"Starting appender engine (port=5405)..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.510Z","logger":"watchdog","caller":"QStarter.java:53","thread":"process-starter-0","msg":"Starting Q process: Config{script=appender.q, hostAndPort=data-server:5405, args=[]}"} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.517Z","logger":"watchdog","caller":"ScriptUtils.java:64","thread":"main","msg":"Killing data server component 'rts:0'"} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.523Z","logger":"watchdog","caller":"HistoricalStoreStarter.java:38","thread":"process-starter-1","msg":"Starting HS engine (port=5403)..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.524Z","logger":"watchdog","caller":"QStarter.java:53","thread":"process-starter-1","msg":"Starting Q process: Config{script=hs.q, hostAndPort=data-server:5403, args=[]}"} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.593Z","logger":"watchdog","caller":"RealtimeStoreStarter.java:40","thread":"process-starter-2","msg":"Starting RTS-0 engine (port=5451)..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:13.595Z","logger":"watchdog","caller":"QStarter.java:53","thread":"process-starter-2","msg":"Starting Q process: Config{script=rts.q, hostAndPort=data-server:5451, args=[-rts, 0]}"} data-server | {"level":"INFO","time":"2019-10-14T22:15:15.166Z","logger":"watchdog","caller":"HistoricalStoreStarter.java:40","thread":"process-starter-1","msg":"Started HS engine (port=5403)"} data-server | {"level":"INFO","time":"2019-10-14T22:15:15.171Z","logger":"watchdog","caller":"HistoricalStoreStarter.java:74","thread":"process-starter-1","msg":"Starting HS gateway (port=5401)..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:15.172Z","logger":"watchdog","caller":"GoStarter.java:63","thread":"process-starter-1","msg":"Starting Go process: Config{executable=ads-hs-linux, hostAndPort=data-server:5401, args=[-ds.home, /usr/local/appian/ae/data-server, -custom.properties, /usr/local/appian/ae/data-server/conf/custom.properties, -gw.host, data-server, -gw.port, 5401, -gw.metrics.port, 6401, -hs.peers, data-server:5401, -kfk.brokers, engines:9092, -kfk.topic, ads_transaction_effects, -q.port, 5403]}"} data-server | {"level":"INFO","time":"2019-10-14T22:15:15.249Z","logger":"watchdog","caller":"AppenderStarter.java:41","thread":"process-starter-0","msg":"Started appender engine (port=5405)..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:15.251Z","logger":"watchdog","caller":"AppenderStarter.java:76","thread":"process-starter-0","msg":"Starting appender gateway (port=5404)..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:15.252Z","logger":"watchdog","caller":"GoStarter.java:63","thread":"process-starter-0","msg":"Starting Go process: Config{executable=ads-appender-linux, hostAndPort=data-server:5404, args=[-ds.home, /usr/local/appian/ae/data-server, -custom.properties, /usr/local/appian/ae/data-server/conf/custom.properties, -gw.port, 5404, -gw.host, data-server, -gw.metrics.port, 6404, -kfk.brokers, engines:9092, -kfk.topic, ads_transaction_effects, -engine.port, 5405, -hs.engine.port, 5403, -hs.gw.address, data-server:5401, -rts.gw.addresses, data-server:5450]}"} data-server | {"level":"INFO","time":"2019-10-14T22:15:15.259Z","logger":"watchdog","caller":"RealtimeStoreStarter.java:42","thread":"process-starter-2","msg":"Started RTS-0 engine (port=5451)"} data-server | {"level":"INFO","time":"2019-10-14T22:15:15.263Z","logger":"watchdog","caller":"RealtimeStoreStarter.java:77","thread":"process-starter-2","msg":"Starting RTS-0 gateway (port=5450)..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:15.264Z","logger":"watchdog","caller":"GoStarter.java:63","thread":"process-starter-2","msg":"Starting Go process: Config{executable=ads-rts-linux, hostAndPort=data-server:5450, args=[-ds.home, /usr/local/appian/ae/data-server, -custom.properties, /usr/local/appian/ae/data-server/conf/custom.properties, -gw.host, data-server, -gw.port, 5450, -gw.metrics.port, 6450, -kfk.brokers, engines:9092, -kfk.topic, ads_transaction_effects, -q.port, 5451, -rts, 0]}"} data-server | {"level":"INFO","time":"2019-10-14T22:15:16.446Z","logger":"watchdog","caller":"HistoricalStoreStarter.java:76","thread":"process-starter-1","msg":"Started HS gateway (port=5401)"} data-server | {"level":"INFO","time":"2019-10-14T22:15:16.559Z","logger":"watchdog","caller":"RealtimeStoreStarter.java:79","thread":"process-starter-2","msg":"Started RTS-0 gateway (port=5450)."} data-server | {"level":"INFO","time":"2019-10-14T22:15:17.490Z","logger":"watchdog","caller":"AppenderStarter.java:78","thread":"process-starter-0","msg":"Started appender gateway (port=5404)..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:17.496Z","logger":"watchdog","caller":"Watchdog.java:451","thread":"heal-executor","msg":"Healing is enabled. To disable, set property "wd.healInterval.seconds" in file "/usr/local/appian/ae/data-server/conf/custom.properties" to a non-positive number ..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:17.519Z","logger":"watchdog","caller":"ResourceMetricsCollection.java:78","thread":"resource-metrics-collector","msg":"Resource metrics capturing is enabled. To disable, set property "wd.resourceMetricsInterval.seconds" in file "/usr/local/appian/ae/data-server/conf/custom.properties" to a non-positive number ..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:17.521Z","logger":"watchdog","caller":"BranchCleanup.java:67","thread":"branch-cleaner","msg":"Branch cleanup is enabled. To disable, set property "bg.branchCleanupInterval.seconds" in file "/usr/local/appian/ae/data-server/conf/custom.properties" to a negative number..."} data-server | {"level":"INFO","time":"2019-10-14T22:15:17.522Z","logger":"watchdog","caller":"Watchdog.java:252","thread":"main","msg":"Ready in 5422 ms"} data-server | {"level":"INFO","time":"2019-10-14T22:15:17.523Z","logger":"watchdog","caller":"LogRotation.java:57","thread":"log-rotator","msg":"LogRotation is enabled. To disable, set property "wd.log.rotation.interval.seconds" in file "/usr/local/appian/ae/data-server/conf/custom.properties" to a negative number"} data-server | {"level":"INFO","time":"2019-10-14T22:30:36.937Z","logger":"watchdog","caller":"BranchCleanup.java:39","thread":"branch-cleaner","msg":"Starting branch cleanup..."} kdata-server | Oct 14, 2019 10:34:39 PM io.vertx.core.impl.BlockedThreadChecker data-server | WARNING: Thread Thread[vert.x-eventloop-thread-2,5,main] has been blocked for 3428 ms, time limit is 2000 ms kdata-server | {"level":"INFO","time":"2019-10-14T22:35:12.618Z","logger":"watchdog","caller":"ScriptUtils.java:64","thread":"heal-executor","msg":"Killing data server component 'appender'"} data-server | Oct 14, 2019 10:35:47 PM io.vertx.core.impl.BlockedThreadChecker data-server | WARNING: Thread Thread[vert.x-eventloop-thread-3,5,main] has been blocked for 2961 ms, time limit is 2000 ms data-server | {"level":"ERROR","time":"2019-10-14T22:36:24.862Z","logger":"watchdog","caller":"Watchdog.java:437","thread":"heal-executor","msg":"Failed to heal components... Cause: "} data-server | java.lang.IllegalStateException: Unable to kill data-server component using command: [/usr/local/appian/ae/data-server/lib/scripts/kill-component.sh, appender] data-server | at com.appian.data.server.ScriptUtils.killComponent(ScriptUtils.java:73) data-server | at com.appian.data.server.Watchdog.heal(Watchdog.java:463) data-server | at com.appian.data.server.Watchdog.access$200(Watchdog.java:99) data-server | at com.appian.data.server.Watchdog$1.onInterval(Watchdog.java:432) data-server | at com.appian.data.server.IntervalTask.safeRun(IntervalTask.java:76) data-server | at com.appian.data.server.IntervalTask.run(IntervalTask.java:62) data-server | at java.lang.Thread.run(Thread.java:748) data-server | Caused by: java.util.concurrent.TimeoutException: Timed out waiting for java.lang.UNIXProcess@647bd949 to finish, timeout: 15000 milliseconds, executed command [/usr/local/appian/ae/data-server/lib/scripts/kill-component.sh, appender] data-server | at org.zeroturnaround.exec.ProcessExecutor.newTimeoutException(ProcessExecutor.java:1079) data-server | at org.zeroturnaround.exec.ProcessExecutor.waitFor(ProcessExecutor.java:1042) data-server | at org.zeroturnaround.exec.ProcessExecutor.execute(ProcessExecutor.java:860) data-server | at com.appian.data.server.ScriptUtils.killComponent(ScriptUtils.java:71) data-server | ... 6 more data-server | {"level":"ERROR","time":"2019-10-14T22:36:50.643Z","logger":"watchdog","caller":"ResourceMetricsCollection.java:63","thread":"resource-metrics-collector","msg":"Failed to retrieve resource metrics... Cause: "} data-server | java.lang.IllegalStateException: Unable to capture resource metrics using [ps, -o, pid,%cpu,%mem,cpu,rss, -p, 106, -p, 88, -p, 73, -p, 79, -p, 96, -p, 10, -p, 107] data-server | at com.appian.data.server.ResourceMetricsCollection.getProcessMetrics(ResourceMetricsCollection.java:136) data-server | at com.appian.data.server.ResourceMetricsCollection.captureResourceMetrics(ResourceMetricsCollection.java:88) data-server | at com.appian.data.server.ResourceMetricsCollection.onInterval(ResourceMetricsCollection.java:61) data-server | at com.appian.data.server.IntervalTask.safeRun(IntervalTask.java:76) data-server | at com.appian.data.server.IntervalTask.run(IntervalTask.java:62) data-server | at java.lang.Thread.run(Thread.java:748) data-server | Caused by: java.util.concurrent.TimeoutException: Timed out waiting for java.lang.UNIXProcess@7dcd453a to finish, timeout: 15000 milliseconds, executed command [ps, -o, pid,%cpu,%mem,cpu,rss, -p, 106, -p, 88, -p, 73, -p, 79, -p, 96, -p, 10, -p, 107], output was 0 bytes: data-server | data-server | at org.zeroturnaround.exec.ProcessExecutor.newTimeoutException(ProcessExecutor.java:1079) data-server | at org.zeroturnaround.exec.ProcessExecutor.waitFor(ProcessExecutor.java:1042) data-server | at org.zeroturnaround.exec.ProcessExecutor.execute(ProcessExecutor.java:860) data-server | at com.appian.data.server.ResourceMetricsCollection.getProcessMetrics(ResourceMetricsCollection.java:132) data-server | ... 5 more
Note: I can start the web-application container even though the data-server container status never changes to "healthy". The web-application thinks the data-server has completed its startup script. Timeouts immediately occur.
Discussion posts and replies are publicly visible
I was able to solve this. The problem was I did not increase the default docker memory. The default is set to 2GB. Once I changed it to 6GB Appian started up successfully.
I seem to have run into this exact same issue, have increased to 6GB or memory and everything but the dataserver wants to start up. The dataserver doesn't switch to a healthy state. When using the WSL v2, the dataserver DOES start up correctly however the service-manager fails with a CRASHLOOPBACKOFF message which is discussed in this thread. https://community.appian.com/discussions/f/general/19355/issue-with-docker-startup/75842#75842
Is there some other HyperV or Docker setting that you changed other than just memory?
Have abandoned the WSL v2 route and disabled it however also stuck trying to use the regular HyperV option.
The docker engine (v19.03.12) is using the out of box settings...
{ "registry-mirrors": [], "insecure-registries": [], "debug": true, "experimental": false}
Any thoughts?