Issue upgrading Appian from v22.4 to v24.3

Certified Lead Developer
Hi all,
I am writing with regard to an issue that we found while upgrading from Appian 22.4 to Appian 24.3 in both the Dev and Test environments.

We followed the procedure described in the documentation and found no issues during the upgrade process. However, when starting the Appian engines, we noticed the 3 analytics engines and the 3 execution engines were in CRASHLOOPBACKOFF state. The other engines were in RUNNING state as expected.
We resolved the issue by running the recovery script with these parameters:
./recovery.sh -p <password> -s <failing_engines> -ni
After that, the engines started correctly, and all the servers started up fine as well. However, the existing process instances had disappeared from the Appian design view.
Is there any way to refine the parameters used in the recovery script, or is there another solution to recover engines in the CRASHLOOPBACKOFF state? As the same issue has happened in both the Dev and Test environments, we are concerned that it could happen again in production, where we don't want to lose running instances.
Kind regards,

Jesus


  • The CRASHLOOPBACKOFF state usually occurs when there is a problem with the synchronization of the engines. In that situation, the only way to recover the system is the recovery script, and you will always lose information.

    As a recommendation, always back up your instances (and, if possible, the database), and stop the servers and engines in the established order (tomcat, search-server, data-server, services).

  • 0
    Certified Senior Developer

    Hi,

    As mentioned in the issue, it seems the process data files were not copied correctly (/server/process/analytics/0000/gw1/  .../server/process/analytics/0002/gw1/).

    Please copy those files again; it should then work.

    Regards,

    Shrikant Pol

  • 0
    Certified Lead Developer

    Thanks for your answers. We use a script that copies the folders from the backup, and it has been used in multiple upgrades without problems. The cp command did not return any failure.

    We also stop Appian in the order indicated in the documentation: tomcat, search-server, data-server, services.
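
A successful exit code from cp shows that the copy ran, but not that every file in the target matches the backup. As a minimal sketch of one way to double-check this (not an Appian tool; the function names and the paths in the usage note are hypothetical), the following Python compares SHA-256 digests of every file under a backup folder with its counterpart in the target folder:

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source: Path, target: Path) -> dict:
    """Compare every file under `source` with the same relative path under `target`.

    Returns the relative paths that are missing from `target` and those whose
    content differs. Files that exist only in `target` are not reported.
    """
    missing, different = [], []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src_file.relative_to(source)
        dst_file = target / rel
        if not dst_file.is_file():
            missing.append(rel.as_posix())
        elif file_digest(src_file) != file_digest(dst_file):
            different.append(rel.as_posix())
    return {"missing": missing, "different": different}
```

It could be called as `compare_trees(Path("/path/to/backup"), Path("/path/to/appian/install"))`, with both placeholder paths replaced by the real locations, while Appian is stopped so that the files are not changing. Empty `missing` and `different` lists would rule out an incomplete copy, though not a backup that was itself taken at the wrong moment.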

  • 0
    Certified Lead Developer
    in reply to jesusa310

    Hi all again,

    I am attaching the service manager log with the transaction exception that was added for one of the engines. Other failing engines show a similar error. Is it possible to recover the engines knowing the transaction id without losing running processes? Or at least minimising the number of instances lost?

    2024-11-25 13:47:50,920 [KafkaTransactionLogService [execution00]-execution00-REPLICA STARTING] INFO  com.appian.komodo.log.KafkaTransactionLogService - execution00 REPLICA transaction log state calculated in 2 seconds: TransactionLogState{appendedCount=14879, firstTransactionId=Optional[8050800], lastTransactionId=Optional[8065678]}
    2024-11-25 13:47:50,921 [shared-cached-threadpool-10] ERROR com.appian.komodo.translog.TransactionReplayer - Unable to find transaction 1 in the transaction log for execution00.  Please verify that an old kdb was not used.
    2024-11-25 13:47:50,921 [shared-cached-threadpool-10] ERROR com.appian.komodo.translog.TransactionReplayer - Transaction replay failed for execution00
    com.appian.komodo.translog.TransactionNotFoundException: null
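
To check the other failing engines in the same way, the two relevant log lines can be paired up automatically. The following is a minimal Python sketch, assuming the service manager log always uses the line formats shown above (the regular expressions and the function name are my own, not part of Appian). For each engine it compares the transaction the replayer asked for with the range that the transaction log reports:

```python
import re

# Matches the "transaction log state calculated" line and captures the engine
# name plus the first and last transaction ids held in the log.
STATE_RE = re.compile(
    r"\b(?P<engine>\w+) \w+ transaction log state calculated.*?"
    r"firstTransactionId=Optional\[(?P<first>\d+)\], "
    r"lastTransactionId=Optional\[(?P<last>\d+)\]"
)

# Matches the replayer error and captures the requested transaction id.
MISSING_RE = re.compile(
    r"Unable to find transaction (?P<txn>\d+) "
    r"in the transaction log for (?P<engine>\w+)"
)


def find_replay_gaps(log_lines):
    """Return one record per 'Unable to find transaction' error.

    Each record holds the engine, the requested transaction id, the range the
    transaction log reported for that engine (None if not seen earlier in the
    input), and whether the requested id falls inside that range.
    """
    ranges = {}
    gaps = []
    for line in log_lines:
        state = STATE_RE.search(line)
        if state:
            ranges[state["engine"]] = (int(state["first"]), int(state["last"]))
            continue
        miss = MISSING_RE.search(line)
        if miss:
            first, last = ranges.get(miss["engine"], (None, None))
            txn = int(miss["txn"])
            gaps.append({
                "engine": miss["engine"],
                "requested": txn,
                "first": first,
                "last": last,
                "in_range": first is not None and first <= txn <= last,
            })
    return gaps
```

For the excerpt above, this reports that execution00 requested transaction 1 while the transaction log holds 8050800 through 8065678, so the requested id is outside the range. That matches the hint in the error message about an old kdb possibly having been used, although the log by itself does not confirm the cause.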

  • 0
    Certified Lead Developer
    in reply to jesusa310

    Check the folders, because in one of the latest upgrades we realized that we had to copy an additional folder. Maybe you are missing one.

    Check the list in the upgrade guide that corresponds to your version.