Hello We are using stored procedures and it's executed by auto processe

Hello
We are using stored procedures that are executed by automated processes running every few minutes. Sometimes the DB node throws the error "com.appiancorp.kougar.driver.exceptions.SafeRetryException: Unable to acquire a Write connection. Safe to retry. ". If we manually re-execute the node, it works just fine. Has anybody come across this issue? The pool size is currently set to 5000 in our appian-ds.xml. Would increasing the pool size help resolve this? Any thoughts? Is there a way to retry the DB node when this error occurs?...

OriginalPostID-72757

  • Hello Prasadh,

    I am continuing to research this topic.
    In the meantime, allow me to explain that the error in your post, "com.appiancorp.kougar.driver.exceptions.SafeRetryException: Unable to acquire a Write connection. Safe to retry." refers to a write connection failure to the Appian engine k database files, as opposed to the relational database configured by appian-ds.xml. So, I think we will need to look beyond increasing the pool size in appian-ds.xml as a solution for this particular error message.
  • As Matt mentions, this error indicates that the engines were too busy to respond in time to the request that this smart service made. Several areas of the product retry when this error occurs, but query rules, Write to Data Store nodes, and Query Database nodes do not offer retry logic for this error. That is why restarting the node works.

    In this case you may want to move the failing processes to times when the engines are neither checkpointing nor under heavy load. For instance, say you know that a lot of automated/scheduled processes run every day at 7 AM, and you then notice that the stored procedures in a different model fail around that time. The solution is to reschedule those automated processes, for example to run outside business hours.

    An enhancement request has been created to support retry logic in these nodes when the engines are too busy to respond to the request. The reference number is AN-40499.

    Eduardo Fuentes
    Appian Technical Support
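
For readers wondering what "retry logic" for a safe-to-retry error looks like in general, here is a minimal sketch. It is not Appian code and does not change how the DB nodes behave. `SafeRetryError` and `call_with_retry` are hypothetical names invented for illustration, and the attempt count and delays are arbitrary assumptions. The idea is to retry only the error that is declared safe to retry, and to wait a little longer before each new attempt so that a busy engine has time to recover.

```python
import random
import time


class SafeRetryError(Exception):
    """Hypothetical stand-in for a 'safe to retry' failure (not an Appian class)."""


def call_with_retry(operation, max_attempts=4, base_delay=0.5, max_delay=8.0,
                    sleep=time.sleep):
    """Run operation(), retrying only on SafeRetryError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except SafeRetryError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles on each attempt, capped at max_delay,
            # plus up to 10% random jitter so callers do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_write():
        # Simulated operation (assumption): fails twice, then succeeds.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise SafeRetryError("Unable to acquire a Write connection")
        return "written"

    print(call_with_retry(flaky_write, base_delay=0.01))
```

Any other exception propagates immediately, which matters because only errors explicitly marked as safe to retry can be repeated without risking a duplicate write.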
  • Thank you guys! The issue is that we run this automated process every 30 minutes, and I don't know how to find a workaround for this. Strangely, the error comes from specific nodes of the process; not all DB nodes throw this error now. Do you think creating a support ticket would help us resolve this?
  • I understand that increasing the pool size is not a long-term solution, but can we try increasing the pool size for the primary DS to see whether it helps resolve this? Thanks!
  • Unfortunately, the pool size for the relational databases does not affect the Appian engines' connection timeout. So, I do not think increasing the pool size will affect the problem at hand.

    Can you please expound on “the error is coming out of specific nodes of the process”? Are there some DB nodes that never throw this error, and some DB nodes that are likely to throw it? If so, is there anything different in the configurations between these two groups of DB nodes?
  • Sure. All the DB nodes we have in the process are Stored Proc nodes. Only those nodes are throwing the error. The Write to Data Store node also throws the error, but very intermittently.
  • Hi Eduardo - "this error indicates that the engines were busy to respond in time to the request that this smart service". Is this related in any way to stored procedure execution time? Some SPs are faster and some are a little slow because of the business logic they incorporate.
  • Please help us - is there a way for these write connections to use any kind of timeout custom properties? We are interested to know whether increasing the timeout values would help the write connections persist. Thanks a lot.
  • Hello Prasadh,
    You are welcome to create a support case, in which an engineer can analyze your db_*.log files for transactions at these busy 30-minute intervals. Given what you have described, support can confirm whether the automated process is causing the engines to be busy, and thus the "Unable to acquire a Write connection" error. To my knowledge, lengthening the write connection timeout is outside what the custom properties settings allow. It may be useful to consult professional services on how to design an automated process that runs on a regular interval.
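
As a rough illustration of the rescheduling advice given earlier in the thread, the sketch below computes the next run time for a fixed-interval job and defers it to the end of a known busy window when it would otherwise land inside one. This is plain Python, not Appian configuration. The function names are hypothetical, and the 7 AM to 8 AM window is an assumption based on the 7 AM example above; the real busy periods would need to be confirmed from the engine logs.

```python
from datetime import datetime, time, timedelta


def in_window(t, start, end):
    """True if time-of-day t is in [start, end); supports windows crossing midnight."""
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def next_run(after, interval, busy_start, busy_end):
    """Return `after + interval`, deferred to the busy window's end if it falls inside it."""
    candidate = after + interval
    if in_window(candidate.time(), busy_start, busy_end):
        # Move the run to the moment the busy window closes.
        window_end = datetime.combine(candidate.date(), busy_end)
        if window_end <= candidate:
            window_end += timedelta(days=1)  # window closes after midnight
        candidate = window_end
    return candidate


if __name__ == "__main__":
    # Assumed busy window: 07:00 to 08:00; job interval: 30 minutes.
    last_run = datetime(2024, 1, 1, 6, 45)
    print(next_run(last_run, timedelta(minutes=30), time(7, 0), time(8, 0)))
```

With these assumed values, a run that would have started at 07:15 is pushed to 08:00, while runs outside the window keep their normal 30-minute spacing.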