Scenario: a process may fail due to errors in unattended tasks. These tasks can be safely retried, because the failures come from external systems (e.g. a web service is unavailable).
With a large number of instances running at any given time, scheduling automatic retries is not desirable.
In this situation, the instances go into the "Active with error" state.
Requirement: I would like to manually trigger a bulk retry of these instances. As far as I understand, no "resume" operation is available, since the instances are not paused, so one would have to restart them one by one by opening each in monitor mode.
Is there a smarter way to "restart" nodes in multiple instances?
I'm looking for any methods, in order of preference:
Thanks in advance.