Hey there!
If you want to understand the OptimisticLockingException that occurs quite often in the Camunda engine, and you would like to debug it and tweak the engine's behaviour to your needs, you are in the right place.
By the end of this article, you will understand why the optimistic lock exception occurs and how to avoid it in the Camunda engine.
Let’s get started.
Why does it occur?
An optimistic lock exception occurs in the database when concurrent transactions update the same data.
For example, suppose a row in a table (say R1) is updated by two different transactions at the same time. One transaction updates R1 successfully, while the other gets the exception.
In the Camunda tables, a version is maintained for each row (the REV_ column). Whenever a row is updated, its version is incremented. When there are concurrent transactions, both read the same version of the row and then try to update it. One transaction's update succeeds and the version is incremented, while the other transaction's update fails with an optimistic lock exception because it is trying to update an older version of the row.
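To make the mechanism concrete, here is a minimal sketch of how such a versioned update works, assuming plain JDBC. This is not Camunda's actual implementation; only the table and column names (ACT_RU_EXECUTION, ID_, REV_, ACT_ID_) mirror the real runtime tables, the rest is illustrative.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class OptimisticUpdateSketch {

    // Bumps the row's version only if it still has the version we read earlier.
    // Zero affected rows means another transaction committed first: the same
    // situation the engine reports as an OptimisticLockingException.
    static int updateExecution(Connection con, String executionId,
                               int expectedRev, String newActivityId) throws SQLException {
        String sql = "UPDATE ACT_RU_EXECUTION SET ACT_ID_ = ?, REV_ = ? "
                   + "WHERE ID_ = ? AND REV_ = ?";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, newActivityId);
            ps.setInt(2, expectedRev + 1);   // write the new version
            ps.setString(3, executionId);
            ps.setInt(4, expectedRev);       // ...but only if nobody changed it meanwhile
            int affected = ps.executeUpdate();
            if (affected == 0) {
                throw new IllegalStateException(
                        "Execution " + executionId + " was updated concurrently");
            }
            return expectedRev + 1;
        }
    }
}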
Where does it occur?
- At the joining gateway of a parallel/inclusive gateway. When the parallel branches finish at the same time and both executions reach the joining gateway at the same moment, both executions try to update the same job information. One of them succeeds and the other gets the exception.
- In tasks that sit between a parallel/inclusive fork and its join. If these tasks are async before or async after, and we use the same variable name in both executions, there is a possibility of getting the exception when the engine tries to persist the data: both executions may try to persist the variable at the same time.
How to resolve it?
1. For resolving this issue, the Camunda engine has a job retry concept. Whenever the engine hits a failure while executing a job, it retries the work that comes after the last save point. This comes in handy in many places. By default, the number of retries is 3. We can reduce this, or disable it, by configuring it explicitly.
If the process is transactional, if retrying the task could impact the workflow's progression, and if there is no data dependency on the Camunda tables (Camunda variables are not used outside the current process), then we can simply disable the Camunda engine's retries.
Retry configuration can be done in two places:
- Task level
- Engine level
To configure it at task level, open the async task's properties in the BPMN process; the Retry Time Cycle property can be set to R0/PT5S (retry zero times, with a 5-second interval). This overrides the default configuration.
To configure it at engine level, we can use the following:
engineConfiguration.setFailedJobRetryTimeCycle("R0/PT5S")
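For instance, in a Camunda 7 Spring Boot application this call could live in a process engine plugin. The class and bean names below are illustrative; setFailedJobRetryTimeCycle and AbstractProcessEnginePlugin are the actual Camunda APIs this sketch assumes.

import org.camunda.bpm.engine.impl.cfg.AbstractProcessEnginePlugin;
import org.camunda.bpm.engine.impl.cfg.ProcessEngineConfigurationImpl;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class EngineRetryConfig {

    // Illustrative plugin that disables job retries engine-wide.
    // "R0/PT5S" means zero retries; the 5-second interval is irrelevant once retries are 0.
    @Bean
    public AbstractProcessEnginePlugin failedJobRetryPlugin() {
        return new AbstractProcessEnginePlugin() {
            @Override
            public void preInit(ProcessEngineConfigurationImpl engineConfiguration) {
                engineConfiguration.setFailedJobRetryTimeCycle("R0/PT5S");
            }
        };
    }
}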
2. If we implement the first solution, the job will not be retried, and whichever execution commits first is what ends up in the DB. This works well for the joining-gateway exception.
But if the exception happens in a task between the fork and the join because of a user-defined variable, it needs to be sorted out in the code. Otherwise, one of the parallel branches would get stuck at the task where it got the optimistic lock exception, while the engine assumes the job is finished because the retry count is zero. To avoid this:
- Try to use unique variable names in the executions wherever there is a parallel process (see the sketch after this list).
- Configure exclusive jobs in the parallel branches.
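As an illustration of the first point, a hypothetical JavaDelegate inside one of the parallel branches could qualify its variable name with the current activity id, so that the two branches never write to the same variable row. The class name and doWork() placeholder are invented for the example.

import org.camunda.bpm.engine.delegate.DelegateExecution;
import org.camunda.bpm.engine.delegate.JavaDelegate;

public class BranchResultDelegate implements JavaDelegate {

    @Override
    public void execute(DelegateExecution execution) {
        Object result = doWork();
        // Qualify the variable name with the current activity id so that
        // parallel branches never update the same variable row concurrently.
        execution.setVariable("result_" + execution.getCurrentActivityId(), result);
    }

    private Object doWork() {
        // Placeholder for the branch's actual business logic.
        return "done";
    }
}

If the value is not needed after the join, execution.setVariableLocal(...) is another way to keep each branch's data in its own scope.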
Hope this gives you some idea of how the Camunda engine works internally.
Happy debugging! 😊