on 12-22-2016 10:51 AM
Hi All,
We have an issue in PROD where most cronjobs miss their triggers, so we have to trigger them manually. It doesn't happen every time, though.
The issue is being discussed with Hybris as well, but there is no specific solution yet. We have tried a few configurations and are now working on the nodegroup configuration.
Please let me know if anybody has encountered a similar problem in PROD and has a final solution for it.
Thanks
Hey, were you able to find a solution? We faced a similar issue and did employ a fix for it. The issue is with the Timer Task replacing the Trigger task in 5.7.
Hi Shankara,
Please check ECP-3197 ("Cronjob scheduler is unstable"). It has patches in:
1811.1
1808.5
6.7.0.10
6.6.0.14
6.5.0.17
6.4.0.20
Regards,
Luke
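For anyone comparing their own build against the patch list above, here is a rough sketch of that check. It assumes (this is my assumption, not something stated in ECP-3197) that a build at or above the listed patch number within the same release line contains the fix. `PatchLevelCheck` and its methods are hypothetical names for illustration, not part of hybris.

```java
import java.util.Arrays;
import java.util.List;

final class PatchLevelCheck {

    // Patch levels copied from the list in this thread.
    static final List<String> FIXED_IN = Arrays.asList(
            "1811.1", "1808.5", "6.7.0.10", "6.6.0.14", "6.5.0.17", "6.4.0.20");

    // Release line = everything before the last dot, e.g. "6.7.0" or "1811".
    // Assumes the version string contains at least one dot.
    static String line(String version) {
        return version.substring(0, version.lastIndexOf('.'));
    }

    // Patch number = the last dot-separated segment.
    static int patch(String version) {
        return Integer.parseInt(version.substring(version.lastIndexOf('.') + 1));
    }

    // Assumption: patches are cumulative within one release line.
    static String status(String installed) {
        for (String fixed : FIXED_IN) {
            if (line(fixed).equals(line(installed))) {
                return patch(installed) >= patch(fixed) ? "PATCHED" : "NEEDS_PATCH";
            }
        }
        // No entry for this release line in the list above.
        return "NO_PATCH_LISTED";
    }

    public static void main(String[] args) {
        for (String v : Arrays.asList("6.6.0.14", "6.6.0.3", "1811.0", "6.1.0.2")) {
            System.out.println(v + " -> " + status(v));
        }
    }
}
```

A release line that does not appear in the list (for example 6.1.x) comes back as `NO_PATCH_LISTED`, which only means the list above has no entry for it.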
Hi Luke,
That JIRA issue specifically states "From /backoffice, when we update Trigger item or launch manually a Cronjob which is running". In our case, we're just seeing pretty much random cases of a trigger no longer being fired without doing either of those things in the backoffice.
However, the "Delete the trigger and recreate" approach also works for us (once we've detected that a cronjob isn't triggering). I'm wondering if the description (quoted above) in the defect is a symptom of a bigger underlying issue, and the fix/patch will also likely resolve the (slightly different) symptoms that we're seeing?
Thanks.
Hi all!
Same issue here. We have a lot of cronjobs started by a trigger with a free interval. Sometimes the trigger does not start the job and does not update the next activation time. From that moment on, the next activation time is in the past and the job never starts again. Most of the time the solution is to create a new trigger, but we had cases where we had to clone the whole job. These failures are more frequent after a restart. The maxAcceptableDelay is -1 and the cluster node is defined.
Are there any solutions?
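Since the symptom described above is a trigger whose next activation time stays in the past, one way to notice it early is a periodic check for such triggers. The following is only a minimal sketch of that check on plain data. `TriggerInfo` is a hypothetical, simplified stand-in for a trigger row (job code, active flag, next activation time), not a hybris class, and the grace period is an assumed tuning value. How the rows are loaded from the platform is left out.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class StaleTriggerCheck {

    // Hypothetical, simplified view of one trigger; not a hybris type.
    static final class TriggerInfo {
        final String cronJobCode;
        final boolean active;
        final Instant nextActivation;

        TriggerInfo(String cronJobCode, boolean active, Instant nextActivation) {
            this.cronJobCode = cronJobCode;
            this.active = active;
            this.nextActivation = nextActivation;
        }
    }

    // Returns the job codes of active triggers whose next activation time
    // lies more than `grace` before `now`, i.e. triggers that look stuck.
    static List<String> findStale(List<TriggerInfo> triggers, Instant now, Duration grace) {
        Instant cutoff = now.minus(grace);
        List<String> stale = new ArrayList<>();
        for (TriggerInfo t : triggers) {
            if (t.active && t.nextActivation != null && t.nextActivation.isBefore(cutoff)) {
                stale.add(t.cronJobCode);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        List<TriggerInfo> triggers = Arrays.asList(
                new TriggerInfo("importJob", true, now.minus(Duration.ofHours(2))),  // stuck
                new TriggerInfo("exportJob", true, now.plus(Duration.ofMinutes(10))), // fine
                new TriggerInfo("oldJob", false, now.minus(Duration.ofDays(1))));     // inactive, ignored
        System.out.println("Stale triggers: " + findStale(triggers, now, Duration.ofMinutes(5)));
    }
}
```

The grace period is there so that a trigger that is merely a few seconds late is not reported. Jobs flagged this way are the candidates for the "delete the trigger and recreate" workaround mentioned earlier in the thread.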
Hi
We're using 6.1.0.2 with a cluster of 6 nodes and JGroups. We don't have any nodegroup configuration. Yesterday we had a restart of node 5 (all jobs are configured with "execute on node: 5"), and after this restart all jobs stopped working because we had a problem with the cluster. After fixing the problem the cluster worked fine, but all ~180 jobs didn't start again because the next activation time is in the past.
I looked for messages from the task engine, but I only found some normal logging from DefaultTaskService like "Task engine starting up".
Did you ever face clustering issues in your PROD cluster? Triggers can be missed if your cluster is not set up properly and messages are not reaching the nodes within the cluster. Search for JGroups-related errors in your console.
Thanks Pratik
No permanent solution yet. As per Hybris's suggestion, we removed the node ID from the cronjobs and added it to nodegroups instead. We also increased the max acceptable delay (maxAcceptableDelay) of the triggers. This change improved performance, but we still see a few occurrences of the issue.
Hope this helps in your environment; it also depends on the environment. Let me know your environment details and I may be able to provide more support, for example the number of servers and which functions use the task engine.
Regards, Shankar