11-10-2005 4:36 PM
Hi, I have 3 jobs running. Job A runs first and inserts data into a table. Several minutes later, jobs B and C run. Job B finds the data and processes it, but when job C runs, it sometimes doesn't find any data. We have set job C to retry for 10 minutes; sometimes it runs before job B and, because of the retry, also after job B, and it still doesn't find the data.
We have two servers, one for the DB and the other for the application, and the workload is balanced between them. The problem is not constant; as far as we have noticed, it occurs during heavy workloads on the system.
Any ideas or suggestions are welcome.
Thanks
11-10-2005 4:46 PM
This could be a buffering problem, but since job C keeps retrying for ten minutes, the buffer should have been refreshed by then. What you can do, though, is add the option BYPASSING BUFFER to the select statements in job C.
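A rough sketch of what that could look like. The table, field and selection names here (zmytable, route, p_route) are placeholders, not the names from the actual job C program:

```abap
REPORT zdemo_bypass_buffer.

* Sketch only: zmytable, route and p_route are placeholder names.
TABLES zmytable.
SELECT-OPTIONS p_route FOR zmytable-route.

DATA lt_rows TYPE STANDARD TABLE OF zmytable.

* BYPASSING BUFFER makes this read skip the SAP table buffer on the
* application server and go to the database instead.
SELECT * FROM zmytable BYPASSING BUFFER
  INTO TABLE lt_rows
  WHERE route IN p_route.

IF sy-subrc <> 0.
  WRITE: / 'No rows found'.
ENDIF.
```

If the table is not buffered at all, the option has no effect, so it only helps to rule the SAP buffer in or out.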
11-10-2005 5:25 PM
Hi Robert!
Because of the second run that 'still doesn't find the data', I don't think it's a buffering problem. Buffer synchronization usually happens after 1 to 2 minutes. Long update task runs can take half an hour or more, but then job B would have to wait, too.
I guess the problem is not based on additional data selects of job C, e.g. something out of V2 updates...
There are also programming errors that can appear to occur randomly, e.g. a read table ... binary search on an internal table that is not correctly sorted. If that read is the prerequisite for a DB select, you won't find anything.
The wrong output might also come from a later step, again with read table ... binary search. If your data was in fact selected but can't be found by the read, you might think the selection wasn't successful.
For buffering and other problems, check in the detailed job log whether jobs B and C were executed on the same application server in a failing run. If in every failing run they were not, then it might be a server / DB problem.
Otherwise, please give some details of the selects.
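To illustrate the binary search point, here is a minimal self-contained sketch. The report name, the type and the route values are made up for the example and have nothing to do with the real programs:

```abap
REPORT zdemo_binary_search.

TYPES: BEGIN OF ty_row,
         route(6) TYPE c,
       END OF ty_row.

DATA: lt_rows TYPE STANDARD TABLE OF ty_row,
      ls_row  TYPE ty_row.

* Fill the table in an order that is NOT sorted by route.
ls_row-route = 'R00300'. APPEND ls_row TO lt_rows.
ls_row-route = 'R00100'. APPEND ls_row TO lt_rows.
ls_row-route = 'R00200'. APPEND ls_row TO lt_rows.

* On an unsorted table the result of BINARY SEARCH is unreliable:
* sy-subrc may be non-zero even though the row exists.
READ TABLE lt_rows INTO ls_row WITH KEY route = 'R00300' BINARY SEARCH.
WRITE: / 'Before SORT, sy-subrc =', sy-subrc.

* Sorting by the same key that the READ uses makes the lookup reliable,
* so the existing row is found here.
SORT lt_rows BY route.
READ TABLE lt_rows INTO ls_row WITH KEY route = 'R00300' BINARY SEARCH.
WRITE: / 'After SORT, sy-subrc =', sy-subrc.
```

Whether the unsorted read fails depends on the order the rows happen to arrive in, which is why this kind of bug can look random.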
Regards,
Christian
11-10-2005 5:29 PM
But they are not seeing this problem consistently. So it may not be something in the code, since the problem happens randomly.
Srinivas
11-10-2005 5:39 PM
Yes, buffering might be a reason; my feeling just leans more toward checking the code.
Also, code behaves differently depending on the data situation.
On the other hand: problems only(?) during high workload. That's of course typical for update / buffer problems.
And still: any other idea will be very welcome!
11-14-2005 9:16 PM
Hello again, sorry for not responding sooner, but I had some problems accessing the site.
Regarding the code, job B does this:
SELECT * FROM zfactura
WHERE route IN p_route
AND status IN p_status.
Job C does the following SQL statement:
SELECT * FROM zfactura
WHERE route IN p_route
AND date IN p_date.
Job A makes the inserts on zfactura.
We've reviewed the job logs and job B usually runs on the app server.
We also checked the table zfactura. It didn't have the technical settings maintained (it's an old table), so we maintained them and disabled the buffering option.
Based on your suggestions we have put some extra code in the jobs' programs, and since then the problem has been reduced to only two routes, so we're thinking it could be something else that's affecting these two routes.
Thanks
11-14-2005 9:25 PM
Your job C has an additional criterion, "date". How is this filled in job A and/or modified in job B? Maybe the dates are not matching because of formatting issues. Is the date a real date field? Is it stored consistently in YYYYMMDD format for all routes?
It looks more like a data related issue now. Look in the table for the entries and see if there is anything.
How is the "route" field defined? Does it have a conversion routine associated with it? If so, check how these two problem routes are entered in the variant for the background job and how they are actually stored in the database.
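As a small sketch of how a conversion routine can make a value look different from what is stored: this assumes, purely for illustration, that the field uses the ALPHA conversion exit and that a route was entered as '12'. Check the domain of the real field in SE11 to see which exit, if any, applies.

```abap
REPORT zdemo_route_format.

* Hypothetical values: '12' stands in for a route as entered in a variant.
DATA: lv_entered(6) TYPE c VALUE '12',
      lv_stored(6)  TYPE c.

* Assumption: the field uses the ALPHA conversion exit.
CALL FUNCTION 'CONVERSION_EXIT_ALPHA_INPUT'
  EXPORTING
    input  = lv_entered
  IMPORTING
    output = lv_stored.

* lv_stored now holds the internal format with leading zeros. A WHERE
* condition comparing the raw entry with the stored value would not match.
IF lv_entered <> lv_stored.
  WRITE: / 'External and internal values differ:', lv_entered, lv_stored.
ENDIF.
```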
Srinivas
11-14-2005 9:26 PM
11-14-2005 9:56 PM
We have an Oracle 9.205 database.
About the data in the table, route is a char(6) and date is of type dats. The weird part is that if we run the job C program today (with the parameters it used the day it failed), it runs fine.
11-14-2005 10:01 PM
Ok, it looks like we still have some buffering somewhere, even though you turned it off on the SAP side; maybe an Oracle buffer. I think that is what Rob is looking for. I recently came across this problem (at one of my colleagues' projects), where an Oracle patch caused some records to be skipped in a selection even though they matched the criteria. The weird thing is that the select fails if the record is part of a range, but succeeds when it is the only value.
Maybe something to look into.
Srinivas
11-14-2005 10:20 PM
We recently updated our DB2 database from 8.1.5b to 8.2.2. We noticed a problem with locking when doing updates after a transaction was called. This is obviously not the same problem here, but I'm not sure yet about the other thread.
Rob
11-14-2005 10:23 PM
Rob, the one I mentioned happened in a standard SAP report, so it was escalated to them and they are looking into it. But since the programs here are all custom, SAP may call it a consulting issue. But yes, that can be one of the reasons.
Srinivas
11-14-2005 10:36 PM
Our problem happened in one of our own programs (actually three of them). Our DBA thought it might be an upgrade issue and sent it to IBM. IBM came back saying it might have to do with one of the new parameters. They suggested it go to OSS. So I sent it to OSS and last week, they were in our system for a couple of hours. They called back and said that it might be due to an asynchronous update in their BAPI. Anyway, they've passed it to their BAPI person who also wants to look at our system. So, I'm hoping it's not a consulting issue, but we'll have to wait and see.
Rob
11-14-2005 10:49 PM
I'd really appreciate it if you let me know what SAP says about your problems, because I think you're right, they could say it's a consulting issue. In the meantime I'll try to reproduce the error in a standard program.
Robert
11-22-2005 11:27 PM
Robert - after a lot of back and forth, OSS suggested:
- Change the system parameter to speed up the unlock of the notification (ask your system administrator for more info)
- Check if the notification is unlocked before you execute the statement 'call transaction IW52'
- Use an unnecessary loop to delay execution of the statement 'call transaction IW52'.
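For what it's worth, the second and third suggestions could be combined roughly like this. This is only a sketch under assumptions: that the notification lock shows up under table QMEL with a lock argument of client plus notification number (check SM12 while a notification is locked), and that ten tries one second apart is enough. The report and parameter names are made up.

```abap
REPORT zdemo_wait_for_unlock.

* Sketch only. Assumptions: lock table name QMEL and a lock argument of
* client + notification number. Verify both in SM12 before relying on it.
PARAMETERS p_qmnum(12) TYPE c.

DATA: lt_enq      TYPE STANDARD TABLE OF seqg3,
      lv_garg     TYPE eqegraarg,
      lv_count    TYPE sy-tabix,
      lv_unlocked TYPE c.

CONCATENATE sy-mandt p_qmnum INTO lv_garg.

DO 10 TIMES.
* Read the current lock entries; guname = space is intended to return
* locks of all users (verify this in your release).
  CALL FUNCTION 'ENQUEUE_READ'
    EXPORTING
      gclient = sy-mandt
      gname   = 'QMEL'
      garg    = lv_garg
      guname  = space
    IMPORTING
      number  = lv_count
    TABLES
      enq     = lt_enq
    EXCEPTIONS
      OTHERS  = 1.
  IF sy-subrc = 0 AND lv_count = 0.
    lv_unlocked = 'X'.
    EXIT.
  ENDIF.
* Note: WAIT triggers an implicit database commit.
  WAIT UP TO 1 SECONDS.
ENDDO.

IF lv_unlocked = 'X'.
  WRITE: / 'No lock found, safe to continue'.
* The CALL TRANSACTION 'IW52' step would go here.
ELSE.
  WRITE: / 'Notification still locked after waiting'.
ENDIF.
```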
Our basis people said things are correct and don't need to be changed. IBM said that it's not a DB2 upgrade issue, so as time permits we will implement one of the other suggestions.
Rob