I have been trying to identify inefficiencies in queries I have coded. I read in an IBM technical brief that seeing a high "BP Gets / Rows Processed" statistic is an indicator of an inefficient query, likely where there is no good index. Three to five was suggested as a healthy range.
The query I am looking at now has run once. Both "BP Gets / Rows Processed" and "BP Gets / Executions" are 182,101, which sounds like a stinker. Yet I know from looking at the access plan and other statistics the query is using a good index to whittle the dataset down to 22,629 rows on the first table, then joining using unique primary keys to the second and third tables. "Rows Read" is 67,887 (as is "Rows Read / Rows Processed". Granted, only 2,000 of the joined rows are then selected, as there is a predicate based on a column on each of the second and third tables which is not in the index. Still, this seems a pretty efficient way to select what I need from around 4,000,000 rows on each table.
I think the problem is that "Rows Returned" is reported as zero. Perhaps, to avoid a division by zero, the "... / Rows Processed" statistics are not divided. But around 2,000 rows were returned! Does this statistic not work for a JOIN? Dividing BP Gets by the actual count of Rows Processed fives me a much better 33.9. I do not think that is bad, considering the WHERE criteria refer to at least one field from each of the tables.
The suspect statistic is not a significant issue, but piqued my curiosity. The real reason I have been spending so much time on this is that the optimiser in our production environment chose a rather different, and entirely unsatisfactory, approach, running for over an hour rather than several seconds (with pretty much the same data and statistics), but that is another story.
INTO TABLE lt_ecs_item_company
INNER JOIN gle_fi_item_orig ON gle_fi_item_orig~item_id = gle_ecs_item~item_id
INNER JOIN gle_fi_item_modf ON gle_fi_item_modf~item_id = gle_ecs_item~item_id
WHERE gle_ecs_item~status IN ('0','1')
AND ( gle_ecs_item~company_code EQ p_company_code OR
( gle_fi_item_orig~vbund EQ lv_vbund AND gle_fi_item_modf~vbund EQ ' ' ) OR
gle_fi_item_modf~vbund EQ lv_vbund )
0 SELECT STATEMENT ( Estimated Costs = 4.646E+06 [timerons] ) num_rows tot_cost i/o_cost
--- 1 RETURN 5.1065E+03 4.6458E+06 6.8438E+05
--- 2 NLJOIN 5.1065E+03 4.6458E+06 6.8438E+05
|-- 3 [O] NLJOIN 1.9734E+06 2.6185E+06 3.9368E+05
| |-- 4 [O] FETCH GLE_ECS_ITEM 1.9734E+06 2.0576E+05 4.6730E+04
| | |
| | --- 5 RIDSCN 1.9734E+06 5.8290E+03 4.3300E+02
| | |
| | |-- 6 SORT 9.8670E+05 2.9145E+03 2.1650E+02
| | | |
| | | ------ 7 IXSCAN GLE_ECS_ITEM~Z02 #key columns: 2 9.8670E+05 1.9892E+03 2.1650E+02
| | | (index Z02 contains MANDT, STATUS, COMPANY_CODE, and ACCOUNT)
| | --- 8 SORT 9.8670E+05 2.9145E+03 2.1650E+02
| | |
| | ------ 9 IXSCAN GLE_ECS_ITEM~Z02 #key columns: 2 9.8670E+05 1.9892E+03 2.1650E+02
| --- 10 [I] FETCH GLE_FI_ITEM_MODF 1.0000E+00 2.0771E+01 3.0117E+00
| ------ 11 IXSCAN GLE_FI_ITEM_MODF~0 #key columns: 2 1.0000E+00 1.3795E+01 2.0000E+00
| (index 0 contains MANDT and ITEM_ID)
--- 12 [I] FETCH GLE_FI_ITEM_ORIG 2.5876E-03 2.0813E+01 3.0177E+00
------ 13 IXSCAN GLE_FI_ITEM_ORIG~0 #key columns: 2 1.0000E+00 1.3795E+01 2.0000E+00
(index 0 contains MANDT and ITEM_ID)