Skip to Content
0

Duplicates on SELECT DISTINCT statement in HANA DB

Jan 30, 2017 at 10:57 AM

2.2k

avatar image

Hello everyone,

I'm trying to figure out if the DISTINCT statement, with a list of fields, on HANA DB without the ORDER BY statement in the SELECT, may be source of inconsistencies in the uploaded data.

In HANA data they are loaded into memory, but you can know whether it is technically possible that duplicate records are loaded, In the face of standard code that uses the DISTINCT instructions without any sort really?.

After upgrade, Standard code seems to use DISTINCT statements without any ORDER BY instruction.

Thanks for the attention, regards

10 |10000 characters needed characters left characters exceeded
* Please Login or Register to Answer, Follow or Comment.

2 Answers

Best Answer
Horst Keller
Jan 30, 2017 at 11:54 AM
1

If you are talking about Open SQL, the behavior of DISTINCT is described here.

If you are talking about HANA SQL, the behavior of DISTINCT is described here.

There's no dependency from the ORDER BY clause mentioned there.

The SAP HANA Database offers an SQL interface to its relational databases that is generally independent from HANA's internal memory management and which behaves like standard SQL for the most part.

Show 7 Share
10 |10000 characters needed characters left characters exceeded

Thanks,

Someone Told us That in HANA, the ABAP SELECT DISTINCT without the ORDER BY clause, may cause duplication of records. So is it necessary? We could may think that this is true if data is in memory and ordered by casual.

I'm talking about SELECT DISTINCT -abap instruction. standards like SQL, as you say, it should check duplicates in the selection, deleting adjacent duplicates. As the documentation say: "That DISTINCT Specifies only one copy of each set of duplicate records selected Should be returned".

I saw a lot of instructions in code Standard That do not use the ORDER BY clause.

I hope I've been clear enough. Let me know your ideas.

Thanks a lot Bye

1

"Someone Told us That in HANA, the ABAP SELECT DISTINCT without the ORDER BY clause, may cause duplication of records."

Who told you so? It is not true. The databases sort themselves if DISTINCT is given. No ORDER BY necessary. For DISTINCT, the complete rows of the result set as defined in the SELECT list are taken into account.

You need ORDER BY in order to receive a defined sequence of the result set. Some people relied on an implicit ordering for pool and cluster tables that gets lost after depooling and declustering on HANA. Maybe you mix that up.

3

Thanks Horst.

I was quite sure that "DISTINCT" can't betray its meaning :) Someone in the consultant's world.

I was talking about standard table, not cluster tables necessary.

ORDER BY is only for the final output as far as I know.

Thanks for your advice.

I was wondering about an other thing:

We've just upgraded on HANA DB and we're looking out from ST12 (tracing transactions), that Net Time for an instruction like this, is consuming 3-4 times more than in Oracle:

HANA SYSTEM:

ORACLE SYSTEM

SELECT SINGLE maktx INTO fh_diba-cdescr FROM makt

WHERE matnr EQ fh_diba-matnr AND spras EQ sy-langu.

We can't figure out how is this possible. I've seen that indexes are not available anymore like MAKT~M, but I'm confident that's not the question.

I think there's something more at DB level needing some adjustment.

Thanks a lot for your advices.

Bye

0

SAP HANA is not made for SELECT SINGLE, even Hasso says so ...

4

Hi Horst, Thanks for reply.

HANA is not designed for the ABAP ECC Oracle way and I Imagine that HANA Paradigm is different, but every company that comes from ECC 6 it will have probably thousand of "SELECT SINGLE" instructions, even in standard code itself. This means a big change in programming approach and a big rework. This is Not a really save-time

We could expect the same or less computational time , but not 3 times more. I need to investigate more on hardware of Oracle machine and Hana machine, but if the hardware is the same, this situation will take us to a bottleneck.

So what is the best alternative to a SELECT SINGLE in this case?

Could you tell me your ideas about this?

Thanks, I appreciate!

Bye

1

There are other guys with much more knowledge about that than I, e.g.

https://blogs.sap.com/2013/03/24/performance-guidelines-for-abap-development-on-the-sap-hana-database/

1
Show more comments
Akash Kumar
Jan 30, 2017 at 12:28 PM
0

Please check if the mandt (client) is used in the select qwery.

Best Regards,

Akash

Show 2 Share
10 |10000 characters needed characters left characters exceeded

Hi

In the ABAP sentence, MANDT is defined by compiler.

Anyway i've never used CLIENT SPECIFIED clause in select statement.

Thanks.

0

Hi,

I was talking about HANA development (native) not from ABAP end.

I was thinking if you are using native query n getting multiple results that might be because of data from different client. So i suggested. Anyways i think you are getting many replies so hopefully you will get your solution.

BR,

Akash

0