Skip to Content
author's profile photo Former Member
Former Member

get all potential duplicates from e.g. kna1

Hello all,

I want to analyze all duplicates which are in really huge tables like kna1, for this I did a copy (SELECT * from KNA1 into lt_kna1) into an internal table and determined the duplicates in some ABAP coding. This is not working when the kna1 is relay huge because there is a lack of memory.

My idea was to select just the potential rows of kna1 which could be duplicates. For this I build the following SELECT statement which is as well not really performant:

SELECT * FROM kna1 CLIENT SPECIFIED INTO TABLE lt_kna1 WHERE kunnr IN (

SELECT kunnr FROM kna1 CLIENT SPECIFIED GROUP BY kunnr

HAVING COUNT( DISTINCT name1 ) GT 1 ).

Is there another possibility, for example with a function module to get the row's which might be duplicates? It would be really great if you can post your ideas about this problem.

Thanks in advance

Cajus

Add a comment
10|10000 characters needed characters exceeded

Related questions

3 Answers

  • Best Answer
    Posted on Dec 04, 2007 at 10:31 AM

    Hi Cajus,

    Thanks to the primary index, KNA1 is a sorted databases table. You can just within a select and endselect compare the current line with the previous one.

    Kind regards,

    John.

    Add a comment
    10|10000 characters needed characters exceeded

  • Posted on Dec 04, 2007 at 02:28 PM

    the KNA1 has MANDT and KUNNR, so it can have only duplicates for different

    clients, for example for client copies, where it makes perfectly sense.

    So possible explain your task.

    I have one idea where it makes sense when you want to bring together two formerly independent clients and you must check whether there are identical KUNNR used in both clients. I would not understand the task for a higher number of clients.

    This task is an inner self-join or subquery or it can also be written with for all entries.

       SELECT * 
              INTO TABLE lt_kna1 
              FROM kna1 
              CLIENT SPECIFIED WHERE mandt = mandt1
               AND  kunnr IN (
                        SELECT kunnr 
                                      FROM kna1 
                                      CLIENT SPECIFIED 
                                      WHERE mandt = mandt2 ).
    

    Siegfried

    Add a comment
    10|10000 characters needed characters exceeded

  • author's profile photo Former Member
    Former Member
    Posted on Dec 04, 2007 at 10:08 AM

    Hi,

    Use SELECT DISTINCT... you will not get any duplicate entries. But the table shoulbe not be buffering. then only you can use Select Distinct. Hope KNA1 is not buffering.

    Regards,

    Satish

    Add a comment
    10|10000 characters needed characters exceeded

    • Former Member

      Hi Satish,

      thanks for your answer but SELECT DISTINCT does not help because there are no duplicates in the key-colums (mandt and kunnr). The duplicates appear because the customer has the same or nearly the same name and adress but has two different customer ID's, so for the database it is not a duplicate but in reality it is. This is found when the data quality is not good.

      best regards,

      Cajus

Before answering

You should only submit an answer when you are proposing a solution to the poster's problem. If you want the poster to clarify the question or provide more information, please leave a comment instead, requesting additional details. When answering, please include specifics, such as step-by-step instructions, context for the solution, and links to useful resources. Also, please make sure that you answer complies with our Rules of Engagement.
You must be Logged in to submit an answer.

Up to 10 attachments (including images) can be used with a maximum of 1.0 MB each and 10.5 MB total.