Skip to Content
avatar image
Former Member

i have two duplicate records like lenovo company,lenovo . how to find this duplicate records????

Hi Guys,

I have two duplicate records like lenovo company, lenovo . how to find this duplicate records????

Add comment
10|10000 characters needed characters exceeded

  • Get RSS Feed

1 Answer

  • Dec 13, 2017 at 04:22 PM

    Hello Bharathkumar,

    If you can manage correct data or expected data and its variations in a seperate lookup table for data cleansing, then you can use the method I'll demonstrate here.

    But please note that this approach is very straightforward and doesnot contain fuzzy logic

    I assume you have two tables; one for transactional data and the other is for data cleansing

    create table DataTable (
        id int,
        company nvarchar(100)
    )
    create table DataMaster (
        company nvarchar(100),
        variation nvarchar(100)
    )


    And I populate some sample data for testing the approach

    insert into DataMaster select 'Lenovo','Lenovo Company' from dummy;
    
    insert into DataTable select 1, 'IBM' from dummy;
    insert into DataTable select 1, 'Lenovo' from dummy;
    insert into DataTable select 1, 'Lenovo Company' from dummy;

    As you can guess now, I'll execute an UPDATE statement that will match company field from transactional data from its variations, and use the correct form of the value

    update d
    set
        company = m.company
    from DataTable as d
    inner join DataMaster as m
        on d.company = m.variation;

    I hope this helps you for the solution.

    Unfortunately if your company data is not stored alone on a table column, if it is part of a longer string value, then it is more difficult to cleanse such data.


    Add comment
    10|10000 characters needed characters exceeded