Skip to Content
0

i have two duplicate records like lenovo company,lenovo . how to find this duplicate records????

Dec 13, 2017 at 01:20 PM

26

avatar image
Former Member

Hi Guys,

I have two duplicate records like lenovo company, lenovo . how to find this duplicate records????

10 |10000 characters needed characters left characters exceeded
* Please Login or Register to Answer, Follow or Comment.

1 Answer

ERALPER YILMAZ Dec 13, 2017 at 04:22 PM
0

Hello Bharathkumar,

If you can manage correct data or expected data and its variations in a seperate lookup table for data cleansing, then you can use the method I'll demonstrate here.

But please note that this approach is very straightforward and doesnot contain fuzzy logic

I assume you have two tables; one for transactional data and the other is for data cleansing

create table DataTable (
    id int,
    company nvarchar(100)
)
create table DataMaster (
    company nvarchar(100),
    variation nvarchar(100)
)


And I populate some sample data for testing the approach

insert into DataMaster select 'Lenovo','Lenovo Company' from dummy;

insert into DataTable select 1, 'IBM' from dummy;
insert into DataTable select 1, 'Lenovo' from dummy;
insert into DataTable select 1, 'Lenovo Company' from dummy;

As you can guess now, I'll execute an UPDATE statement that will match company field from transactional data from its variations, and use the correct form of the value

update d
set
    company = m.company
from DataTable as d
inner join DataMaster as m
    on d.company = m.variation;

I hope this helps you for the solution.

Unfortunately if your company data is not stored alone on a table column, if it is part of a longer string value, then it is more difficult to cleanse such data.


Share
10 |10000 characters needed characters left characters exceeded