Skip to Content
0

remove duplicate at universe level

Sep 05, 2017 at 07:28 AM

56

avatar image
Former Member

My Data foundation layer has table (ie Mytable ) which retrieve its data from a database view (i.e. MyView), I need to remove duplicated from Myview at Universe level, and before doing any join with the reset of table in the universe

I want to make sure that the below scenarios are applied

  • 1.If the user select col1 from Mytable , I need the script to be generated as ( Select distinct Col1 from MyView )
  • 2.If the user select col1 and col2 from Mytable , I need the script to be generated as ( select Distinct Col1 , Col2 from Myview)

I already try different scenario to remove duplicated, but the generated SQL did not apply the above scenarios

My attempt

Make a derived table, from Mytable in the data foundation layer, which select distinct from the main table,

However, the generated Script if the user select Col1 is

Select col1 from (select distinct * from Myview) -> which is not what I need

10 |10000 characters needed characters left characters exceeded
* Please Login or Register to Answer, Follow or Comment.

1 Answer

Koen Hesters Sep 06, 2017 at 11:15 AM
0

Hi,

trying to get it ... why do you have duplicates ? Can't you filter anything ? Is the aggregation of the records not working for you ?

Grtz

Koen

Show 3 Share
10 |10000 characters needed characters left characters exceeded
Former Member

The view is located in Teradata database and it contain some joins , Teradata optimizer can skip the needless join if you are select column from one table only and you are using distinct

Example , if myview is joining table1 , table2 , table3

If the generated SQL is

select Col1 from myview and Col1 is located in table 1 ,

optimizer will execute the join between table1, tabe2,table3

While if the generated SQL is

select distinct Col1 from myview ,

The optimizer will not execute the join and will retrieve data from table1 directly

This is why I want the universe to generate distinct value from this view each time before doing the any join.

I found it will be harder if I discuss this cases in the question , So I am asking to remove duplicated, While my object is to have distinct So I can gain a better execution plan for the query

0

Hi

Don't quite follow, but if it is your join that is duplicating, can't your correct it on that level, and another way is maybe to join the tables at universe level?

grtz

Koen

0
Former Member

The Execution plan when the generated query is Select Distinct col1 from MyView

is Better than when the generated query is Select col1 from myview

So I am trying to force universe to write the keyword distinct in the generated query

0