Skip to Content
0

ETL for loading Fact table FROM DIMENSIONS TABLES

Feb 16, 2017 at 03:16 PM

283

avatar image

Dear All,

I am using data services 12.2.3.4

I want to implement a FULL LOAD , DELTA later :-)

I have just loaded Dimensions tables but when I load the facts table ( Business model is Bug tracker ) I get a cartesian product , not full cartesian product because the join condition exist in the query transformation but may be this join condition is not enough

Join condition (Query transform)

SQL_bug_table.id = SQL_mantis_bug_history_table.bug_id and
SQL_bug_table.id = mantis_custom_field_string_table.bug_id

Must I use lookup_ext function in order to load dimensions foreign keys in fact table ?

In order to get the foreign keys of the Dimensions tables must I use the lookup_ext function ?

I have a business ratio in fat table NM_Incidencia which is the number of a bug , I thing I have to load values of the dimensions that corresponding in the operational source with this ratio , true?

Dimensions :

  • Severidad
  • Prioridad
  • Estado

They are dimensions tables catalog. Theirs values are not loaded from operational source . They are descriptions of a bug like colour but in this case they are severity , priority and condition or state

Could you give a manual with screen shots of a data flow for loading a star model (fact table and dimensions tables ) using data services ?

Also please try to answer my questions for my model of star that attached

I have attached the star model and the current data flow (fact table load)

Also I attached the E-R OPERATIONAL SOURCE (Mantis Bug Tracker)

Thanks in advance

Best Regard

Antonio

10 |10000 characters needed characters left characters exceeded
* Please Login or Register to Answer, Follow or Comment.

9 Answers

Soumyadeb Halder Feb 17, 2017 at 07:22 AM
0

Loading a star model using data services, you must load the dimension tables first and then fact table. While loading the data into fact table you have to keep in mind that no data in fact table will be loaded without validating the primary keys in the dimension tables. Those primary keys are infact foreign keys of Fact table.

Hence I think, its best way to validate the foreign key data of fact table with primary key of dimension table through lookup_ext function.

If any of the foreign key data does not match with primary key value of corresponding dimension, then do not load that record into Fact table.

Loading data into Fact table, does not mean take data from Dimension tables and load. But it mean, loading data by taking reference from Dimension table.

Fact table data should have different source( as like you have loaded data into dimension table from different sources) and while loading them into table, validate them with dimension table through lookup_ext function.

Please let me know, if this information helps you or you need more information.

Share
10 |10000 characters needed characters left characters exceeded
Antonio Astudillo Feb 17, 2017 at 10:03 AM
0

query-transformation-2.jpg

dimension-catalog-priority.jpg

query-transformation-mapping-to-dimension-catalog.jpg

sql-transformation-from-operational-source-2.jpg

sql-transformation-from-operational-source.jpg

Thank you very much, Now everything is much clear !!! but I have some doubts

Please Do you have any screenshot of an example of data flow of loading fact table ?

You wrote :

"Loading data into Fact table, does not mean take data from Dimension tables and load. But it mean, loading data by taking reference from Dimension table."

Could you explain me with more details and examples of it ?

I have 3 dimensions

Dimensions :

  • Severidad
  • Prioridad
  • Estado
  • They are dimensions tables catalog. Theirs values are not loaded from operational source . They are descriptions of a bug like colour but in this case they are severity , priority and condition or state.

    Please look model of stat attached

    MODELO_ESTRELLA_MANTIS_19_ABRIL.jpeg

    Have you seen the data flow attached ->data flow fact load.JPG ? Is it correct ?

    I have connect all dimensions with the query transform also I have connected two SQL transformations for loading dates of the fact table and loading the ratio NM_Incidencia which is the number of a bug that is NOT linked to any dimensions because it is a ratio of fact table.

    Please look all the 2 screens attached -> query transformation with "where" and "from" Are they correct ? Just I need to make a lookup_ext function to validate the foreign key data of fact table with primary key of dimension table True ?

    Please look :

    Dimension catalog Priority.JPG

    query transformation mapping to dimension catalog .JPG

    Are they well done ?

    Thanks in advance

    Best Regards

    Antonio


    Share
    10 |10000 characters needed characters left characters exceeded
    Soumyadeb Halder Feb 17, 2017 at 10:41 AM
    0

    Hello Antonio,

    I do not feel that, the way you are following to load the Fact table is correct.

    Can you please tell me, from where your Fact data is coming? I mean, what is source of the Fact data?

    I understand your model, here 'HE_Incidencia' is your Fact table and rest of the tables are dimensions. Joining dimension tables is not required while loading data into Fact table. But joining is required while creating a report from this data model, then you may join Fact and Dimension table on that basis of Foreign and Primary Key respectively to pull the details of the records.

    Share
    10 |10000 characters needed characters left characters exceeded
    Antonio Astudillo Feb 17, 2017 at 11:19 AM
    0

    Hello Halder and everyone,

    nm-incidencia.jpg

    MTIS_DW__DIM_NAME ->DIMENSION

    MTIS_DW_HE_INCIDENCIA-> FACT TABLE

    THE STAR MODEL IS STAR WITH SNOWFLAKE DIMENSIONS

    The fact data is coming from operational source as ratio or metric NM_Incidencia which is the number of a bug that is NOT linked to any dimensions because it is a ratio of the fact table but fact data is also coming from dimensions (mapping)

    Please look :

    Dimension catalog Priority.JPG

    query transformation mapping to dimension catalog .JPG

    You wrote :

    "Loading data into Fact table, does not mean take data from Dimension tables and load. But it mean, loading data by taking reference from Dimension table."

    How do you take reference from Dimension table ? -> Just doing a mapping to dimension catalog or using lookup_ext function ???

    Please look :

    SQL transformation from operational source 2.JPG and nm_incidencia.jpg You can see that id ->NM_INCIDENCIA is loaded from operational source

    SQL TRANSFORMATION (please look data flow fact load.JPG)

    select id,summary,cast(DATE_FORMAT(FROM_UNIXTIME(date_submitted), '%Y-%m-%d 00:00:00') AS DATETIME),cast(DATE_FORMAT(FROM_UNIXTIME(last_updated), '%Y-%m-%d 00:00:00') AS DATETIME )
    from mantis_bug_table (MANTIS_BUG_TABLE IS A TABLE FROM OPERATIONAL SOURCE)

    Please look mantis.JPG attached

    Thanks in advance

    Kind Regards

    Antonio


    nm-incidencia.jpg (235.5 kB)
    Share
    10 |10000 characters needed characters left characters exceeded
    Soumyadeb Halder Feb 17, 2017 at 12:14 PM
    0

    Hello Antonio,

    Does your SQL table 'mantis_bug_table' and 'mantis_bug_history_table' provide you all the details of each bug_id? I mean Severidad,Prioridad,Estado details??

    I can see that, you have only fetched two fields, Id and summary along with two datetime fields. If bug history or summary provide you Severidad,Prioridad,Estado details then you can just use a lookup_ext function and pull the dimension id and do the mapping in Fact table.

    For example, if your bug history or summary provides your Prioridad description as 'NINGUNA' then you can use lookup_ext function on 'MTIS_DW_DM_PRIORIDAD' on column 'DS_PRIORIDAD' and pull the 'ID_PRIODAD' and do the mapping into column 'ID_PRIORIDAD' of fact table 'MTIS_DW_HE_INCIDENCIA'.

    Can you please share a sample record of 'mantis_bug_table' and 'mantis_bug_history_table' and 'mantis_custom_field_string_table'?

    Share
    10 |10000 characters needed characters left characters exceeded
    Antonio Astudillo Feb 17, 2017 at 01:14 PM
    0

    prioridad.jpg

    prioridad-description.jpg

    mantis-bug-table-ok.jpg

    Hello Halder and everyone ,

    I answer below .... in italic

    Please let me know your solution , continue helping me !!! Thanks

    Hello Antonio,

    Does your SQL table 'mantis_bug_table' and 'mantis_bug_history_table' provide you all the details of each bug_id? I mean Severidad,Prioridad,Estado details??

    Yes it does

    Please look screen shot:

    prioridad.JPG

    prioridad_description.JPG

    MANTIS_BUG_TABLE_OK.JPG


    I can see that, you have only fetched two fields, Id and summary along with two datetime fields. If bug history or summary provide you Severidad,Prioridad,Estado details then you can just use a lookup_ext function and pull the dimension id and do the mapping in Fact table.

    For example, if your bug history or summary provides your Prioridad description as 'NINGUNA' then you can use lookup_ext function on 'MTIS_DW_DM_PRIORIDAD' on column 'DS_PRIORIDAD' and pull the 'ID_PRIODAD' and do the mapping into column 'ID_PRIORIDAD' of fact table 'MTIS_DW_HE_INCIDENCIA'.

    We can do what you meant above using mantis_bug_table , this table provides id of priority example -> 30 . The description "NORMAL" came from dimension MTIS_DW__DIM PRIORITY


    MANTIS_BUG_TABLE provides "id" of Prioridad (field ->priority , example ->30) and the description is provided by MTIS_DW__DIM PRIORITY

    Can we continue doing what you write below ? :

    "you can use lookup_ext function on 'MTIS_DW_DM_PRIORIDAD' on column 'DS_PRIORIDAD' and pull the 'ID_PRIORIDAD' and do the mapping into column 'ID_PRIORIDAD' of fact table 'MTIS_DW_HE_INCIDENCIA'."

    How can we use lookup_ext function for this case ?

    Does we need to map PRIORITY (FACT TABLE FK) with ID_PRIORIDAD (MTIS_DW__DIM_PRIORIDAD PK) ?

    Please could you look query transformation mapping to dimension catalog .JPG attached

    Can you please share a sample record of 'mantis_bug_table' and 'mantis_bug_history_table' and 'mantis_custom_field_string_table'?


    select * from mantis_bug_table

    # id, project_id, reporter_id, handler_id, duplicate_id, priority, severity, reproducibility, status, resolution, projection, eta, bug_text_id, os, os_build, platform, version, fixed_in_version, build, profile_id, view_state, summary, sponsorship_total, sticky, target_version, category_id, date_submitted, due_date, last_updated
    '1', '1', '2', '9', '0', '30', '50', '70', '90', '20', '10', '10', '1', '', '', '', '', '', '', '0', '10', 'Modificar manual de usuario servicios web', '0', '0', '', '434', '1254131130', '1', '1442928300'
    '2', '1', '2', '16', '0', '30', '50', '70', '90', '20', '10', '10', '2', '', '', '', '', '', '', '0', '10', 'Pone el texto \"Bad Credentials\" al logarse incorrectamente', '0', '0', '', '26', '1254131235', '1', '1277900489'
    '3', '1', '2', '20', '0', '20', '50', '70', '90', '20', '10', '10', '3', '', '', '', '', '1.1.2', '', '0', '10', 'Crear taglib para componentes jsf de documentum', '0', '0', '', '435', '1254131458', '1', '1442929023'
    '4', '1', '2', '2', '0', '20', '50', '70', '90', '20', '10', '10', '4', '', '', '', '', '', '', '0', '10', 'Generación automática de documentación de los tld\'s', '0', '0', '', '434', '1254132238', '1', '1442928369'


    Thanks in advance

    Best Regards

    Antonio


    Share
    10 |10000 characters needed characters left characters exceeded
    Antonio Astudillo Feb 20, 2017 at 09:58 AM
    0

    Hello Halder and everyone,

    Does everybody know the answers to my questions ?

    Thanks in advance.

    Best Regards

    Antonio

    Share
    10 |10000 characters needed characters left characters exceeded
    Soumyadeb Halder Feb 21, 2017 at 12:44 PM
    0

    Hello Antonio

    Select all the required fields like (ID_Incidencia, ID_Categoria, ID_Estados, ID_Prioridad, ID_Severidad etc.) from mantis_bug_table.

    Remove all the dimension tables from Join Condition.

    Keep the SQL table selections as earlier and keep the join condition as earlier. only below.

    SQL_bug_table.id = SQL_mantis_bug_history_table.bug_id and SQL_bug_table.id = mantis_custom_field_string_table.bug_id

    Your Dataflow staructre should be like 'DataFlow_Structure_To_Be_1.jpg'.

    Please check the two picture of 'DataFlow_Structure_To_Be_2.jpg' to know the Lookup_Ext and 'DataFlow_Structure_To_Be_3.jpg' to know Validation.

    Your target table HE_Incidencia, column analysis: -----------As per my view--------------------------------------------

    ID_Incidencia: This will be ID from mantis_bug_table.

    ID_Categoria: This will be category_id from mantis_bug_table. This category_id needs to be used in lookup_ext function.

    ID_Estados: This will be status from mantis_bug_table. This status needs to be used in lookup_ext function

    ID_Prioridad: This will be priority from mantis_bug_table. This priority needs to be used in lookup_ext function for table DM_PRIORIDAD for ID_PRIORIDAD. If matched then write the data into Target else ingore the row.

    ID_Severidad: This will be Severity from mantis_bug_table. This Severity needs to be used in lookup_ext function

    ID_Usuario: This will be reporter_id from mantis_bug_table. This reporter_id needs to be used in lookup_ext function FC_Fecha_Actualizada(ID_Tiempo): ....

    FC_Fecha_Modificada(ID_Tiempo): ....

    FC_Fecha_Grabacion_Incidencia(ID_Tiempo): ....

    DS_Resumen: You already mapped this column. CD_Modulo_Tecnico: Do Mapping as per your requirement.

    NM_Incidencia: You already mapped this column as BUG_ID from mantis_bug_table. I am not sure about this column.

    Please let me know if this is your answer else, may be I am not getting your question.


    Share
    10 |10000 characters needed characters left characters exceeded
    Antonio Astudillo Feb 28, 2017 at 12:06 PM
    0

    Hello Halder,

    Thank you very much for your answer.

    I was very busy with others proyects.

    Let me ask you please .... You wrote:

    "ID_Prioridad: This will be priority from mantis_bug_table. This priority needs to be used in lookup_ext function for table DM_PRIORIDAD for ID_PRIORIDAD. If matched then write the data into Target else ingore the row."

    I do not understand very well using lookup ext function how do you match "priority" from BUG_TABLE with "ID_ PRIORITY" from Priority Dimension TABLE ?

    Please look image red marked ..........Does the match exists because of this comparison of equality ID_PRIORIDAD = BUG_TABLE.priority ?

    Halder , You wrote :

    Remove all Dimensions from Data Flow but what happen with Time Dimensión ?

    I do not understand in your Data Flow How does Mantis_Bug_Table connect with Query transformation? You have deleted the SQL transformation True ?

    Thanks in advance.

    Best Regards

    Antonio


    Share
    10 |10000 characters needed characters left characters exceeded