Data Services XI 3.0 (188.8.131.52)
Just started using DS and thought I was doing something wrong, but it now appears to me to be a bug.
Background: The intention is to standardise the data coming from different sources so we want to generate a single Unique Reference Number (URN) from each source. Some sources have already got a single column unique reference and others have composite columns.
I create a data flow (let's call it dataflow_a) which essentially has a Flat File source, a query to read that flat file and the output goes to a template table (stored on Oracle 10g database) (let's call it output_a) which has "Make Port" checked as I require the data flow to be an embedded source in another data flow. The query does a little manipulation of the incoming flat data so that it can be read from the file (i.e. it needs rtrimming and there are some dates without times etc.) and the output columns are a mixture of "int", "varchar" and "datetime". The data in this has a composite key to identify unique records and that is made up of 4 columns... ref_no (int), ref_year (int), region (varchar), seq_no (int)
If this dataflow is executed (batch job) then the data is read into the template table ok. So far so good.
flatfile - - - - query - - - - output_a (port)
Now, to generate the single URN, dataflow_a is used as a source in another dataflow (let's call it dataflow_b). A query is attached to the output of dataflow_a and the columns mapped directly from the left side to the right side.
In addition, within the query, we add an Output column called "urn" for which we include a mapping of:
(I've split this onto seperate lines at the concatenation points to make it easier to read on the forum)
ltrim_blanks(to_char(dataflow_a.ref_no,'099999'))|| ltrim_blanks(to_char(dataflow_a.ref_year,'09'))|| dataflow_a.region|| ltrim_blanks(to_char(dataflow_a.seq_no,'09'))
The output is captured in another template table (output_b) on the same Oracle database.
dataflow_a - - - - query - - - - output_b
However, when I execute this job now, all the data is populated correctly apart from the URN column which shows...
for each row, with only the region changing with the data. The ref_no, ref_year and seq_no only show the format mask rather than the actual data.
Ok, so at first I thought I wasn't using to_char correctly, so I checked in the manual and it was identical to the examples given, it's documented that it works with "int" types and I also tested with some hardcoded numbers rather than the data columns just to check.
Now here's where it get's odd.
Within dataflow_b, if I put a second query pulling data from dataflow_a ...
dataflow_a - - - - query - - - - output_b \ \- - - - query - - - - output_c
... and I execute this job, then the output_b data suddently works correctly and I get the ref_no, ref_yr, region and seq_no concatenated correctly within it. If I then remove that second query and output_c from it and execute the job again, output_b reverts to showing the format masks rather than the data again.
Within the query for dataflow_b, if I change the mapping for the URN to read...
ltrim_blanks(to_char(dataflow_a.ref_no*1,'099999'))|| ltrim_blanks(to_char(dataflow_a.ref_year*1,'09'))|| dataflow_a.region|| ltrim_blanks(to_char(dataflow_a.seq_no*1,'09'))
ltrim_blanks(to_char(to_decimal(dataflow_a.ref_no,'.',',',1),'099999'))|| ltrim_blanks(to_char(to_decimal(dataflow_a.ref_year,'.',',',1),'09'))|| dataflow_a.region|| ltrim_blanks(to_char(to_decimal(dataflow_a.seq_no,'.',',',1),'09'))
then this also works ok.
So it would seem that there is an issue within the to_char function for it recognising the datatype of the source columns (which are int) even though the documentation says that to_char is happy to accept int. However that still doesn't explain why adding a second query makes it work.
Is it something I'm doing? Is is something anyone else can reproduce? Is it a known bug?
Your input is appreciated.
Edited by: Giles Cartmel on Sep 8, 2008 3:01 PM
Changed title to something a little more appropriate