Former Member

Reading multiple record formats from an XML file

I am trying to use the toolkit_file_xmllist_input adapter, specifically its xmllistMatchStreamName property. Here is what the manual says about it:

Property ID: matchStreamName (note the manual is missing the "xmllist" prefix)

Type: boolean

(Optional) If set to true, the XML element names are matched against the stream name. The adapter discards messages with unmatched values.

I take this to mean that if the XML file contains several kinds of elements, the adapter will distribute their contents to the input streams whose names match the element names (if such streams exist).

I can't get this to work. Also, when I try to perform discovery on this adapter, I get an "Unable to perform discovery on adapter" error.

Can someone please help?  See the test project and data below.  I would like the adapter to route data to the Employee and Address streams.

The main question is: can I separate different record types from a single XML file with this or another adapter?

Thanks in advance,

Dan

ESP CODE:

CREATE SCHEMA S_Employee (
  "type" string,
  "Name" string,
  Id string,
  Age string
);

CREATE SCHEMA S_Address (
  Id string,
  Street string,
  City string,
  State string,
  ZIP string
);

CREATE INPUT STREAM NEWSTREAM SCHEMA (Column1 INTEGER);

CREATE INPUT STREAM Employees SCHEMA S_Employee;

CREATE INPUT STREAM Addresses SCHEMA S_Address;

ATTACH INPUT ADAPTER employees TYPE toolkit_file_xmllist_input to NEWSTREAM
PROPERTIES
  dir = 'C:/Users/tarnower/Documents/SybaseESP/5.1/workspace/xmltest' ,
  file = 'employees.xml',
  xmllistMatchStreamName = TRUE
;

DATA:

<?xml version='1.0' encoding='utf-8'?>
<Personnel>
  <Employees>
    <Employee type="permanent">
      <Name>Seagull</Name>
      <Id>3674</Id>
      <Age>34</Age>
    </Employee>
    <Employee type="contract">
      <Name>Robin</Name>
      <Id>3675</Id>
      <Age>25</Age>
    </Employee>
    <Employee type="permanent">
      <Name>Crow</Name>
      <Id>3676</Id>
      <Age>28</Age>
    </Employee>
  </Employees>
  <Addresses>
    <Address>
      <Id>3674</Id>
      <Street>123 Pine</Street>
      <City>Anytown</City>
      <State>XX</State>
      <ZIP>00000</Zip>
    </Address>
    <Address>
      <Id>3675</Id>
      <Street>2323 Oak</Street>
      <City>Anytown</City>
      <State>XX</State>
      <ZIP>00000</ZIP>
    </Address>
    <Address>
      <Id>3676</Id>
      <Street>999 Maple</Street>
      <City>Anytown</City>
      <State>XX</State>
      <ZIP>00000</ZIP>
    </Address>
  </Addresses>
</Personnel>


1 Answer

  • Nov 04, 2015 at 03:55 PM

    Hi Dan,

    A couple of things:

    1. Change the adapter type: for this XML file format, use the File/Hadoop XML Input Adapter (type toolkit_file_xmldoc_input).
    2. Correct the data itself: the "<ZIP>" element must be closed with "</ZIP>", not "</Zip>" (XML element names are case-sensitive).
    3. Publish the data to the two streams using two separate adapters, one per stream.
    4. See below for the corrected CCL and data.
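
The tag mismatch in point 2 is a well-formedness error, so a conforming XML parser rejects the whole file before any rows can be mapped. A minimal sketch in Python (outside ESP, using only the standard library; the `is_well_formed` helper is a hypothetical name chosen for illustration) that reproduces the problem on one fragment of the data:

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)

# Fragment from the original data: <ZIP> is closed by </Zip>.
bad = "<Address><Id>3674</Id><ZIP>00000</Zip></Address>"
# The same fragment with the closing tag corrected.
good = "<Address><Id>3674</Id><ZIP>00000</ZIP></Address>"

bad_ok, bad_error = is_well_formed(bad)
good_ok, good_error = is_well_formed(good)

print("original fragment parses: ", bad_ok, "-", bad_error)
print("corrected fragment parses:", good_ok)
```

Running a check like this on employees.xml before attaching an adapter can help separate data problems from adapter configuration problems.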

    Thanks,

    Alice

    CCL:

    CREATE SCHEMA S_Employee (
      "type" string,
      "Name" string,
      Id string,
      Age string
    );

    CREATE SCHEMA S_Address (
      Id string,
      Street string,
      City string,
      State string,
      ZIP string
    );

    CREATE INPUT STREAM Employees SCHEMA S_Employee;

    CREATE INPUT STREAM Addresses SCHEMA S_Address;

    ATTACH INPUT ADAPTER employees TYPE toolkit_file_xmldoc_input to Employees
    PROPERTIES
      dir = 'C:/esp/xmltest/data' ,
      file = 'employees.xml' ,
      xmlElemMappingRowPattern = '/Personnel/Employees/Employee' ,
      espColumnPattern = '/Employee/@type,Name,Id,Age' ;

    ATTACH INPUT ADAPTER addresses TYPE toolkit_file_xmldoc_input to Addresses
    PROPERTIES
      dir = 'C:/esp/xmltest/data' ,
      file = 'employees.xml' ,
      xmlElemMappingRowPattern = '/Personnel/Addresses/Address' ,
      espColumnPattern = 'Id,Street,City,State,ZIP';

    DATA:

    <?xml version='1.0' encoding='utf-8'?>
    <Personnel>
      <Employees>
        <Employee type="permanent">
          <Name>Seagull</Name>
          <Id>3674</Id>
          <Age>34</Age>
        </Employee>
        <Employee type="contract">
          <Name>Robin</Name>
          <Id>3675</Id>
          <Age>25</Age>
        </Employee>
        <Employee type="permanent">
          <Name>Crow</Name>
          <Id>3676</Id>
          <Age>28</Age>
        </Employee>
      </Employees>
      <Addresses>
        <Address>
          <Id>3674</Id>
          <Street>123 Pine</Street>
          <City>Anytown</City>
          <State>XX</State>
          <ZIP>00000</ZIP>
        </Address>
        <Address>
          <Id>3675</Id>
          <Street>2323 Oak</Street>
          <City>Anytown</City>
          <State>XX</State>
          <ZIP>00000</ZIP>
        </Address>
        <Address>
          <Id>3676</Id>
          <Street>999 Maple</Street>
          <City>Anytown</City>
          <State>XX</State>
          <ZIP>00000</ZIP>
        </Address>
      </Addresses>
    </Personnel>
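
To make the row-pattern and column-pattern idea concrete, here is a rough sketch in Python of how one XPath-like row pattern per record type can split a single document into two record sets. It is only an illustration of the mapping, not the adapter's implementation: `extract_rows` is a hypothetical helper, the data is trimmed to a few records, and because ElementTree paths are relative to the root element, '/Personnel/Employees/Employee' is written as './Employees/Employee'.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the corrected data (fewer records than the full file).
XML = """<Personnel>
  <Employees>
    <Employee type="permanent"><Name>Seagull</Name><Id>3674</Id><Age>34</Age></Employee>
    <Employee type="contract"><Name>Robin</Name><Id>3675</Id><Age>25</Age></Employee>
  </Employees>
  <Addresses>
    <Address><Id>3674</Id><Street>123 Pine</Street><City>Anytown</City><State>XX</State><ZIP>00000</ZIP></Address>
  </Addresses>
</Personnel>"""

def extract_rows(root, row_path, columns):
    """Yield one tuple per element matching row_path.

    A column starting with '@' is read from an attribute of the row element;
    any other column is read from the text of the child element of that name.
    """
    for row in root.findall(row_path):
        values = []
        for col in columns:
            if col.startswith("@"):
                values.append(row.get(col[1:]))
            else:
                values.append(row.findtext(col))
        yield tuple(values)

root = ET.fromstring(XML)  # root is the <Personnel> element

# One row pattern per target stream, mirroring the two adapters above.
employees = list(extract_rows(root, "./Employees/Employee", ["@type", "Name", "Id", "Age"]))
addresses = list(extract_rows(root, "./Addresses/Address", ["Id", "Street", "City", "State", "ZIP"]))

print(employees)
print(addresses)
```

Each call plays the role of one adapter instance: the same file is read twice, and each pass keeps only the elements that match its own row pattern.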


    • Former Member

      Yes. In the current implementation, if the XML element mapped to a column doesn't occur in the document, the entire row is ignored.

      However, your use case makes sense. Maybe we can add an option to the XPath statement for a column that lets the user choose among 'default value', 'null', or 'ignored'.
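
As a rough illustration of the three choices mentioned in this comment, the Python sketch below applies each policy to a row that lacks one mapped element. Everything here is hypothetical: the `read_row` function and its `on_missing` and `default` parameters are invented for this example and are not adapter properties; only the option names are taken from the comment.

```python
import xml.etree.ElementTree as ET

def read_row(row, columns, on_missing="ignored", default=""):
    """Return the column values for one row element, or None if the row is dropped.

    on_missing (hypothetical option):
      'ignored'       - drop the whole row (the behavior described for the current adapter)
      'null'          - keep the row and store None for the missing column
      'default value' - keep the row and store `default` for the missing column
    """
    values = []
    for col in columns:
        elem = row.find(col)
        if elem is not None:
            values.append(elem.text)
        elif on_missing == "ignored":
            return None
        elif on_missing == "null":
            values.append(None)
        elif on_missing == "default value":
            values.append(default)
        else:
            raise ValueError("unknown on_missing policy: " + on_missing)
    return values

# An Employee record with no <Age> element.
row = ET.fromstring("<Employee><Name>Crow</Name><Id>3676</Id></Employee>")
cols = ["Name", "Id", "Age"]

ignored = read_row(row, cols)
as_null = read_row(row, cols, on_missing="null")
as_default = read_row(row, cols, on_missing="default value", default="0")

print(ignored)
print(as_null)
print(as_default)
```

With 'ignored' the record disappears entirely, which is why a single optional element can silently remove rows from the stream; the other two policies keep the row and make the gap visible downstream.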