Reading multiple records from an XML doc (redux)

Former Member
0 Kudos

A little over a year ago, I asked this question:

https://archive.sap.com/discussions/thread/3818944

Essentially, how do I read (or flatten) an XML document that has multiple tags into one or more input streams? I realize that the adapter-to-stream mechanism doesn't permit one adapter to send input to multiple streams, so I'm at a loss as to how the XML-based adapters are meant to be used. There's not much information in the docs and no usable sample code.

Say for example, an XML doc that looks like:

<?xml version="1.0" encoding="utf-8"?>
<InputRequest>
    <Protocol>
        <IP/>
        <Port/>
    </Protocol>
    <MessageDetails>
        <Length>0123</Length>
        <header>HEAD123</header>
        <routing></routing>
    </MessageDetails>
    <ResponseDetails>
        <Length>1234</Length>
        <header>HEAD456</header>
    </ResponseDetails>
           .
           .
           .
       Many more tags
</InputRequest>

How would I define an ESP schema to receive all the data? I've talked with the people who are sending this data, and they've agreed to include every tag that might possibly occur, even if it has no data (as shown above). Some child tag names will repeat when they're used in parent tags that have different names (e.g. Length and header above).

In the referenced question above, I was using the XML file adapter and the suggestion was to read the same file multiple times for different schemas. That really wasn't a workable solution. And for this project, I'm going to receive this data via the socket adapter, and it will only be sent once, so I have to get all the data in a single pass.

Please help!

Thanks,

Dan

Accepted Solutions (1)

JWootton
Advisor
0 Kudos

Why the reluctance to engage SAP customer support? Is the process that burdensome? Is it really significantly harder than posting in this forum?

Answers (4)

JWootton
Advisor
0 Kudos

Unless someone jumps in here to offer assistance, I suggest you open a support ticket for this.

Former Member
0 Kudos

Yeah I suppose.

On the bright side, the 3rd party has agreed to restructure their XML output to use only one tag and put all the data in attributes. So no custom adapter.

I'll do further tests with the socket adapter to see if the defects listed above are troublesome enough to open a ticket.

Thanks again Jeff and Robert for your time and expertise.

Former Member
0 Kudos

Hi again,

I ran some tests with the 3rd party's XML and it isn't too hard to flatten it, so maybe I can convince them to do that and avoid the custom adapter route.

I was using the xml file adapter and not the socket adapter so maybe it behaves differently, but I have some questions about how the adapter is behaving. I'm running ESP 5.1 SP12 PL03.

1. Every time I start the adapter, it sends a record that has nulls in all the columns. If this is the same in the socket adapter, it's likely there will be only one record per xml doc, effectively doubling the input data. Is this a bug? Should I report it in an incident?

2. It seems the data can only be specified in attributes. Nested tags don't work. (look at data where ID=4)

3. A properly defined XML tag (I think) is not being sent to ESP (look at data where ID=3). Note the closing bracket following the attributes.

Thanks again for the suggestions and help. It's really helped clear my mind about how to proceed.

ESP Code:

CREATE SCHEMA S_xmlIn (
    ID string,
    Port string,
    Length string,
    header string
);

CREATE INPUT STREAM xmlIn SCHEMA S_xmlIn;

ATTACH INPUT ADAPTER XML_IN TYPE toolkit_file_xmllist_input to xmlIn
PROPERTIES 
	dir = 'C:/data' , 
	file = 'tx1Attr.xml'
;

tx1Attr.xml

<?xml version="1.0" encoding="US-ASCII"?>
<ThisWorks ID="1" Port="333" Length="0439" header="00121" />
<ThisWorks ID="2" Port="444" Length="0439" header="00121" </ThisWorks>
<ThisWorks ID="5" Port="444" header="00121" </ThisWorks>
<ThisDoesntWork ID="3" Port="555" Length="0439" header="00121"></ThisDoesntWork>
<ThisDoesntWork> <ID>4</ID> <Port>666</Port> <Length>0439</Length> <header>00121</header> </ThisDoesntWork>
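
As a cross-check on item 3, here is a minimal sketch in Python (not part of the ESP toolkit) that feeds each data line from tx1Attr.xml to the standard-library XML parser and reports which ones are well-formed on their own. It says nothing about how the ESP adapter itself parses the file; `LINES` and `well_formed_ids` are names made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The five data lines from tx1Attr.xml, copied verbatim (XML declaration omitted).
LINES = [
    '<ThisWorks ID="1" Port="333" Length="0439" header="00121" />',
    '<ThisWorks ID="2" Port="444" Length="0439" header="00121" </ThisWorks>',
    '<ThisWorks ID="5" Port="444" header="00121" </ThisWorks>',
    '<ThisDoesntWork ID="3" Port="555" Length="0439" header="00121"></ThisDoesntWork>',
    '<ThisDoesntWork> <ID>4</ID> <Port>666</Port> <Length>0439</Length> <header>00121</header> </ThisDoesntWork>',
]


def well_formed_ids(lines):
    """Return the ID of every line that parses as a well-formed XML element."""
    ok = []
    for line in lines:
        try:
            elem = ET.fromstring(line)
        except ET.ParseError:
            # Not well-formed according to the standard-library parser.
            continue
        # The ID is either an attribute (IDs 1 and 3) or a child element (ID 4).
        ok.append(elem.get("ID") or elem.findtext("ID"))
    return ok
```

With this parser, only the lines with ID 1, 3, and 4 are accepted. The ID=2 and ID=5 lines are rejected because the start tag is never closed with a bracket before the end tag begins.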


RobertWaywell
Product and Topic Expert
0 Kudos

Hi Dan,

It sounds like you have a decent understanding of the functionality of the 'out-of-the-box' adapters and that they do not fit the requirements of the application you are currently working on. In that case, the recommended approach is to build a custom adapter using your choice of the available SDKs (C/C++, C# .NET, Java). With a custom adapter you will be able to implement the XML parsing logic that you require and then connect and publish to as many separate streams within the ESP project as needed.

Former Member
0 Kudos

Hi Robert,

Thanks for your (and Jeff's) response. I was hoping this wouldn't be the solution (see my reply to Jeff). I am sort of dreading the custom adapter route. But I've never built one before. Maybe it's not as bad as I'm fearing. Again, I need to go both input and output. I will look at the sample code for custom adapters. Is there any hope that source code for a socket based adapter is available? Is it hard to add GD support? Failover? Auto Restart? etc?

JWootton
Advisor
0 Kudos

It doesn't sound like any of the pre-configured adapters will do what you need but you have several options here...

Let's start with the challenge of repeating elements. This is not so much an adapter question as a CCL schema question. As you know, CCL requires a fixed schema for each input stream/window, i.e. a fixed number of columns. So in a situation where the number of columns in each "event" will vary, you have a few choices:

1. Define an input stream with the maximum number of columns any event can have. You can input events that don't have all columns; missing columns will just be NULL.

2. Define multiple input streams, each with a different schema, and "route" each event to the appropriate input stream.

3. Pack multiple values in a single CCL string column, using a delimiter, and then parse the string as needed in the CCL.
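
To make choices 1 and 3 concrete, here is a minimal sketch in plain Python rather than CCL or any ESP API. The column list reuses the names from the test schema earlier in the thread, and the function names and the `|` delimiter are assumptions made up for this illustration.

```python
# Hypothetical fixed column list, borrowed from the S_xmlIn test schema.
COLUMNS = ["ID", "Port", "Length", "header"]


def to_fixed_row(event):
    """Choice 1: project an event onto a fixed column list.

    Columns the event does not carry come back as None (NULL).
    """
    return [event.get(col) for col in COLUMNS]


def pack_values(values, delimiter="|"):
    """Choice 3: pack a variable number of values into one string column.

    Assumes no value contains the delimiter itself.
    """
    return delimiter.join(values)


def unpack_values(packed, delimiter="|"):
    """Reverse of pack_values; an empty string unpacks to an empty list."""
    return packed.split(delimiter) if packed else []
```

The trade-off is that choice 1 keeps every value in its own typed column at the cost of a wide, mostly NULL schema, while choice 3 keeps the schema narrow but pushes the parsing work into the project.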

Now to the adapters: most of the pre-built adapters are based on the Adapter Framework. The Adapter Framework lets you combine transport modules, parsing modules, and ESP/SDS connector modules in different combinations to achieve the desired result. SDS and ESP ship with a set of modules and a pre-configured set of adapters. However, you can define many additional adapters without writing new custom modules, just by combining existing modules in different combinations.

Have a look at the guide for building custom adapters. Modules of interest:

- XMLDoc parser - lets you parse an XML doc and map XML elements to CCL rows/columns

- ESPMultiStreamPublisher, which lets a single adapter publish to multiple streams using filters to determine which events go to which input stream
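
The idea of filter-based routing to several input streams can be sketched in plain Python as follows. This is only a conceptual illustration, not the ESPMultiStreamPublisher configuration or API; the stream names, the `parent` key, and the functions are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]

# Hypothetical stream names paired with the filter that selects their events.
ROUTES: List[Tuple[str, Callable[[Event], bool]]] = [
    ("MessageDetailsIn", lambda e: e.get("parent") == "MessageDetails"),
    ("ResponseDetailsIn", lambda e: e.get("parent") == "ResponseDetails"),
]


def route(event: Event) -> List[str]:
    """Return the names of all streams whose filter accepts the event."""
    return [name for name, accepts in ROUTES if accepts(event)]


def publish_all(events: List[Event]) -> Dict[str, List[Event]]:
    """Distribute events to per-stream lists; unmatched events are dropped."""
    streams: Dict[str, List[Event]] = {name: [] for name, _ in ROUTES}
    for event in events:
        for name in route(event):
            streams[name].append(event)
    return streams
```

Each stream can then have its own schema, since only events that pass its filter ever reach it.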

In the end, it may not be possible to achieve what you want with existing modules, in which case you could consider writing a custom parsing module.

The docs for the adapters and adapter modules can be difficult to decipher (especially around these more complex modules), but there are examples included in the download packages. The location varies by package; it sounds like you are using ESP. Poke around a little: somewhere in the ESP installation directory you should find a folder called adapters, and if you drill down you'll find examples.

Former Member
0 Kudos

Hi Jeff,

I figure choice #1 is the most likely with the out-of-the-box adapters, if I can get the company that sends the XML to flatten it out to a single tag where all the data is passed as attributes. The duplicate names can be disambiguated by blending them with the name of the parent tag. That's a lot of work for them, but it reduces the functional tests needed for a custom adapter.
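
The flattening described here could look roughly like the following Python sketch. It assumes exactly two levels of nesting under the root, as in the example document from the question, and joins parent and child names with an underscore; `SAMPLE`, `flatten`, and `to_single_tag` are made-up names, and the sample omits the XML declaration and the elided tags.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<InputRequest>
    <Protocol><IP/><Port/></Protocol>
    <MessageDetails>
        <Length>0123</Length>
        <header>HEAD123</header>
        <routing></routing>
    </MessageDetails>
    <ResponseDetails>
        <Length>1234</Length>
        <header>HEAD456</header>
    </ResponseDetails>
</InputRequest>"""


def flatten(xml_text, sep="_"):
    """Map each child tag to a 'Parent_child' key; empty tags become None."""
    root = ET.fromstring(xml_text)
    row = {}
    for parent in root:
        for child in parent:
            text = (child.text or "").strip()
            row[f"{parent.tag}{sep}{child.tag}"] = text or None
    return row


def to_single_tag(row, tag="InputRequest"):
    """Render a flattened row as one tag with attributes, skipping None values."""
    elem = ET.Element(tag, {k: v for k, v in row.items() if v is not None})
    return ET.tostring(elem, encoding="unicode")
```

Prefixing with the parent name keeps `MessageDetails_Length` and `ResponseDetails_Length` apart, so the repeated child names no longer collide in a single schema.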

I don't understand choice #2. How would the Socket Event XML Adapter be able to route the event to multiple input streams? Even so, I don't know if I want to break an event apart because it will have to be reassembled later when the response is generated.

Same with choice #3. I would need some sort of preprocessing task between the 3rd party app and the adapter to pack the data together. I am trying to keep the number of processes as small as possible.

This isn't only an inbound issue either. The ESP project will generate a response based on pass/fail of a number of business rules and send it back to the XML adapter and then back to the 3rd party app.

It's starting to look like I'm going to have to build a custom adapter that flattens the inbound XML into a single ESP schema. I have written some Java apps that read/write XML, so I know what to expect, but I worry that it will end up being hardcoded to a specific XML format and tightly coupled to both the 3rd party app and the ESP project. Meaning new coding if data formats change.

Question: How are other users dealing with this? Most XML (and JSON) documents consist of different data structures combined together. It's an integral part of their purpose.