Reading multiple record formats from an xml file

Former Member

I am trying to use the toolkit_file_xmllist_input adapter, and specifically the xmllistMatchStreamName property.  Here's what the manual says about it:

Property ID: matchStreamName (note the manual is missing the "xmllist" prefix)

Type: boolean

(Optional) If set to true, the XML element names are matched against the stream name. The adapter discards messages with unmatched values.

I take this to mean that if the XML file contains several kinds of elements, this adapter will distribute their contents to the input streams that have the same name as each element (if those streams exist).

I can't get this to work.  Also, when I try to perform discovery on this adapter, I get an "Unable to perform discovery on adapter" error.

Can someone please help?  See the test project and data below.  I would like the adapter to route data to the Employee and Address streams.

The main question here is: can I separate different record types from a single XML file with this or another adapter?

Thanks in advance,

Dan

ESP CODE:

CREATE SCHEMA S_Employee (

  "type" string,

  "Name" string,

  Id string,

  Age string

);

CREATE SCHEMA S_Address (

  Id string,

  Street string,

  City string,

  State string,

  ZIP string

);

CREATE INPUT STREAM NEWSTREAM SCHEMA (Column1 INTEGER);

CREATE INPUT STREAM Employees SCHEMA S_Employee;

CREATE INPUT STREAM Addresses SCHEMA S_Address;

ATTACH INPUT ADAPTER employees TYPE toolkit_file_xmllist_input to NEWSTREAM

PROPERTIES

  dir = 'C:/Users/tarnower/Documents/SybaseESP/5.1/workspace/xmltest' ,

  file = 'employees.xml',

  xmllistMatchStreamName = TRUE

;

DATA:

<?xml version='1.0' encoding='utf-8'?>

<Personnel>

<Employees>

<Employee type="permanent">

<Name>Seagull</Name>

<Id>3674</Id>

<Age>34</Age>

</Employee>

<Employee type="contract">

<Name>Robin</Name>

<Id>3675</Id>

<Age>25</Age>

</Employee>

<Employee type="permanent">

<Name>Crow</Name>

<Id>3676</Id>

<Age>28</Age>

</Employee>

</Employees>

<Addresses>

<Address>

<Id>3674</Id>

<Street>123 Pine</Street>

<City>Anytown</City>

<State>XX</State>

<ZIP>00000</Zip>

</Address>

<Address>

<Id>3675</Id>

<Street>2323 Oak</Street>

<City>Anytown</City>

<State>XX</State>

<ZIP>00000</Zip>

</Address>

<Address>

<Id>3676</Id>

<Street>999 Maple</Street>

<City>Anytown</City>

<State>XX</State>

<ZIP>00000</Zip>

</Address>

</Addresses>

</Personnel>

Accepted Solutions (0)

Answers (1)

former_member217348
Participant

Hi Dan,

A couple of things:

  1. Modify the adapter type - for this xml file format, use the File/Hadoop XML Input Adapter (type toolkit_file_xmldoc_input).
  2. Correct the data itself - "<ZIP>" must terminate with "</ZIP>", instead of "</Zip>".
  3. Publish the data to the two streams using two separate adapters.
  4. See below for the CCL and Data.
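
The mismatched tag in point 2 can be caught before the file ever reaches the adapter. The following is a minimal sketch, not part of ESP: it uses Python's standard xml.etree.ElementTree parser, and the helper name check_well_formed is only illustrative.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses cleanly, else the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


# Shortened copies of one <Address> record: the first closes <ZIP> with </Zip>.
bad = "<Address><Id>3674</Id><ZIP>00000</Zip></Address>"
good = "<Address><Id>3674</Id><ZIP>00000</ZIP></Address>"

bad_result = check_well_formed(bad)
good_result = check_well_formed(good)

print("bad: ", bad_result)    # an error message naming the offending position
print("good:", good_result)   # None
```

XML element names are case-sensitive, so any conforming parser should reject the original file for the same reason.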

Thanks,

Alice

CCL:

CREATE SCHEMA S_Employee (

  "type" string,

  "Name" string,

  Id string,

  Age string

);

CREATE SCHEMA S_Address (

  Id string,

  Street string,

  City string,

  State string,

  ZIP string

);

CREATE INPUT STREAM Employees SCHEMA S_Employee;

CREATE INPUT STREAM Addresses SCHEMA S_Address;

ATTACH INPUT ADAPTER employees TYPE toolkit_file_xmldoc_input to Employees

PROPERTIES

  dir = 'C:/esp/xmltest/data' ,

  file = 'employees.xml' ,

  xmlElemMappingRowPattern = '/Personnel/Employees/Employee' ,

  espColumnPattern = '/Employee/@type,Name,Id,Age' ;

ATTACH INPUT ADAPTER addresses TYPE toolkit_file_xmldoc_input to Addresses

PROPERTIES

  dir = 'C:/esp/xmltest/data' ,

  file = 'employees.xml' ,

  xmlElemMappingRowPattern = '/Personnel/Addresses/Address' ,

  espColumnPattern = 'Id,Street,City,State,ZIP';
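
To see what each adapter is expected to pick out of the shared file, the two row/column patterns above can be imitated outside ESP. This is only a rough emulation with Python's xml.etree.ElementTree. It assumes a row pattern selects one element per row and that each column is either an attribute (leading @) or a child element's text. The helper extract_rows is hypothetical and is not how the adapter is implemented.

```python
import xml.etree.ElementTree as ET

# Shortened version of employees.xml from this thread.
SAMPLE = """<Personnel>
  <Employees>
    <Employee type="permanent"><Name>Seagull</Name><Id>3674</Id><Age>34</Age></Employee>
    <Employee type="contract"><Name>Robin</Name><Id>3675</Id><Age>25</Age></Employee>
  </Employees>
  <Addresses>
    <Address><Id>3674</Id><Street>123 Pine</Street><City>Anytown</City><State>XX</State><ZIP>00000</ZIP></Address>
  </Addresses>
</Personnel>"""


def extract_rows(root, row_path, columns):
    """Collect one tuple per element matching row_path.

    A column starting with '@' is read from an attribute; any other
    column is read from the text of the child element with that name.
    """
    rows = []
    for elem in root.findall(row_path):
        row = []
        for col in columns:
            if col.startswith("@"):
                row.append(elem.get(col[1:]))
            else:
                row.append(elem.findtext(col))
        rows.append(tuple(row))
    return rows


root = ET.fromstring(SAMPLE)  # root is the <Personnel> element
employees = extract_rows(root, "Employees/Employee", ["@type", "Name", "Id", "Age"])
addresses = extract_rows(root, "Addresses/Address", ["Id", "Street", "City", "State", "ZIP"])

print(employees)
print(addresses)
```

Each call corresponds to one adapter instance: the same file is read twice, once per row pattern, which is why two ATTACH statements are needed.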

DATA:

<?xml version='1.0' encoding='utf-8'?>

<Personnel>

<Employees>

<Employee type="permanent">

<Name>Seagull</Name>

<Id>3674</Id>

<Age>34</Age>

</Employee>

<Employee type="contract">

<Name>Robin</Name>

<Id>3675</Id>

<Age>25</Age>

</Employee>

<Employee type="permanent">

<Name>Crow</Name>

<Id>3676</Id>

<Age>28</Age>

</Employee>

</Employees>

<Addresses>

<Address>

<Id>3674</Id>

<Street>123 Pine</Street>

<City>Anytown</City>

<State>XX</State>

<ZIP>00000</ZIP>

</Address>

<Address>

<Id>3675</Id>

<Street>2323 Oak</Street>

<City>Anytown</City>

<State>XX</State>

<ZIP>00000</ZIP>

</Address>

<Address>

<Id>3676</Id>

<Street>999 Maple</Street>

<City>Anytown</City>

<State>XX</State>

<ZIP>00000</ZIP>

</Address>

</Addresses>

</Personnel>

Former Member

Hi Alice,

This is close.  But there's a problem when one of the elements specified in the XPath expression doesn't occur in the data.  For example, if there is an element that looks like:

<Address>

<Id>3676</Id>

<City>Anytown</City>

<State>XX</State>

<ZIP>00000</ZIP>

</Address>

(i.e., no <Street> element), then the adapter skips the whole <Address> element.  I need the adapter to generate the row in the stream, with the Street column null or empty.

The actual file I want to use doesn't include tags for empty values, so if I create a schema that has all the values, I doubt that it will ever read in a single row.

Is there any way to get that to work?

Thanks.

Dan

Former Member

Yes. In the current implementation, if the XML element mapped to a column doesn't occur in the document, the entire row is ignored.

However, your use case makes sense. Maybe we can add an option to the XPath statement for a column that lets the user choose among 'default value', 'null', or 'ignored'.
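
Until such an option exists, one possible workaround is to pre-process the file so that every row element carries all mapped children. The sketch below uses Python's xml.etree.ElementTree and runs outside ESP. The helper pad_missing_children is hypothetical, and whether the adapter then emits the row with a null or empty Street column is an untested assumption that would need to be checked against the installed version.

```python
import xml.etree.ElementTree as ET

# Columns the Addresses stream expects, taken from the S_Address schema.
ADDRESS_COLUMNS = ["Id", "Street", "City", "State", "ZIP"]


def pad_missing_children(root, row_path, columns):
    """Append an empty child element for every column a row element lacks.

    Returns the number of elements that were added.
    """
    added = 0
    for elem in root.findall(row_path):
        for col in columns:
            if elem.find(col) is None:
                ET.SubElement(elem, col)  # empty element, no text
                added += 1
    return added


# The <Address> record from the question, with no <Street> element.
SAMPLE = """<Personnel><Addresses>
<Address><Id>3676</Id><City>Anytown</City><State>XX</State><ZIP>00000</ZIP></Address>
</Addresses></Personnel>"""

root = ET.fromstring(SAMPLE)
added = pad_missing_children(root, "Addresses/Address", ADDRESS_COLUMNS)
padded = ET.tostring(root, encoding="unicode")

print(added)
print(padded)
```

The padded elements are appended after the existing children, so the element order within a row changes. That should not matter if columns are matched by name, but this is also an assumption.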