cancel
Showing results for 
Search instead for 
Did you mean: 

Creating Multiple XML target file

Former Member
0 Kudos

HI Experts,

I would require some guidance from you. I have a situation whereby, I have to create multiple XML target file based on records.

The job in my hand is to extract the customer master from ECC and load it into XML files. But the catch is I need to create one XML file per customer. ie if my source has 500 customer then i have to create 500 XML and each XML file should contain all the information for that customer.

Can anyone guide me how to split the target XML in a single job. i.e. One XML per record of source data.

Accepted Solutions (1)

Accepted Solutions (1)

former_member187605
Active Contributor
0 Kudos

You can build a data flow that processes one customer at a time. Include it in a loop to process all customers individually.

Not very performing, I am afraid, but it will work.

Former Member
0 Kudos

Hi Dirk,

I too understand that this procedure will have performance issue as there is 2 lakh records and will take ages for the job to get completed. There are few more doubts to clarify in this:

1) If source file has 2 lakh records, how can we take only one record from it and create an XML?

2) If we include the data flow in a loop, will it create different XML files or overwrite on the same XML.

3) As you suggested this will lead to performance issue, is there a way around for it. As creating a single XML for 2 lakh records and 80 columns for each record, will lead to a lengthy XML file which will run to crores of lines. Can we make the XML to split automatically at a given interval say after every 1000 records.

Do help me out with these question. Eagerly waiting for a reply.

Former Member
0 Kudos

Hi,

Here looping means not only looping for each customer , here we will generate one file for each customer.

load all your customer ids into one table with an additional primary key seqence number.

then pick each cutomer in the while loop by incrementing the counter and each time you process the DF, assgin the XML filename by the customername/ID to a variable and assign that variable to XML file name in options of XML target. it will generate 500 XMLs for you.

try all kinds of performance tuning methods like DOP,distribution at DF level ,more job servers... it migth help.. otherwise as dirk said.. it will take time for 500 & 200000 records.

If you can figure out a way,let us know.


former_member187605
Active Contributor
0 Kudos

1/. The process of selecting a single record is independent of the number of records, be it 500 or 200K:

  • initialise a global variable in a script before the loopt: $G_Cust = sql ('<Datastore>,'select min(custID) from Table');
  • loop as long as you find a next customer
    • use custID = $G_Cust in the where clause of the first query transform in your data flow
    • increment the global variable in another script: $G_Cust = sql ('<Datastore>,'select min(custID) from Table where custID > {$G_Cust}');

2/. Use another global variable for the XML file name that you modify in function of $G_Cust.

3/. You can process records in groups of any size ($G_groupCnt) and create a file per group by adding a sequence number to your input stream:

  • the where clause would become seqno >= $G_seqno and seqno < $G_seqno + $G_groupCnt
  • the increment: $G_seqno= sql ('<Datastore>,'select min(seqno) from Table where seqno >= [$G_seqno]+[$G_groupCnt]');
Former Member
0 Kudos

Hi Dirk,

Im able to split the file per record in flat file format using global variable $G_FILENAME.

Can you guide me how to use this in case of XML as whenever i'm defining the $G_FILENAME in the target xml its just giving one file with the last record.

former_member187605
Active Contributor
0 Kudos

Please double-check your code (or give more details so that we can do so), because there should be no difference. If your logic works for flat files, there's no reason it wouldn't for XML.

Former Member
0 Kudos

Hi Dirk,

Please find the below screenshot of job

workflow

Initialization Script

While LOOP

Increment Script

data flow

Target XML

Please help me with this.

former_member187605
Active Contributor
0 Kudos

Your logic seems ok to me at first sight. This job will generate an XML file for every record in the source table (if the query transform does not contain any filters).

What's the data type of your Employee ID column? If it's numeric you should use square brackets instead of curly ones in the where clause of the first line in the increment script.

Former Member
0 Kudos

Hi Dirk,

Thanx for responding. My Employee ID is of Varchar type.

Secondly, in the target XML template, while defining the file name, I've not given any path as shown in the above XML Target screenshot. Can you help me in finding where does these file get saved if we don't define any path.

Thirdly, when I define a path and then use the global variable ($G_FILENAME) as the filename,  shown in the below screenshot.

After, execution I can only see one file in the define directory/path i.e. $G_FILENAME with one record that is the last record that is processed. Can you please help me to understand why this is happening.

In the above screenshot you can even see that when I apply the same logic and file path with flat file its giving the desired result but not with XML. Can you please help me to identify where am i going wrong.

XML file

former_member187605
Active Contributor
0 Kudos

That's because you've specified the name of the File as '\Users\Administrator\PERSISTENCE_CACHE\Desktop\$G_FILENAME' and checked Delete and recreate file.The global variable is not evaluated (as it's part of a string) and files are all generated with the same name, overwritten! SO only the last one remains.

Make sure you assign the full path name to $G_FILENAME and use the variable only in the XML target File name. That  will do the job for you!

Former Member
0 Kudos

Hi Dirk,

Thanks for the reply. Currently I'm defining the $G_FILENAME in the below way:

$G_FILENAME = 'EMPLOYEE_ID_' || $G_EMPLID || '.xml';

Can you please guide me how to assign the full path name to the varialble.

former_member187605
Active Contributor
0 Kudos

$G_FILENAME = '\\Users\\Administrator\\PERSISTENCE_CACHE\\Desktop\\EMPLOYEE_ID_' || $G_EMPLID || '.xml';

Former Member
0 Kudos

HI Dirk,

Thanks a lot all worked well and got the desired output as below:

Thanks for being so supportive and prompt in the response.

Former Member
0 Kudos

Hi Dirk,

Could you please post a blog, how XML_Pipeline and XML_Map  works in detail.

Thanks,

Vasanth

Answers (0)