on 04-08-2015 11:54 AM
HI Experts,
I would require some guidance from you. I have a situation whereby, I have to create multiple XML target file based on records.
The job in my hand is to extract the customer master from ECC and load it into XML files. But the catch is I need to create one XML file per customer. ie if my source has 500 customer then i have to create 500 XML and each XML file should contain all the information for that customer.
Can anyone guide me how to split the target XML in a single job. i.e. One XML per record of source data.
You can build a data flow that processes one customer at a time. Include it in a loop to process all customers individually.
Not very performing, I am afraid, but it will work.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Dirk,
I too understand that this procedure will have performance issue as there is 2 lakh records and will take ages for the job to get completed. There are few more doubts to clarify in this:
1) If source file has 2 lakh records, how can we take only one record from it and create an XML?
2) If we include the data flow in a loop, will it create different XML files or overwrite on the same XML.
3) As you suggested this will lead to performance issue, is there a way around for it. As creating a single XML for 2 lakh records and 80 columns for each record, will lead to a lengthy XML file which will run to crores of lines. Can we make the XML to split automatically at a given interval say after every 1000 records.
Do help me out with these question. Eagerly waiting for a reply.
Hi,
Here looping means not only looping for each customer , here we will generate one file for each customer.
load all your customer ids into one table with an additional primary key seqence number.
then pick each cutomer in the while loop by incrementing the counter and each time you process the DF, assgin the XML filename by the customername/ID to a variable and assign that variable to XML file name in options of XML target. it will generate 500 XMLs for you.
try all kinds of performance tuning methods like DOP,distribution at DF level ,more job servers... it migth help.. otherwise as dirk said.. it will take time for 500 & 200000 records.
If you can figure out a way,let us know.
1/. The process of selecting a single record is independent of the number of records, be it 500 or 200K:
2/. Use another global variable for the XML file name that you modify in function of $G_Cust.
3/. You can process records in groups of any size ($G_groupCnt) and create a file per group by adding a sequence number to your input stream:
Your logic seems ok to me at first sight. This job will generate an XML file for every record in the source table (if the query transform does not contain any filters).
What's the data type of your Employee ID column? If it's numeric you should use square brackets instead of curly ones in the where clause of the first line in the increment script.
Hi Dirk,
Thanx for responding. My Employee ID is of Varchar type.
Secondly, in the target XML template, while defining the file name, I've not given any path as shown in the above XML Target screenshot. Can you help me in finding where does these file get saved if we don't define any path.
Thirdly, when I define a path and then use the global variable ($G_FILENAME) as the filename, shown in the below screenshot.
After, execution I can only see one file in the define directory/path i.e. $G_FILENAME with one record that is the last record that is processed. Can you please help me to understand why this is happening.
In the above screenshot you can even see that when I apply the same logic and file path with flat file its giving the desired result but not with XML. Can you please help me to identify where am i going wrong.
XML file
That's because you've specified the name of the File as '\Users\Administrator\PERSISTENCE_CACHE\Desktop\$G_FILENAME' and checked Delete and recreate file.The global variable is not evaluated (as it's part of a string) and files are all generated with the same name, overwritten! SO only the last one remains.
Make sure you assign the full path name to $G_FILENAME and use the variable only in the XML target File name. That will do the job for you!
User | Count |
---|---|
93 | |
10 | |
10 | |
9 | |
9 | |
7 | |
6 | |
5 | |
5 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.