on 05-24-2016 1:06 PM
Hi Experts,
Now this is something tricky, and a few possible approaches are not working out.
SCENARIO :
I have a table SFE_EXP_BODS_F. Suppose there are 5 columns in table :
1. DIVISION_NAME (varchar)
2. SPECIALITY_NAME (varchar)
3. CALDATE (date)
4. TIMES_EXPOSED (int)
5. PY_TIMES_EXPOSED (int) --> to be filled
CALDATE contains the value from 1 APR 2014 to 31 Mar 2016. (in blocks of month like 2014-04-01, 2014-05-01)
First four columns are coming from previous transformation but PY_TIMES_EXPOSED is getting calculated here in this query transform (via New output column).
LOGIC to be implemented for PY_TIMES_EXPOSED :
This column should get the same value as TIMES_EXPOSED column exactly 1 year back.
This means if a record is coming in this query transform with CALDATE = 2015-04-01, PY_TIMES_EXPOSED should get the value from TIMES_EXPOSED column of this table when CALDATE was 2014-04-01 and rest all fields value = present values of record.
If a record comes with CALDATE < 2015-04-01, then PY_TIMES_EXPOSED should be NULL because the data is from 01 APR 2014 onwards.
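For illustration, the required logic can be sketched as a self left join. This is a minimal Python/sqlite3 sketch, not the actual DS implementation: the data and the reduced column set are invented, and sqlite's date() modifier stands in for Oracle's add_months.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE SFE_EXP_BODS_F (
    DIVISION_NAME TEXT, SPECIALITY_NAME TEXT,
    CALDATE TEXT, TIMES_EXPOSED INTEGER)""")
con.executemany("INSERT INTO SFE_EXP_BODS_F VALUES (?,?,?,?)", [
    ("D1", "S1", "2014-04-01", 10),   # first year of data: no prior year exists
    ("D1", "S1", "2015-04-01", 12),   # has a matching row exactly 1 year back
])
# Self left join: the prior-year row (if any) supplies PY_TIMES_EXPOSED;
# rows with CALDATE < 2015-04-01 find no match and stay NULL.
rows = con.execute("""
    SELECT cur.CALDATE, cur.TIMES_EXPOSED, prev.TIMES_EXPOSED AS PY_TIMES_EXPOSED
    FROM SFE_EXP_BODS_F cur
    LEFT JOIN SFE_EXP_BODS_F prev
      ON  prev.DIVISION_NAME   = cur.DIVISION_NAME
      AND prev.SPECIALITY_NAME = cur.SPECIALITY_NAME
      AND prev.CALDATE = date(cur.CALDATE, '-12 months')
    ORDER BY cur.CALDATE""").fetchall()
# rows -> [('2014-04-01', 10, None), ('2015-04-01', 12, 10)]
```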
IMPLEMENTATION :
I tried devising a logic for this scenario and here's the mapping I used for PY_TIMES_EXPOSED in query transform :
ifthenelse(CALDATE > '2015-03-31',
sql('DRL_ORACLE_Prod', 'SELECT TIMES_EXPOSED from SFE_EXP_F_BODS where CALDATE = \'[add_months(Q_LOOKUP2.CALDATE, -12)]\'
AND DIVISION_NAME = \'[Q_LOOKUP2.DIVISION_NAME]\'
AND SFE_BRAND = \'[Q_LOOKUP2.SFE_BRAND]\'
AND SPECIALITY_NAME = \'[Q_LOOKUP2.SPECIALITY_NAME]\'
AND HQ_CODE = \'[Q_LOOKUP2.HQ_CODE]\'
AND CLASSIFICATION = \'[Q_LOOKUP2.CLASSIFICATION]\''), NULL)
The few other fields seen in the above sql operation are just other dimension fields which need to be matched to fetch TIMES_EXPOSED.
This mapping gives no error in validation and, as far as I can tell, should have worked, but it fails when execution reaches this query transform. 0 records are sent to the target (which sits after this query transform) and the job errors out. The error screen is attached. There's no ODBC connectivity issue.
Is there any workaround to achieve this functionality? Please help me; it's very urgent.
Try the lookup_ext function instead of the sql function.
The lookup_ext function worked! However, I had to use 'PRE_LOAD_CACHE'; the job does not progress with 'NO_CACHE'. It takes around 700 secs. to complete 6 million lookups against an 8-million-record lookup table.
I am still searching for a solution that can achieve this a little faster.
Thanks a ton Shailesh. Really appreciate!
I tried using the condition suggested by you and it didn't work out. A record which existed only in Jan 2015 didn't get a corresponding record in Jan 2016.
Here are my Join conditions :
Q1 (SFE_EXP_STAGE left join SFE_EXP_STAGE_1):
SFE_EXP_STAGE.CALDATE = add_months(SFE_EXP_STAGE_1.CALDATE, 12) and
SFE_EXP_STAGE.DIVISION_NAME = SFE_EXP_STAGE_1.DIVISION_NAME and
SFE_EXP_STAGE.SFE_BRAND = SFE_EXP_STAGE_1.SFE_BRAND and
SFE_EXP_STAGE.SPECIALITY_NAME = SFE_EXP_STAGE_1.SPECIALITY_NAME and
SFE_EXP_STAGE.HQ_CODE = SFE_EXP_STAGE_1.HQ_CODE and
SFE_EXP_STAGE.CLASSIFICATION = SFE_EXP_STAGE_1.CLASSIFICATION
Q2 (SFE_EXP_STAGE_1 left join SFE_EXP_STAGE):
SFE_EXP_STAGE.CALDATE = add_months(SFE_EXP_STAGE_1.CALDATE, 12) and
SFE_EXP_STAGE_1.DIVISION_NAME = SFE_EXP_STAGE.DIVISION_NAME and
SFE_EXP_STAGE_1.SFE_BRAND = SFE_EXP_STAGE.SFE_BRAND and
SFE_EXP_STAGE_1.SPECIALITY_NAME = SFE_EXP_STAGE.SPECIALITY_NAME and
SFE_EXP_STAGE_1.HQ_CODE = SFE_EXP_STAGE.HQ_CODE and
SFE_EXP_STAGE_1.CLASSIFICATION = SFE_EXP_STAGE.CLASSIFICATION
Request you to review and suggest where we are going wrong.
Thanks
Hi Dirk Venken ,
I truncated a few tables in the DB and the job ran. Here's what I did in the job:
I took a source table X and its replica X_1 for joining.
Before joining, I put a filter on the X_1 dataset. In query Q1, which receives the X_1 input, I put this in the where clause :
X_1.CALDATE = add_months(sysdate (), -12)
Then the Q1 output was joined with table X. X left join Q1 with the join condition :
X.CALDATE = add_months(Q1.CALDATE, 12) and
X.HQ_CODE = Q1.HQ_CODE and
X.DIVISION_NAME = Q1.DIVISION_NAME and
X.SFE_BRAND = Q1.SFE_BRAND and
X.SPECIALITY_NAME = Q1.SPECIALITY_NAME and
X.CLASSIFICATION = Q1.CLASSIFICATION
Target was filled this time, with PY values for year 2015 and 2016 only - which looks good. But to check whether all PY column values got properly filled, I am doing a check :
For example SUM(PY_TIMES_EXPOSED) for APR 2016 should be equal to SUM(TIMES_EXPOSED) for APR 2015. Right?
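That reconciliation check can be sketched as follows (a minimal Python/sqlite3 illustration with invented numbers; the row that exists only via its prior-year counterpart is exactly what a plain left join drops, which would make the sums differ):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE T (CALDATE TEXT, TIMES_EXPOSED INT, PY_TIMES_EXPOSED INT)")
con.executemany("INSERT INTO T VALUES (?,?,?)", [
    ("2015-04-01", 5, None),
    ("2015-04-01", 7, None),
    ("2016-04-01", 9, 5),      # carried forward from the first 2015 row
    ("2016-04-01", 0, 7),      # exists only via its 2015 counterpart;
                               # a plain left join would drop this row
])
py_sum, = con.execute(
    "SELECT SUM(PY_TIMES_EXPOSED) FROM T WHERE CALDATE = '2016-04-01'").fetchone()
cur_sum, = con.execute(
    "SELECT SUM(TIMES_EXPOSED) FROM T WHERE CALDATE = '2015-04-01'").fetchone()
# With the prior-year-only row present, both sums are 12 and the check passes;
# without it, py_sum would be 5 and the check would fail.
```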
Here I am NOT getting the same values. Trust me, this is a big setback.
Please suggest further.
Thank you
Yes you're right! See the scenario is I need to have all these 3 types in my target table. A record can be either of these types :
1. A record with TIMES_EXPOSED value and also PY_TIMES_EXPOSED value --> For example a record in May 2015 which existed in May 2014 also.
2. A record with TIMES_EXPOSED value and null PY_TIMES_EXPOSED value --> For example a record which came first time in any of months.
3. A record with null TIMES_EXPOSED value and non-null PY_TIMES_EXPOSED VALUE --> For example, a record which existed only in previous year but didn't come in succeeding year.
All records from APR 2014 to Mar 2015 will have some TIMES_EXPOSED value and null PY_TIMES_EXPOSED value. I MUST insert a record in target table corresponding to all such records in APR 2015 to MAR 2016, with TIMES_EXPOSED, PLAN_EXPOSURE as 0 but PY_TIMES_EXPOSED as TIMES_EXPOSED of 2014-15.
As per our present left-join logic, we are unable to capture Case 3, and those records with valid PY values are not appearing in the target table. At the front end, the dashboard compares present values vs. PY values on a month-by-month basis.
To put it in your words :
If a combination of HQ_CODE, DIVISION_NAME, SFE_BRAND, SPECIALITY_NAME and CLASSIFICATION exists in the previous year but not in the succeeding year, we need to insert a record in the succeeding year with the same values in these 5 columns but the previous year's TIMES_EXPOSED in the PY_TIMES_EXPOSED column. Measure columns like TIMES_EXPOSED and PLAN_EXPOSURE can be null or 0.
Please help. It has become pretty challenging to correctly build this table. Thanks a ton for your support on this case. Thanks man!
You need a full outer join then. See full outer join in data services | SCN for how to build it.
Thanks for the helpful information once again!! Here's how my job looks after implementing Full Outer Join :
I have put the same join conditions in both Q1 and Q2. Q1 has SFE_EXP_STAGE left join SFE_EXP_STAGE_1 and Q2 has SFE_EXP_STAGE_1 left join SFE_EXP_STAGE.
I want some clarifications :
1. I am using the same join conditions in both the Q1 & Q2. I mean in Q1 : SFE_EXP_STAGE.CALDATE = add_months(SFE_EXP_STAGE_1.CALDATE, 12). In Q2 : SFE_EXP_STAGE_1.CALDATE = add_months(SFE_EXP_STAGE.CALDATE, 12).
Do we need to change the CALDATE joining condition in Q2?
2. In Q1 output, I am selecting every column from SFE_EXP_STAGE except PY columns. PY_TIMES_EXPOSED = SFE_EXP_STAGE_1.TIMES_EXPOSED
3. In Q2 output, I am selecting every column from SFE_EXP_STAGE_1 except PY columns. Here PY_TIMES_EXPOSED = SFE_EXP_STAGE.TIMES_EXPOSED. I am mapping TIMES_EXPOSED, PLAN_EXPOSURE(measure columns) to 0 here.
I didn't do anything else. I am selecting Distinct in the 'DISTINCT' query transform. I still didn't get the Case 3 record. For example, I didn't get a record in Mar 2016 with PY values for a record which existed in Mar 2015 but doesn't exist in Mar 2016. I have to get this 2016 record with all measure values 0 except the PY columns.
Please help.
Because that's exactly the condition you would use in an inner, a left outer and a right outer join (you want at least the common records in all 3 cases, don't you?). And that's how you simulate a full outer join:
select l.*, r.* from l
full outer join r
on l.id = r.id
and add_months(r.dat,12) = l.dat
is totally equivalent to
select l.*, r.* from l
left outer join r
on l.id = r.id
and add_months(r.dat,12) = l.dat
union
select l.*, r.* from r
left outer join l
on l.id = r.id
and add_months(r.dat,12) = l.dat
In DS design, that means 2 left outer joins with exactly the same condition, and a merge followed by a distinct (=equivalent to the union).
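The equivalence above can be demonstrated with a small Python/sqlite3 sketch (ids and values are invented for illustration; sqlite's date() modifier stands in for add_months):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE l (id INT, dat TEXT, v INT)")
con.execute("CREATE TABLE r (id INT, dat TEXT, v INT)")
con.executemany("INSERT INTO l VALUES (?,?,?)",
                [(1, "2016-03-01", 10), (2, "2016-03-01", 20)])
con.executemany("INSERT INTO r VALUES (?,?,?)",
                [(1, "2015-03-01", 8), (3, "2015-03-01", 30)])
# Union of the two mirrored left outer joins simulates the full outer join:
rows = con.execute("""
    SELECT l.id, l.v, r.v FROM l
    LEFT OUTER JOIN r ON l.id = r.id AND date(r.dat, '+12 months') = l.dat
    UNION
    SELECT r.id, l.v, r.v FROM r
    LEFT OUTER JOIN l ON l.id = r.id AND date(r.dat, '+12 months') = l.dat
    ORDER BY 1""").fetchall()
# id 1: in both years; id 2: current year only; id 3: prior year only
# rows -> [(1, 10, 8), (2, 20, None), (3, None, 30)]
```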
Some general "Best Practice" advice:
1/. Never use a sql function call in a mapping. It generates a round-trip to the database for every record that passes through the data flow.
2/. Use lookup instead of lookup_ext whenever possible. It's slightly more efficient. And, most importantly, use the correct cache option (PRE_LOAD_CACHE or DEMAND_LOAD_CACHE) to avoid database round-trips, too!
3/. Test your logic on a limited set of records, not on the full input data set, until you've got it right. If it works for 10 records, it will probably work for all.
4/. Try to read and understand my answers to a post first. Chances are high I am giving the right solution.
Really informative reply!
Dirk, in my case I am using lookup_ext as I have multiple conditional expressions. This lookup will be performed for around 6 million records, and my lookup table has approx. 8 million records. I am using PRE_LOAD_CACHE. Is that fine, or should I go for DEMAND_LOAD_CACHE?
1/. You can use lookup, even with multiple conditional expressions. The function wizard generates one pair only, but you can manually add as many as you like.
Cf. SAP Data Services Reference Guide, section 6.3.91 lookup:
"
Syntax: lookup (<lookup_table>, <result_column>, <default_value>, <cache_spec>,<compare_column>, <expression>)
"
and
"
Note: You can specify more than one <compare_column> and <expression> pair—simply add additional pairs at the end of the function statement. The values must match for all specified pairs in order for the lookup function to find a matching row.
"
2/. If your lookup is 6M records, use DEMAND_LOAD_CACHE.
3/. If you want to speed up your data flow processing, your only option is a left outer join pushed down to the database. If your data flow is complex, use a Data_Transfer transform to store the result of the previous transformations in a temporary table in the same database as your lookup table. Don't use a lookup function anymore, but left-outer-join your temp table with the (lookup) table, using exactly the same conditions as you used in the lookup. If your target table is in the same database, too, the full logic will be pushed down to the database (and typically runs much faster).
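To illustrate the push-down idea, here is a minimal Python/sqlite3 sketch (table and column names follow the thread, but the data and the correlated-UPDATE form are invented for illustration, and sqlite's date() stands in for add_months). One set-based statement executed inside the database replaces millions of per-row lookup calls from the ETL engine:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE SFE_EXP_F_BODS (
    DIVISION_NAME TEXT, CALDATE TEXT,
    TIMES_EXPOSED INT, PY_TIMES_EXPOSED INT)""")
con.executemany("INSERT INTO SFE_EXP_F_BODS VALUES (?,?,?,NULL)",
                [("D1", "2014-04-01", 10), ("D1", "2015-04-01", 12)])
# Single correlated UPDATE, executed entirely inside the database:
con.execute("""
    UPDATE SFE_EXP_F_BODS
    SET PY_TIMES_EXPOSED = (
        SELECT prev.TIMES_EXPOSED
        FROM SFE_EXP_F_BODS prev
        WHERE prev.DIVISION_NAME = SFE_EXP_F_BODS.DIVISION_NAME
          AND prev.CALDATE = date(SFE_EXP_F_BODS.CALDATE, '-12 months'))""")
rows = con.execute("SELECT CALDATE, PY_TIMES_EXPOSED FROM SFE_EXP_F_BODS "
                   "ORDER BY CALDATE").fetchall()
# rows -> [('2014-04-01', None), ('2015-04-01', 10)]
```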
Thanks Dirk for the suggestion! I wasn't able to figure out the syntax easily from your reply, but finally got it by trial and error. Here's the exact way to write multiple conditional expressions :
ifthenelse(CALDATE > '2015-03-31',
lookup(DRL_ORACLE_Prod.DRL_MIS.SFE_EXP_F_BODS, TIMES_EXPOSED, NULL, 'PRE_LOAD_CACHE',
CALDATE, add_months(Q_LOOKUP2.CALDATE, -12),
HQ_CODE, Q_LOOKUP2.HQ_CODE,
DIVISION_NAME, Q_LOOKUP2.DIVISION_NAME,
SFE_BRAND, Q_LOOKUP2.SFE_BRAND,
SPECIALITY_NAME, Q_LOOKUP2.SPECIALITY_NAME,
CLASSIFICATION, Q_LOOKUP2.CLASSIFICATION),
NULL)
However, performance was not visibly different from using lookup_ext().
The DEMAND_LOAD_CACHE option makes the job get stuck: it processes only about 100 records every 5 secs. Only PRE_LOAD_CACHE drives the job to completion, with either lookup or lookup_ext. However, something very strange is happening: it is missing a few lookup values that should be captured.
Now the only solution to be tried in my case seems to use Left outer join. Let me try that!
I am unable to implement my logic using either method now. Even the left join is not working and is fetching incorrect values.
Let's say I have a table X with all the information in fields like CALDATE, TIMES_EXPOSED etc. Its PY_TIMES_EXPOSED column is all nulls till now.
I take another dataflow, in which I use table 'X' as a source two times. I join these two on these conditions :
X.CALDATE = add_months(X_1.CALDATE, -12) and
X.HQ_CODE = X_1.HQ_CODE and
X.DIVISION_NAME = X_1.DIVISION_NAME and
X.SFE_BRAND = X_1.SFE_BRAND and
X.SPECIALITY_NAME = X_1.SPECIALITY_NAME and
X.CLASSIFICATION = X_1.CLASSIFICATION
Now I fetch all columns from table X except PY_TIMES_EXPOSED, which I fetch from table X_1. This is what you suggested, right?
I am getting wrong values in the PY column everywhere.
As my input data starts ONLY from APR 2014 and runs till the present month, I MUST NOT get any value in the PY column for the dataset between APR 2014 and MAR 2015. Right? It should return nulls in the PY column for these dates. But this logic is inserting values into these records as well, which should NOT be the case.
PLEASE give me the correct logic to implement my requirement, as it is becoming a showstopper now. Please take a look and help.
Thanks in advance!
Hi Dirk,
I tried with 12 only. I tried two times, but the job terminated after loading barely 530,000 records with this error :
ODBC data source <MIS_ORACLE> error message for operation <SQLExecute>: <[SAP BusinessObjects][ODBC Oracle Wire Protocol driver][Oracle]ORA-01653: unable to extend table DRL_MIS.T5 by 1024 in tablespace USERS>.|<Regular Load Operations>|<T5>
(8744 1356 DBS-070401 5/28/2016 12:05:56 AM)
Is something bad in the logic again?
Hi Samarth,
It is very simple. Whatever source column you are mapping to TIMES_EXPOSED (the 4th column in the target), use the same source column for the 5th column (PY_TIMES_EXPOSED), with extra logic.
I think it will work.
Thanks & Regards
Abdul Majeed Mohammed
Include SFE_EXP_BODS_F twice as a source in your data flow. Use a left outer join. Join condition:
SFE_EXP_BODS_F.CALDATE = add_months(SFE_EXP_BODS_F_1.CALDATE, -12)