
Need help on Match Transform

former_member498903
Participant

Hello Experts,

I need your help setting up the match score and criteria to find duplicates with a similarity ranging from 75% to 100%.

I have LFA1 table data, and I need to identify duplicate vendor names in the Name1 field.

I use the EnglishIndia_DataCleanse transform before the Match wizard to standardize the Name1 field data.

For the match strategy I am using Simple match, with match criteria on:

1. Givenname1 and Givenname2

2. Givenname1

3. Familyname1

The match criteria automatically take the following mapping, which seems OK to me.

1. Person_Givenname1 - Person_Givenname1_Standardize

     Person_Givenname1_Standardize_match_std1 - Person_Givenname1_Standardize_match_std1_standardized

     Person_Givenname1_Standardize_match_std2 - Person_Givenname1_Standardize_match_std2_standardized

     Person_Givenname1_Standardize_match_std3 - Person_Givenname1_Standardize_match_std3_standardized

2. Person_Givenname2 - Person_Givenname2_Standardize

     Person_Givenname2_Standardize_match_std1 - Person_Givenname2_Standardize_match_std1_standardized

3. Person_Givenname3 - Person_Givenname3_Standardize

     Person_Givenname3_Standardize_match_std1 - Person_Givenname3_Standardize_match_std1_standardized

The next step was to create the break group. I created a break group on these 3 fields:

     Person_Givenname1_Standardize

     Person_Givenname2_Standardize

     Person_Givenname3_Standardize

In Edit Options, I kept the Match Set Name as INDIVIDUAL.

Now my match criteria parameters show as:

Person1_Given_Name1 - 50 (contribution to weighted score), Match score 101 - No match score 79

Person1_Given_Name2 - 30 (contribution to weighted score), Match score 101 - No match score 79

Person1_Family_Name1 - 20 (contribution to weighted score), Match score 80 - No match score 79

When I execute the above job, it returns only records with a 100% match score, whereas I need the record set ranging from 75% to 100% for Name1 duplicates only.

Kindly suggest which step I am missing, since I am only getting 100% duplicate records even though names that are 75% similar exist in the data.

Any help would be much appreciated.

Regards,

Neil

Accepted Solutions (1)

venkataramana_paidi
Contributor
0 Kudos

Hi Neil,

Looking at your steps, I suggest changing the settings below.

You are using the weighted match method, not the rule-based method, which is why you need to change the following options.

1. Set the Match score to 100 and the No match score to -1 for all three criteria.

2. Set the weighted match score to 75 in the level options. Please check the screenshot.

I hope this works.
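To make the effect of these two settings concrete, here is a minimal Python sketch of weighted scoring. It is a simplified model under stated assumptions, not the Match transform's actual implementation: the `Criterion` class, the `decide` function, and the similarity values are all hypothetical, and the evaluation order (each criterion's No match and Match thresholds first, weighted total last) is assumed. The weights 50/30/20 and the thresholds come from the question above.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    name: str
    weight: int          # contribution to weighted score (weights sum to 100)
    match_score: int     # similarity at or above this declares a match outright
    no_match_score: int  # similarity at or below this declares a no-match outright


def weighted_score(similarities, criteria):
    """Sum of similarity * contribution, scaled back to 0-100."""
    return sum(similarities[c.name] * c.weight for c in criteria) / 100


def decide(similarities, criteria, weighted_match_score):
    """Assumed evaluation order: per-criterion thresholds, then the weighted total."""
    for c in criteria:
        s = similarities[c.name]
        if s <= c.no_match_score:
            return "no match"
        if s >= c.match_score:
            return "match"
    total = weighted_score(similarities, criteria)
    return "match" if total >= weighted_match_score else "no match"


# Settings as posted in the question.
original = [
    Criterion("given1", 50, 101, 79),
    Criterion("given2", 30, 101, 79),
    Criterion("family1", 20, 80, 79),
]
# Settings after the suggested change.
fixed = [
    Criterion("given1", 50, 100, -1),
    Criterion("given2", 30, 100, -1),
    Criterion("family1", 20, 100, -1),
]

# Hypothetical similarity scores for one pair of records.
sims = {"given1": 90, "given2": 60, "family1": 95}

print(weighted_score(sims, fixed))   # 82.0
print(decide(sims, original, 75))    # no match: given2 (60) is below 79
print(decide(sims, fixed, 75))       # match: 82.0 >= 75
```

Under this model, a No match score of 79 rejects the pair as soon as any single criterion falls to 79 or below, so only near-identical records survive. With -1, no single criterion can reject the pair, and the decision falls through to the weighted total.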

Thanks & Regards,

Ramana.

former_member498903
Participant
0 Kudos

Hi Venkata,

Thanks for your suggestion.

I made the changes you suggested, but I also had to make other changes to get the desired output.

But your post was surely a breakthrough for my problem, and I got the desired output.

Many thanks!

Regards,

Neil

Answers (1)

former_member106536
Active Participant
0 Kudos

Incorporating match fields into your break key negates the ability to utilize the match standard fields.

If there's nothing else to narrow down your possible candidates, you're probably going to want to use a substr of the family name as the break key.

If you break on first/mid/last, even if they're all substrings, you're going to end up with about the same thing as simply doing an order by plus gen_row_num_by_group(break key). If your break group is too granular, you un-bucket a lot of potential matches. ;(
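As a rough illustration of the granularity point, the Python sketch below groups a few made-up vendor names by two different break keys. The records, the `break_groups` helper, and the 3-character prefix length are all assumptions for illustration; this only models the bucketing idea, not how the Match transform builds break groups.

```python
from collections import defaultdict


def break_groups(records, key_fn):
    """Bucket records by break key; only records sharing a bucket get compared."""
    groups = defaultdict(list)
    for rec in records:
        groups[key_fn(rec)].append(rec)
    return dict(groups)


# Hypothetical (given name, family name) pairs.
records = [
    ("Ramesh", "Kumar"),
    ("Ramesh", "Kumaran"),
    ("Rameshh", "Kumar"),
    ("Suresh", "Patel"),
]

# Coarse key: first 3 characters of the family name.
coarse = break_groups(records, lambda r: r[1][:3].upper())

# Overly granular key: full given name plus full family name.
fine = break_groups(records, lambda r: f"{r[0]}|{r[1]}".upper())

print({k: len(v) for k, v in coarse.items()})  # {'KUM': 3, 'PAT': 1}
print(len(fine))                               # 4
```

With the coarse key, the three Kumar-like records land in one bucket and can be scored against each other. With the granular key, every record sits alone in its own bucket, so the near-duplicates are never compared.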