
How to use a Match Transform for a 1-to-N comparison?

Former Member
0 Kudos

Hi everyone,

When you use the Match Transform, you can feed it one input or two, but the way it works is the same either way (at least, as far as I have been able to tell).

For example, say I want to compare a few similar records like these (Table 1):

ID  CONTENT
1   JUAN
2   JUANA
3   JAN

against the records of a master table (Table 2), which can contain similar records too, like these:

ID  CONTENT
50  JUAN
51  JUANA


The Match transform (obviously, depending on the criteria you use) puts all of them into one single group. This is a problem for me because, for example, I can't get proper similarity percentages.

What I would like to get is a 1-to-N comparison (from the first table to the second one). Something like:

TABLE1.ID  TABLE1.CONTENT  TABLE2.ID  TABLE2.CONTENT  SIMILARITY PERCENTAGE
1          JUAN            50         JUAN            100%
1          JUAN            51         JUANA           80%
2          JUANA           50         JUAN            80%
2          JUANA           51         JUANA           100%
3          JAN             50         JUAN            80%
3          JAN             51         JUANA           60%
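
To make the target output concrete, here is a minimal Python sketch, written outside Data Services, of a 1-to-N comparison: every Table 1 record is paired with every Table 2 record and scored individually, with no grouping. The `similarity` function relies on Python's `difflib` ratio, which is an assumption chosen for illustration and is not the Match transform's scoring, so its percentages will differ from the illustrative ones in the table above.

```python
from difflib import SequenceMatcher
from itertools import product

# Sample data taken from Table 1 and Table 2 in the question.
input_records = [(1, "JUAN"), (2, "JUANA"), (3, "JAN")]
master_records = [(50, "JUAN"), (51, "JUANA")]


def similarity(a: str, b: str) -> float:
    """Return a 0-100 score from difflib's ratio (illustrative metric only)."""
    return round(100 * SequenceMatcher(None, a, b).ratio(), 1)


def compare_one_to_n(inputs, masters):
    """Pair every input record with every master record and score each pair."""
    return [
        (input_id, input_value, master_id, master_value,
         similarity(input_value, master_value))
        for (input_id, input_value), (master_id, master_value)
        in product(inputs, masters)
    ]


pairs = compare_one_to_n(input_records, master_records)
for row in pairs:
    print(row)
```

Each input record yields exactly one row per master record, which is the shape asked for; the open question in this thread is whether the Match transform can be configured to produce it.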

Is that possible? How would I have to configure the Match Transform to do that?

Thank you in advance and best regards,

José

Accepted Solutions (0)

Answers (3)


Former Member
0 Kudos

I thought about the workaround you are suggesting but, apart from not seeming very efficient, calling the data flow in a loop also causes a "violation of Primary Key" error in a system table that SAP Data Services itself creates and manages during execution, as you can see in the post below:

Anyway, do you know if there is a way to configure the Match Transform (or some other transform) so that it doesn't build match groups, but instead produces a 1-to-N association between each input record and the master table records? That would give me pure similarity percentages and let me pick the best option from the master table for each input record.
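
As a rough illustration of the "best option" step described above, the following Python sketch takes already-scored pairs and keeps the highest-scoring master record for each input record. The rows reuse the illustrative percentages from the original question; the function name `best_master_per_input` is hypothetical, and the sketch says nothing about how Data Services would produce the scored pairs in the first place.

```python
# (input_id, input_content, master_id, master_content, similarity_percent)
# Values copied from the illustrative table in the original question.
scored_pairs = [
    (1, "JUAN", 50, "JUAN", 100),
    (1, "JUAN", 51, "JUANA", 80),
    (2, "JUANA", 50, "JUAN", 80),
    (2, "JUANA", 51, "JUANA", 100),
    (3, "JAN", 50, "JUAN", 80),
    (3, "JAN", 51, "JUANA", 60),
]


def best_master_per_input(pairs):
    """Map each input ID to (master ID, score) of its highest-scoring pair.

    On a tie, the pair that appears first in the list wins.
    """
    best = {}
    for input_id, _, master_id, _, score in pairs:
        if input_id not in best or score > best[input_id][1]:
            best[input_id] = (master_id, score)
    return best


best = best_master_per_input(scored_pairs)
for input_id, (master_id, score) in sorted(best.items()):
    print(input_id, master_id, score)
```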

Thanks again and best regards,

José

former_member187605
Active Contributor
0 Kudos

As already mentioned, I am not aware of any other solution matching your needs.

And the post you're referring to claims that leaving out Post Match Group Statistics avoids the PK violation.

former_member106536
Active Participant
0 Kudos

Several things going on here, and when it comes to matching I think it's always best to present the entire picture. Saying that you're using break keys doesn't tell me much. The break key drives the candidate selection. If you have leftover records from one DSID, that means those records didn't match anything. If you want to eliminate some of those records, that requires altering your break key or manually joining your break keys to that DSID. At which point you could easily select just one record from your break key (which is probably a horrible idea).

You should be using a compare table to keep candidate records from bumping up against each other. I don't know if you care about INTRA compare table records, but you could eliminate those compares as well.

Most of what you're asking for can be dealt with post match. If you see records matching that you don't want to match, tighten up your match. We also often use gender codes to unmatch records.
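
The general idea behind break keys and a compare table can be sketched in Python as follows. This is a conceptual illustration under stated assumptions, not how Data Services implements either feature: the break key (first letter of the content) and the source labels `INPUT` and `MASTER` are hypothetical, and the `compare_intra_source` flag merely stands in for whether records from the same source are compared with each other.

```python
from collections import defaultdict
from itertools import combinations

# (source, record_id, content); sample values from the original question.
records = [
    ("INPUT", 1, "JUAN"), ("INPUT", 2, "JUANA"), ("INPUT", 3, "JAN"),
    ("MASTER", 50, "JUAN"), ("MASTER", 51, "JUANA"),
]


def break_key(content: str) -> str:
    """Hypothetical break key: the first letter of the content."""
    return content[:1]


def candidate_pairs(rows, compare_intra_source=False):
    """Return the record pairs that would be compared.

    Only records sharing a break key are paired. Unless
    compare_intra_source is True, pairs from the same source are skipped.
    """
    by_key = defaultdict(list)
    for row in rows:
        by_key[break_key(row[2])].append(row)

    pairs = []
    for group in by_key.values():
        for left, right in combinations(group, 2):
            if compare_intra_source or left[0] != right[0]:
                pairs.append((left, right))
    return pairs


cross_only = candidate_pairs(records)
with_intra = candidate_pairs(records, compare_intra_source=True)
print(len(cross_only), len(with_intra))
```

With the sample data all five records share one break key, so dropping the intra-source comparisons leaves only input-versus-master pairs.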
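
One way to picture a post-match step is the Python sketch below, which takes grouped match output and turns each group into input-versus-master pairs, silently dropping groups that contain no input record. The row layout, the `INPUT` and `MASTER` labels, and the second group (IDs 60 and 61) are hypothetical values made up for illustration; the actual output fields depend on how the Match transform is configured.

```python
from collections import defaultdict

# Hypothetical match output: (group_number, source, record_id, content).
# Group 1 uses the records from the question; group 2 is an invented
# master-only group.
match_output = [
    (1, "INPUT", 1, "JUAN"),
    (1, "INPUT", 2, "JUANA"),
    (1, "INPUT", 3, "JAN"),
    (1, "MASTER", 50, "JUAN"),
    (1, "MASTER", 51, "JUANA"),
    (2, "MASTER", 60, "PEDRO"),
    (2, "MASTER", 61, "PEDRA"),
]


def explode_groups(rows):
    """Split each match group into (input record, master record) pairs."""
    groups = defaultdict(lambda: {"INPUT": [], "MASTER": []})
    for group_number, source, record_id, content in rows:
        groups[group_number][source].append((record_id, content))

    pairs = []
    for group_number in sorted(groups):
        members = groups[group_number]
        # A master-only group has no INPUT members, so it yields no pairs.
        for input_record in members["INPUT"]:
            for master_record in members["MASTER"]:
                pairs.append((input_record, master_record))
    return pairs


exploded = explode_groups(match_output)
for input_record, master_record in exploded:
    print(input_record, master_record)
```

This only reshapes the groups; per-pair similarity scores would still have to come from the match output or from a separate scoring step.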

Former Member
0 Kudos

Hi again,

Thanks for the answer, but I've used the Candidate Selection in the input data versus master table mode (I mean, the mode with two separate sources) and I don't really know how to set it to say "just take one record from the input data in the comparison against the master table".

When more than one record from the input data is similar, they are all placed in the same group, so I get groups with more than one record from the input data (and, obviously, more than one from the master table, which is correct).

So what I get with the Candidate Selection is just a prioritization (by sorting) that picks the "reference" record from the input data for the comparison, but I can't stop the Match transform from pulling more input data records into the same group.

By the way, I'm using break group keys so that records that differ in certain fields are not compared with each other (I don't know if that could be the problem).

In conclusion, I would like to know how to say in the Candidate Selection or any other configuration parameter that I want just one record from the input data (and one or more from the master table) in every matching group (and also, I would like to not get groups of just master records with no input data records in them because that is useless).

Thanks again and Merry Christmas,

José

former_member187605
Active Contributor
0 Kudos

As far as I know that's not possible.

A (time consuming) workaround would be to run your data flow in a loop, processing a single record from the source table at each iteration.

former_member187605
Active Contributor
0 Kudos

Yes, this can be done. It's not easy to explain in a few words, but the approach is completely documented. Check out 17.4.8.2.2 Candidate selection in the SAP Data Services Designer Guide.