cancel
Showing results for 
Search instead for 
Did you mean: 

Semantics vs Aggregation node in Calculation View

former_member226419
Contributor
0 Kudos

Hi,

I am trying my hands on in Calculation view.

I have a small query. when we create any calculation view we can define MIN,MAX,COUNT  in semantics then what is the use of aggregation node ? If I define semantics by giving measure to one of the column and then apply projection or aggregation both will give the same result (that aggregation measure column based on other attribute columns ). Then how can we differentiate between the same i.e. projection and aggregation? Everytime i am trying to use projection or aggregation it gives me same result. I am very much confused. I tries searching some posts but not satisfied with the responses.


Can someone please clear my concept here?

BR

Sumeet

Accepted Solutions (0)

Answers (2)

Answers (2)

0 Kudos

Hi,

The major difference between Aggregation and projection node is , by adding a column as 'Add as aggregated column' you can do the aggregation.

But in projection you can just add a column as output .

So if you add a aggregation node in calculation view and if you want to do aggregation on any measure , you need to add that measure column as 'Add as Aggregated Column'.

If you just add that column as 'Output' , In data preview , you will not see any aggregation done on result.

In projection , you only have option to add column as Output.

Regards

Vishal Jadhav

lbreddemann
Active Contributor
0 Kudos

Hmm... you're not satisfied with the responses... ts ts ts...

Ok, answer these:

what is the projection used for?

what is the aggregation used for?

Leave beside whatever you do with the semantics settings for now and find out what the distinct features of the two node types are.

Can you e.g. filter in an aggregation node?

Can you sum up measures in a projection node?

- Lars

former_member226419
Contributor
0 Kudos

Hi Lars,

Yes what my understanding says here as below. I am not saying anything about semantics.

behaviour of both

Projection: Used to filter out the fields from table/view.

Aggregation: Used to sum,count,count the records from table. ( which same semantics will also do.. )

But now below is the scenario.

In semantics I have defined one measure and 2 attributes ( Amount Sold as measure, week ID and year as attribute ) .

Now below are my observations.

1.) Projection_1: I have used to filter out the fields from sales table.

2.)  Aggregation or if I give projection: Both will yield the same result based on semantics definition. If I give all the fields as attributes both projection as well as aggregation give same result i.e. no addition of records for amount sold else if i declare amount sold as measure it will add up...But again behaviour of both projection and agg is same.. Why? how can we distinguish between both? I cannot see use of aggregation node as such.

BR

Sumeet

lbreddemann
Active Contributor
0 Kudos

Seriously, I have issues to get your point.

With Projection nodes you don't aggregate data and you can filter data.

Aggregation nodes don't filter any data but you can defined which columns should be aggregated.

The semantics definition does not change that.

That's very easy to reproduce.

Your observation on the other hand I cannot reproduce.

former_member226419
Contributor
0 Kudos

Hi Lars,

I got it. While aggregation I havnt seen the option of ' Output as aggregated column' and thats why both giving me same result ( Projection as well as aggregation).

Now I have one more query.

Whenever i apply any node proj or aggr i can see the option of apply filter for a particular field irrespective of aggregation or projection.

So my query is we can use aggregation only every time if we want to filter records them then what's the use of projection?

We can solve our  purpose with Aggregation only every time.

BR

Sumeet

lbreddemann
Active Contributor
0 Kudos

The final aggregation node does not come with the filter option.

So, no you cannot use it instead.

Beyond that, the final node in the model basically defines the general type of view.

If you want to have a flat data output with no automatic aggregation along the output columns, the projection node is used.

As soon as you work with something that resembles a cube, you'll need the aggregation.

Also, aggregation works by grouping records. That makes them unique in the output.

The projection doesn't do that.

That's very similar to the SQL behavior of queries with and without GROUP BY clause.

- Lars

former_member226419
Contributor
0 Kudos

Hi,

Thanks for your suggestions.It is really helpful to understand concepts for person from non BI background.

Below are my observations.

a.) Final node with semantics will come default with aggregation since as per performance t will be better if we aggregate data if we are using cube else it will act as a general node if we are not giving any measures and don't have filter option.

b.) Diff between aggregation vs projection is that projection will have only apply filter option and dont have '.output as aggregated column' and latter have the same and if we don't want to aggregate we can go for Projection view only.

I have one more query. I worked on one sales report in which we need to join multiple sales table and to filter the data. I used calculation view and then used projection node and applied filters and on top of that applied joins. But after multiple projections and joins it gave me error ' Out of memory'.

Then I did the same in attribute view by simply added the tables in data foundation and applied joins and filters  and data retrieved very quickly.

I don't know why same not happened with Calculation view why its throwing 'Out of memory' error?

BR

Sumeet

lbreddemann
Active Contributor
0 Kudos

This is likely due to the way the calculation view _sometimes_ deals with joins.

In some cases it is not possible to break up the column views involved in a join and create an execution plan that would join the underlying tables directly without having to materializing the rows.

If this cannot be done, then intermediate tables are filled with the data "so far" and the join is then performed against these temporary tables.

And these temp tables also reside in memory.

That's partly why you typically don't want to join the outcome of analytic and/or calc views.

The other part of why you should be careful here is: semantics. With tables it's pretty straight forward to tell what it actually means to join two tables in your data model.

With aggregated and processed data this might not be as straight forwards - but you have to define it, since otherwise the result of the join doesn't mean anything...

Former Member
0 Kudos

Hi Lars,

Quite interesting facts and helpful.

Thanks