Does HANA have a result cache

Dear HANA experts?

Does HANA have a result cache?

http://events.sap.com/teched/en/session/8596 (on 20.40)

"Prepare one execute often"

How i can identify that query used result cache data instead full execution (prepare plan, execute in engines, return results)?

SAP Managed Tags:
SAP HANA

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Accepted Solutions (0)

Answers (2)

HI experts,

I have a question.

Does HANA have another cache like data cache and procedure cache ?

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Hey Kazuki

As the memory is the primary data storage for HANA a "data cache" simply doesn't make any sense doesn't it?

I am not too sure about the exact job of your 'procedure cache' but you may simply check system view M_CACHES to find the currently used caches.

A rather often used cache is e.g. the hierarchy cache, which is used whenever one of the "../hier"-views is queried.

- Lars

Hi Lars,

If the result set is cached, it most likely be in aggregated form vs. the base data in memory being more granular. I'm assuming the cache would avoid the need to aggregate the granular data and utlimately resulting in faster query execution. So wouldn't this free up workspace memory during query execution as the "aggregaton" work wouldn't have to be executed each time? If many users are running the same query, couldn't the workspace consumption for aggregation be large?

Also, can the OLAP cache on the BW application server also provide the same cache functionality?

Regards,

Dae Jin

The result cache does keep query results readily available just as they were requested.

If the query performed aggregations, then the data is aggregated.

I am pretty sure that unloading any of the required tables/columns for the query would lead to cache invalidation.

But sure enough: when many users want to use the very same result of this query and actually send their query in a compatible way that allows for cache-reusage then they can benefit from the already computed result set.

In fact: this is what the result cache is all about.

Concerning the OLAP cache comparison: BW on HANA (and BW in general) does a lot more than just sending SQL queries to the underlying database. The OLAP layer performs major computations.

The OLAP cache actually stores these computation results, which makes the OLAP cache the fastest caching option for BW on HANA. This will even skip the database access alltogether.

- Lars

Hi Mikhail,

Sure, HANA does have a possibility of having resultset cache. The SAP Note http://search.sap.com/notes?id=0001833049 (although related to the distributed landscape) mentions the parameter in indexserver.ini, section [cache], resultcache_enabled = yes.

I am not sure if this setting, should be enabled as it may display the result from the cached resultset and may not always be correct, if the data gets updated.

I think by default the value is set to no, but you can check yourself.

Regards,

Ravi

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

( ! ) Realy - by default this parameter (resultcache_enabled) is no.

It's strange.

Interesting why?

How identify result_cache size? Any system views?

With the cached result set, the data changes will not be reflected unless the cache is invalidated. Although once the result set is cached, the subsequent execution of the queries requiring the same resultset will be faster, but such setting is not recommended for the correctness of the data.

Regards,

Ravi

Hi Mikhail, you can use the following SYS views to get some more information on result cache.

M_CACHE_ENTRIES - has the size of the Cache

M_CACHES - has some hit/miss counts

Playing with some similar functionality to improve repeated Explorer views, will let you know how it goes.

Regards,

Justin

Just to add: there are more caches in SAP HANA than the general result cache that get listed in these views.

MDX for example or hierarchies build up caches that can be monitored with these views.

Concerning the result cache in general: there are only a few usage scenarios that actually would benefit from it.

Either queries are fast enough anyway (so the benefit is just not there) or the data changes too quickly (the cached entries get invalidated).

That's why it is not active by default.

I thought about not including the cache in the diagram but put it in for the sake of a more complete picture.

- Lars

Just to add more onto this discussion. I was testing with the result cache on queries just to see what kind of performance bump it would offer.

What I saw is that it was pretty significant. One two separate queries that ran in 20s and 3s respectively, I saw the performance drop to about 15ms for each. Through a client tool (Explorer), I was able to get the initial load from 4s to .071s, which is incredible from a user perspective. Given there is probably tuning exercises that should be embarked on to make the model more perormant on its own, but this gives other options.

Before caching

After Caching

So looks like once cached, the speed is more or less fixed to retrieve the data set from the previously cached result. In my case, with testing queries of various size, it took about 15ms (at DB) to retrieve any of the results. As Lars said, for some use cases this might make sense, and certainly it does boost performance quite a substantial amount.

There are a couple parameters involved here that can help enable and control the use of ResultCache

1) indexserver.ini --> cache --> resultcache_enabled = yes

2) indexserver.ini --> cache --> resultcache_white_list (the parameter has to be added since it doesn't exist by default). This allows you to specifically name models that should have caching enabled, very important so that its not enabled systemwide.

3) indexserver.ini --> cache --> resultcache_maximum_value_size_in_bytes. By default size is 1,048,576 (1MB) by default.

If I could - Lars could you help shed some light or point me to any documents that may help answer the following?

1) Regarding the parameters above, with item #3, this seems to be a PER query cache limit. I observed that the total cache size (SELECT * FROM "SYS"."M_CACHES"), that this continues to increase past the limit, so my assumption is that it is per query. Is there a way to limit the global resultcache size? Furthermore, is there maintenance or cleanup activities that can be performed to delete these entries over time?

2) What actually causes a resultcache entry to be invalidated? Is it the merge of the underlying tables or some other activity? In your response, you mention that this may not be useful for data that changes too frequently, so I am just wondering what the invalidation looks like.

Anyway - hope this can be useful for someone. I can definitely see some strong benefit in some scenarios.

Regards,

Justin

Hey Justin,

good that somebody wraps up his sleeves and tries out the stuff

Not sure from the top of my head about the result cache size settings - I would have guessed that this would be a global size. Maybe it's again more something like a guideline then a strict rule (you know, if we have the memory, why not just use it until we need it for something else?).

The invalidation of the cache of course needs to depend on the queries or more precisely on the transactions the queries run in. If the underlying data had been changed in whatever way (technically speaking, if the maximum change transaction of the underlying data is newer than the one of the cache entries *and* the transaction that wants to read the data is also newer - very likely the case - then the cache cannot be used for this query).

If no transactions exist anymore that could potentially be interested in the cache data, the entries are free to be released (when ever that might occur - no hurry, though).

In layman terms: when you want to use the query cache on real time replicated data, you will only see invalid cache entries all the time, since the data is constantly changed.

As operational reporting models are one of the more tricky modelling challenges, where you would like to have something like that, it's a bit disappointing to not be able to leverage the cache for that.

On the other hand, when you load in a controlled way and can accept the slower first query execution for the cache filling, then it might be useful.

Really, it's *very* close to what the OLAP cache in SAP BW does.

Cheers,

- Lars

Hi Lars,

as you correctly state, "when you load in a controlled way and can accept the slower first query execution for the cache filling" caching is very useful indeed. A we load nighty on a daily schedule a function to reset the cache after load would be nice instead to reconfigure HANA manually. Is there such a function?

best regards, Nikolaus

Hi Nikolaus.

the result cache will be invalidated automatically - no need to do that manually.

If you want to re-fill the cache after your data load, well, simply run your query...

Somewhat unrelated I am wondering why you pick HANA for such a solution where you are actually only looking at old data? Loading once per day and preparing the query results can be done with any other platform - much cheaper ones.

I get the impression that HANA is not actually implemented to its full potential here...

- Lars

Hi Lars,

Thank you for your quick response. I see that i misunderstood the sentence about seeing invalid cache entries all the time when loading data.

btw: HANA does a great job, as we are joining tables with hundreds of millions up to more than a Billion lines in interactive sessions with near-instant response.

best regards,

Nikolaus

Hi Lars

I turned on indexserver->cache with

resultcache_enabled = yes

resultcache_maximum_value_size_in_bytes=100,000,000

resultcache_minimum_query_execution_time_in_milliseconds=100

After this I can see cache getting populated in M_CACHES. However, the hit count is always 0.

In M_CACHE_ENTRIES, I see numerous rows with my query (CALL ...proc name...). The rows have identical memory_size which is expected because they are executing identical queries. But no hits. Why is that?

I am using XSJS (on Rev 82) to execute a stored procedure from a prepared statement. The bound input parameters are (1) a local temp table which is filled with identical values in every call (2) several nvarchar.

Regards

Dinesh

ps: at the time of this test I am the only user on this system, if this matters.

Would you mind creating a new thread instead of resurrecting some old discussion?

Sure Lars, I have started a new thread at

Regards

Dinesh

Ask a Question

Top Q&A Solution Author

User

Count