How does HANA handle Out of Memory (OOM)?

Former Member
0 Kudos

Dear HANA experts.

How does HANA handle OOM (Out of Memory) situations? Could you provide more details about this?

Accepted Solutions (0)

Answers (5)


Former Member

As I see it, the situation is better in rev 72, but OOMs are still possible.

OOM situations are one of HANA's main weak points.

And it doesn't matter how much memory you have - 1TB, 2TB, 8TB.

Hope the HANA dev team is working to change this situation.

Former Member

We opened a ticket with SAP to clarify this and are waiting for news:

==--==-==-

If an OOM happens and the indexserver service is restarted, will all current sessions on the server be terminated?

Former Member

We've upgraded to the latest rev 69 PL1.

Yesterday we got an OOM because of DSO activation (this process is very RAM-hungry).

After the indexserver was automatically restarted, we had problems with log backups (the log backup process hung). The only solution was to restart HANA. We have opened a ticket about this.

Colleagues, OOM handling in HANA is a nightmare. We hope SAP can fix this situation and make OOM handling less fatal. We can't restart a HANA scale-out every time an OOM occurs.

Former Member

Which revision of HANA are you running? On my system, the bad session gets killed. An index server restart is a bug, so please raise an OSS message if you get this.

John

Former Member

On rev 67.

> On my system, the bad session gets killed

On mine too, but all other sessions were killed as well - that's the issue.

We have opened a couple of OSS messages.


Former Member

Yes that's definitely a bug. I've seen this but only when running several large queries and they all bomb out of memory. Usually when I mess up the query and run something which materializes a vast amount of memory!

John

Former Member

So I was doing some testing for an unrelated matter, to see what happens in OOM situations in HANA. In my instance I caused a large number (100+) of very expensive queries to run.

Note that I am using HANA Rev.69, which contains a number of fixes for various scenarios. It's possible that our revisions behave differently.

These queries would need hundreds of TBs of memory to actually run, because they cause massive materializations. I can run one or two of these concurrently. With 100 concurrent, we expect queries to fail.

During this, I start to run a few smaller queries which access a lot of data, but which would normally run in 3-4 seconds.

I find that all my big queries fail (expected) as follows:

* 2048: column store error: search table error:  [9] Error executing physical plan: Memory allocation failed;in executor::Executor in cube: _SYS_BIC:demo.rca.data/AV_BOC_RCA_TRANS_CUST SQLSTATE: HY000

(the same error was returned for each of the failing queries)

But interestingly, I find that my smaller queries complete, but 5-6x slower than usual.

Statement 'SELECT GENDER, NAME, SUM(TXAMOUNT)/COUNT(TXAMOUNT) AS AVG_SPEND FROM ...'

successfully executed in 18.200 seconds  (server processing time: 18.198 seconds)

Fetched 211 row(s) in 2 ms 164 µs (server processing time: 0 ms 674 µs)
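This per-statement failure behavior - one oversized statement gets its own allocation error while the server and other sessions keep going - can be sketched as a toy memory-budget model (all names and sizes here are made up for illustration, not HANA internals):

```python
# Toy model of per-statement OOM handling (illustrative only, not HANA code):
# an oversized statement fails with an allocation error, the server survives.

class MemoryAllocationFailed(Exception):
    """Raised for a single statement; other sessions keep running."""

class Server:
    def __init__(self, total_mb):
        self.total_mb = total_mb
        self.used_mb = 0

    def run_statement(self, needed_mb):
        if self.used_mb + needed_mb > self.total_mb:
            # Analogous to "[9] Error executing physical plan:
            # Memory allocation failed" - only this statement dies.
            raise MemoryAllocationFailed(
                f"needs {needed_mb} MB, only {self.total_mb - self.used_mb} MB free")
        self.used_mb += needed_mb
        try:
            return f"ok ({needed_mb} MB)"
        finally:
            self.used_mb -= needed_mb  # working memory is released afterwards

server = Server(total_mb=1000)
print(server.run_statement(200))       # a small query succeeds
try:
    server.run_statement(5000)         # the "hundreds of TBs" case, scaled down
except MemoryAllocationFailed as e:
    print("statement failed:", e)
print(server.run_statement(200))       # the server is still alive afterwards
```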

I think this is fairly good behavior on HANA's part? Hope this helps.

John

former_member184768
Active Contributor

Hi John,

Thanks for a very informative post. This will definitely help. If it's not too much to ask, may I request one more thing?

Can you please also try a massive data load at the same time as the concurrent query executions? Such a scenario has caused the indexserver to fail on my system. We are in the process of upgrading to rev 69 and will also try similar scenarios in the days to come.

Once again, thanks for sharing your experience.

Regards,

Ravi

Former Member

So I do this regularly. I've not seen indexserver crashes in this scenario, even going back to Rev.52.

I'd make a few notes:

- You should only do bulk loads (40-80 threads) when there are no users on the system. Bulk loads are very resource-intensive and cause major latency problems in query execution because you have very large delta stores. With bulk loads you should expect to get around 1-5m rows/sec depending on the table width.

- If you want good concurrent query execution then try loading in a single thread. You will still get a reasonable load rate - 100k+ rows/sec - and in my tests it has a negligible impact (10-20%) on query response.
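As a rough feel for what those quoted rates mean in practice (the 2-billion-row table below is a made-up example; the rates are taken from the notes above):

```python
# Back-of-the-envelope load times using the rates quoted above.
# The 2-billion-row table is a hypothetical example.
rows = 2_000_000_000

bulk_rate = 2_000_000     # rows/sec: mid-range of the quoted 1-5m (40-80 threads)
single_rate = 100_000     # rows/sec: the quoted single-thread rate

print(f"bulk load:          {rows / bulk_rate / 60:.0f} minutes")    # ~17 minutes
print(f"single-thread load: {rows / single_rate / 3600:.1f} hours")  # ~5.6 hours
```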

In my tests we were doing as many as 250k inserts/sec using ESP (multiple threads to multiple tables). What are your requirements for concurrent loading and reporting?

Regards,

John

Former Member

> I've seen this but only when running several large queries and they all bomb out of memory. Usually when I mess up the query and run something which materializes a vast amount of memory!

But that's a normal situation. Any possibility of hanging the indexserver must be eliminated.

Former Member

Agreed - I really recommend you upgrade to Rev.69. Indexserver is better than ever. I talked to the dev team this past week and heard there is more good news coming in this area - I can't share the details.

Former Member

OK, if that's so - good news.

Soon there will be two releases:

  1.  Rev 69 Patch 1

  2.  SP7

Hope these indexserver changes will be in both releases.

Former Member

69 would be nice, but I don't expect it 2 days after 68 became available to AWS mortals. I'm open to any pleasant surprises, however.

Former Member

Sure. When you reach 95% memory, HANA unloads partition-columns in a Least Recently Used model.

In most instances we find that the partition-columns unloaded weren't being used so there's no impact.

If you have a situation where your working set of partition-columns plus your calculation memory required for the queries coming in exceeds available RAM, then bad things happen. Much like in any other computer software ever made 🙂
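The policy described above - unload the least-recently-used columns once a high memory watermark (95% here) is reached - can be sketched roughly as follows. This is a simplified illustration with invented names and sizes, not HANA's actual mechanism:

```python
from collections import OrderedDict

# Simplified sketch of least-recently-used unloading at a memory watermark
# (95% as described above). Illustrative only, not HANA's real implementation.

class ColumnCache:
    def __init__(self, capacity_mb, watermark=0.95):
        self.capacity_mb = capacity_mb
        self.watermark = watermark
        self.loaded = OrderedDict()  # column -> size in MB; order = recency

    def touch(self, name, size_mb):
        """Load (or re-use) a column, unloading LRU columns if needed."""
        if name in self.loaded:
            self.loaded.move_to_end(name)  # mark as most recently used
            return []
        unloaded = []
        # Unload least-recently-used columns until we fit under the watermark.
        while (self.loaded and
               sum(self.loaded.values()) + size_mb
               > self.capacity_mb * self.watermark):
            victim, _ = self.loaded.popitem(last=False)
            unloaded.append(victim)
        self.loaded[name] = size_mb
        return unloaded

cache = ColumnCache(capacity_mb=100)
cache.touch("SALES.CUSTOMER", 40)
cache.touch("SALES.AMOUNT", 40)
cache.touch("SALES.CUSTOMER", 40)          # re-used: most recently used again
print(cache.touch("SALES.TXAMOUNT", 30))   # unloads the least-recently-used column
```

If the unloaded column wasn't part of the working set, nobody notices; if it was, the next query pays the reload cost, which matches the behavior described above.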

If you have a situation where your query consumes all available RAM, it will terminate with an error.

John

Former Member

We have found that HANA has strange behavior on OOM.

In a classical RDBMS there is a session-oriented approach - if a session misbehaves, the RDBMS kills that session.

In HANA there is a server-oriented (indexserver) approach.

==--==-==-

If an OOM happens, the indexserver service is restarted and all current sessions on the server are terminated.

So you lose everything. Imagine that you have 1000+ sessions in a HANA ERP system and one session hits an OOM. You lose them all - and hear the disastrous screaming of your users.

SAP seems to need a more failure-resilient OOM handling approach in SAP HANA.
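The contrast being complained about here - per-session termination versus a whole-process restart - in a toy sketch (purely illustrative, not HANA's implementation):

```python
# Toy contrast between the two OOM-handling strategies described above
# (purely illustrative, not HANA's implementation).

def oom_session_kill(sessions, bad_session):
    """Classical RDBMS style: terminate only the misbehaving session."""
    return [s for s in sessions if s != bad_session]

def oom_server_restart(sessions, bad_session):
    """Server-restart style: the process restart drops every session."""
    return []

sessions = [f"session_{i}" for i in range(1000)]
print(len(oom_session_kill(sessions, "session_42")))    # 999 sessions survive
print(len(oom_server_restart(sessions, "session_42")))  # 0 sessions survive
```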