In-Memory Developer Center HANA Servers Outage

Former Member
0 Kudos

Dear Developer Center users,

It took us a month to get to our first major system outage - some of our HANA servers lost the connection to their NAS and stopped working. That's what I would do if I was a database and couldn't find my data files. Even for an in-memory database, that's no good...

We are investigating the issue and I'll reply to this post when the systems are back up.

thanks for your patience and sorry for the inconvenience

--Juergen

Accepted Solutions (0)

Answers (3)

RonaldKonijnenb
Contributor
0 Kudos

Hi,

Anybody facing errors when creating a new analytic or attribute view? Getting the following error:

[http://dl.dropbox.com/u/11677676/7-12-2011%2015-41-35.png]

tomas-krojzl
Active Contributor
0 Kudos

Hello,

see:

grant select on schema s0001432066 to _SYS_REPO with grant option

Tomas

Former Member
0 Kudos

Dear Developer Center users,

As of now, all systems are back up and running. We even have "good" backups. The bad news is that all data and all models created on DCB are gone. You will have to re-create everything. We apologize for the inconvenience.

If this means you need more time to evaluate HANA, feel free to send an email to inmemorydevcenter AT sap.com and we will extend your trial.

If this means your trust in HANA as a database is all gone, please reconsider - we really just had a vicious combination of hardware failure (the storage system that holds the HANA data volumes crashed) and lousy administration (I didn't review the backups for a while) that led to the loss of data. Add "no redundancy / high availability" because we're talking sandbox systems, and you know what happened and how to avoid it in production environments...

+++ Important Notice +++

We are still unsure about the NAS - the malfunction might come back. If you re-load data or build models, please do your own local backups by using the "Export..." function in the studio. We may have to replace the storage... For now, I am putting all systems back online, as they have been behaving nicely over the last 8 hours...

cheers,

--Juergen

Former Member
0 Kudos

Hello Juergen

Thanks for the updates.

I have a few queries.

We even have "good" backups. The bad news is that all data and all models created on DCB are gone

a) Is the system a fresh system, as you suggested in your previous post?

b) I can't see any tables; are they gone too? The schemas seem to be preserved.

c) Assuming the system is fresh: is it possible, or do you plan, to import data models from the backups? This may be helpful for those who have done complex modelling. Or at least provide a way to recover them as a reference.

Thanks

Anand

Former Member
0 Kudos

Hi Anand

a) yes, it's a fresh install with the initial dataset (SFLIGHT, UNITCONVERSION, EFASHION)

b) we re-created all users and schemas, but tables, views and models are gone

c) we have a copy of the data and log files and will see if we can reconstruct anything from it. I wouldn't count on it, though

--Juergen

Former Member
0 Kudos

Status update:

DCA, DCC, DCD are back. DCB not yet, seems the database got corrupted. I'll have smarter people than me look at it and update the status again...

--juergen

Former Member
0 Kudos

Hi Juergen,

is there a way to help?

thx,

greg

rajkumar_kandula
Active Contributor
0 Kudos

Dear Juergen,

As DCB is still down, we are not able to work in our SAP HANA sandbox systems. In this scenario, will our trial period be extended, or do we lose the days of the system outage? Thanks in advance.

Regards,

Raj

Former Member
0 Kudos

Hi Raj

We will certainly extend the trial usage for anybody affected by the outage. Right now, our main priority is bringing the systems back up.

cheers

--Juergen

Former Member
0 Kudos

Hi Greg,

Thanks for the offer. Right now, all we can ask for is a little more patience. Let me give you - and everyone - a quick status update:

- good news is that all servers are showing "green" status right now

- bad news is that we lost all the data on DCB - the database was corrupted and unrecoverable. To make things even worse, we didn't have a good backup. We were doing daily backups, or at least we thought so. After the update to HANA SPS3, things stopped working. We had to change the batch files to a new syntax (instead of ALTER SYSTEM BACKUP... you have to write BACKUP DATA USING FILE...) - but even after this change, the daily backups didn't happen. Blame your friendly part-time dev center administrator (yes, that would be me) for not checking

- We are still investigating why the storage subsystem failed. Currently, a hardware malfunction cannot be ruled out, therefore the problem may come back. For this reason, I have not announced availability yet.

stay tuned...

--Juergen
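
The failure described above (scheduled backups silently not running, and nobody checking) is the kind of thing a small freshness check can catch. The following is only a minimal sketch in Python, not what the dev center actually runs. It assumes the backup job writes its files into a single directory and that file modification time is a good enough signal. The function names, the example directory path and the 26-hour threshold are hypothetical choices made for illustration.

```python
import time
from pathlib import Path


def newest_backup_age_hours(backup_dir, now=None):
    """Return the age in hours of the most recently modified file in
    backup_dir, or None if the directory contains no files."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(backup_dir).iterdir() if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def backup_is_fresh(backup_dir, max_age_hours=26.0, now=None):
    """True only if at least one backup file exists and the newest one is
    no older than max_age_hours (hypothetical default: daily job plus slack)."""
    age = newest_backup_age_hours(backup_dir, now=now)
    return age is not None and age <= max_age_hours
```

A scheduled job could call `backup_is_fresh("/path/to/backups")` (a placeholder path) once a day and send an alert when it returns `False`. Note that this only shows that a file was written recently. It does not show that the backup can actually be restored, which still needs a periodic test restore.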

tomas-krojzl
Active Contributor
0 Kudos

Hello,

bad news is that we lost all the data on DCB - the database was corrupted and unrecoverable. To make things even worse, we didn't have a good backup

Well, if everyone on that server lost their data and it cannot be recovered, then maybe the users could be distributed across the other SAP HANA databases. It will not bring their data back, but at least they will not have to wait for the server to be declared stable...

(Just idea - do not kick me. :-D)

Tomas

tomas-krojzl
Active Contributor
0 Kudos

Hello,

another option could be to use the last valid backup (if you have any backup from before the upgrade), reinstall HANA at the original revision and repeat the upgrade... This would at least return the data to its state from before the upgrade... But I guess all backups are overwritten by now... (This is always the worst - to discover that all your recent backups are corrupted and the older backups that were valid have expired. :-x)

Anyway, I would be happy if you could share some experience gained from this issue. I believe that you are so far probably the only one who has experienced SAP HANA database corruption. What exactly did it look like - what exactly did you try - are there diagnostic tools like DBV on Oracle - etc.

Tomas

Former Member
0 Kudos

Friends,

The schema specific to my user id has vanished from the Catalog, any idea? I can see other users' schemas but not mine

please advise.

thanks,

Tilak

tomas-krojzl
Active Contributor
0 Kudos

Hello,

The schema specific to my user id has vanished from the Catalog, any idea? I can see other users' schemas but not mine

Well, one option is that you are on the DCB database, which lost its data...

Another option is that not all objects are listed - see explanation from Juergen for similar issue:

I would assume your home schema does indeed exist, but is not visible in the studio - in the current builds, the studio does not show all schemas in the system by default, and I have not figured out the algorithm it uses to choose the ones to show initially. You can choose Window -> Preferences -> Administration Console -> Catalog and check the "Fetch all database catalog objects" option. Or you can right-click the Catalog node in the Navigator view and set a custom filter. As filter, enter your S-UID plus all the tutorial schemas (e.g. SFLIGHT, UNITCONVERSION, EFASHION_TUTORIAL) you want to see - this is probably the most elegant option

See http://forums.sdn.sap.com/click.jspa?searchID=75312917&messageID=10868040

Tomas

Former Member
0 Kudos

Hi Tilak

I agree with Tomas - if you don't see your schema, it's probably an implicit filter. If your schema is empty, it's the complete data loss we experienced on DCB (see other post in this thread).

--Juergen

Former Member
0 Kudos

Hi Tomas

I do indeed have a 2-week-old backup of Rev 17. Due to the immense growth of the system over the last two weeks (we're now at more than 400 users), it's not really of big value for most of the DCB users and would cause further delays. We have decided to go back online with a fresh system.

Regarding your question about experiences: unfortunately, not much. After the outage, the DCB database did not start up, complaining about a failure in the statisticsserver. I tried deactivating the statisticsserver, but then the indexserver crashed with absolutely no message in the log. We did some further checks and we do have a copy of the data files, but loading these 80GB from Santa Clara to Walldorf is challenging.

We'll wait to hear the root cause of the failure - if it's a hardware malfunction, there's no complaining about HANA. If it turns out not to be hardware-related, we can have the HANA dev team check the data, config and log files for inconsistencies. As this would mean weeks of outage, we decided to put the dev center back online as soon as possible...

--Juergen