
SAP HANA and In-Memory Computing

Jenny Lee

HANA Health Check

Posted by Jenny Lee Oct 31, 2014

     For this blog, I would like to focus on some basic health checks for HANA. These checks can give you a good idea of how your HANA system is performing. We will go through some SQL statements and the thresholds that determine the status your HANA system is in. Knowing how the HANA system is performing allows us to plan ahead and avoid unnecessary system disasters.

 

 

 

System Availability:

 

The following query shows you how many times each service was restarted, by date and hour, within the analyzed period.

select to_dats(to_date("SNAPSHOT_ID")) AS "DATE", hour("SNAPSHOT_ID") AS "HOUR",
SUBSTR_BEFORE(T1.INDEX, RIGHT(T1.INDEX, 5)) AS "HOST", RIGHT(T1.INDEX, 5) AS "PORT",
T2.SERVICE_NAME, count("ALERT_ID") AS "NUM_RESTART"
from "_SYS_STATISTICS"."STATISTICS_ALERTS" T1
JOIN "SYS"."M_VOLUMES" T2
ON SUBSTR_BEFORE(T1.INDEX, RIGHT(T1.INDEX, 5)) = T2.HOST AND RIGHT(T1.INDEX, 5) = T2.PORT
WHERE ALERT_ID = '004' AND SNAPSHOT_ID >= add_days(now(), -14)
GROUP BY to_date("SNAPSHOT_ID"), hour("SNAPSHOT_ID"), SUBSTR_BEFORE(T1.INDEX, RIGHT(T1.INDEX, 5)), RIGHT(T1.INDEX, 5), T2.SERVICE_NAME
ORDER BY to_date("SNAPSHOT_ID") DESC, hour("SNAPSHOT_ID") DESC

 

STATUS / THRESHOLDS

RED: Name server is not running; name server or index server had 3 or more restarts in the analyzed period

YELLOW: Statistics server is not running; name server or index server had up to 2 restarts in the analyzed period; remaining servers had 2 or more restarts in the analyzed period

GREEN: All other cases

 

The example below shows that this standalone test system got restarted once on October 22nd, twice on October 21st at around 11pm and another two times at around 10pm. In total, there are 3 restarts of the indexserver and nameserver in the analyzed period. If the nameserver is currently not running, then this will be rated as RED. To find out whether the database was restarted manually or for some other reason, you may go to the index server and name server traces for more information. If you need further assistance, please consider opening an incident with Product Support.


systemAvail3.png

 

Top 10 Largest Non-partitioned Column Tables (records)

The following query displays the top 10 non-partitioned column tables and how many records exist in each.

 

SELECT top 10 schema_name, table_name, part_id, record_count from SYS.M_CS_TABLES where schema_name not LIKE '%SYS%' and part_id = '0' order by record_count desc, schema_name, table_name

STATUS / THRESHOLD

RED: Tables with more than 1.5 billion records exist

YELLOW: Tables with more than 300 million records exist

GREEN: No table has more than 300 million records


The threshold chart shows that a column table with more than 300 million records gets a YELLOW rating. This is not yet critical with regard to the technical limit of 2 billion records, but you should consider partitioning those tables that are expected to grow rapidly in the future to ensure parallelization and sufficient performance. For more information, please refer to the SAP Notes below or the SAP HANA Administration Guide.
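For illustration only (the schema, table, and column names below are placeholders, not taken from this blog), partitioning a large column table and checking the result could look roughly like this:

-- Hash-partition an existing column table on a high-cardinality column (placeholder names)
ALTER TABLE "MYSCHEMA"."MYBIGTABLE" PARTITION BY HASH ("DOC_ID") PARTITIONS 4;
-- Verify the resulting partitions and their record counts
SELECT SCHEMA_NAME, TABLE_NAME, PART_ID, RECORD_COUNT FROM SYS.M_CS_TABLES WHERE SCHEMA_NAME = 'MYSCHEMA' AND TABLE_NAME = 'MYBIGTABLE';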

 

Useful SAP Notes:

- 1650394  - SAP HANA DB: Partitioning and Distribution of Large Tables

- 1909763 - How to handle HANA Alert 17: ‘Record count of non-partitioned column-store tables’

 

Top 10 Largest Partitioned Column Tables (records)

This check displays the 10 largest partitioned column tables in terms of the number of records.


select top 10 schema_name, table_name, part_id, record_count

from SYS.M_CS_TABLES

where schema_name not LIKE '%SYS%' and part_id <> '0'

order by record_count desc, schema_name, table_name

STATUS / THRESHOLD

RED: Tables with more than 1.9 billion records exist

YELLOW: Tables with more than 1.5 billion and fewer than 1.9 billion records exist

GREEN: No table has more than 1.5 billion records

 

The recommendation is to consider re-partitioning a table after it has passed 1.5 billion records, as the technical limit is two billion records per table. If a table has more than 1.9 billion records, you should re-partition it as soon as possible. For more information, please refer to the SAP Note below or the SAP HANA Administration Guide.
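As a sketch (again with placeholder names), re-partitioning into more hash partitions is also an ALTER TABLE statement; the right partition count and partitioning key should follow SAP Note 1650394 and the Administration Guide:

-- Re-partition a growing table into more hash partitions (placeholder names; this can be a long-running operation)
ALTER TABLE "MYSCHEMA"."MYBIGTABLE" PARTITION BY HASH ("DOC_ID") PARTITIONS 8;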

 

Useful SAP Notes:

-   1650394  - SAP HANA DB: Partitioning and Distribution of Large Tables

 

Top 10 Largest Column Tables in Terms of Delta size (MB):

This check displays the 10 largest column tables in terms of the size of the delta and history delta stores.


select top 10 schema_name, table_name, part_id, round(memory_size_in_main /(1024*1024),2), round(memory_size_in_delta/(1024*1024),2), record_count, RAW_RECORD_COUNT_IN_DELTA from SYS.M_CS_TABLES

where schema_name not LIKE '%SYS%'

order by memory_size_in_delta desc, schema_name, table_name

STATUS / THRESHOLD

RED: MEMORY_SIZE_IN_DELTA > 10 GB

YELLOW: MEMORY_SIZE_IN_DELTA >= 5 GB and <= 10 GB

GREEN: MEMORY_SIZE_IN_DELTA < 5 GB

 

The mechanism of main and delta storage allows high compression and high write performance. Write operations are performed on the delta store, and changes are moved from the delta store to the main store asynchronously during a delta merge. The column store performs a delta merge if one of the following events occurs:

- The number of lines in delta storage exceeds the specified limit

- The memory consumption of the delta storage exceeds the specified limit

- The delta log exceeds the defined limit

 

Ensure that delta merges for all tables are enabled, either by automatic merge or by application-triggered smart merge. In critical cases, trigger forced merges for the affected tables (see the example below). For more detail, please refer to the following SAP Note or the SAP HANA Administration Guide.
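A minimal sketch of such a forced merge, using placeholder schema and table names:

-- Trigger an immediate, forced delta merge for a single table (placeholder names)
MERGE DELTA OF "MYSCHEMA"."MYBIGTABLE" WITH PARAMETERS ('FORCED_MERGE' = 'ON');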

 


Useful SAP Notes:

- 1977314 - How to handle HANA Alert 29: 'Size of delta storage of column-store tables'

 

CPU Usage:

To check the CPU usage in relation to the available CPU capacity, you can go to the Load Monitor from SAP HANA Studio.

STATUS / THRESHOLD

RED: Average CPU usage >= 90% of the available CPU capacity

YELLOW: Average CPU usage >= 75% and < 90% of the available CPU capacity

GREEN: Average CPU usage < 75% of the available CPU capacity

hostCPU.png

 

The Load Graph and the Alerts tabs can provide information on the time frame of the high CPU consumption. If you are not able to determine the time frame because the issue happened too long ago, check the following StatisticsServer table, which includes up to 30 days of historical host resource information:

 

"_SYS_STATISTICS"."HOST_RESOURCE_UTILIZATION_STATISTICS"

 

With the time frame, you may search through the trace files of the responsible process, as they will provide indications of the threads or queries that were running at the time. If the high CPU usage is a recurring issue caused by scheduled batch jobs or data loading processes, you may want to turn on the Expensive Statements trace to record all involved statements. For recurring background jobs like backups and delta merges, you may want to analyze the system views "SYS"."M_BACKUP_CATALOG" and "SYS"."M_DELTA_MERGE_STATISTICS" or "_SYS_STATISTICS"."HOST_DELTA_MERGE_STATISTICS".

 

For more information, please refer to the following SAP Note and also the SAP HANA Troubleshooting and Performance Analysis Guide.

 

SAP Note:

- 1909670 - How to handle HANA Alert 5: ‘Host CPU Usage'

 

 

Memory Consumption:

To check the memory consumption of tables compared to the available allocation limit, you may go to the Load Monitor from HANA Studio.

 

memoryUsage2.png

 

STATUS / THRESHOLD

RED: Memory consumption of tables >= 70% of the available allocation limit

YELLOW: Memory consumption of tables >= 50% of the available allocation limit

GREEN: Memory consumption of tables < 50% of the available allocation limit

 

As an in-memory database, it is critical for SAP HANA to handle and track its memory consumption carefully and efficiently; therefore, the HANA database pre-allocates and manages its own memory pool. The key memory concepts in HANA are physical memory, allocated memory, and used memory.

- Physical Memory: The amount of physical (system) memory available on the host.

- Allocated Memory: The memory pool reserved by SAP HANA from the operating system

- Used Memory: The amount of memory that is actually used by HANA database.
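To put numbers behind these concepts, the monitoring views can be queried directly; the following is a sketch (column names may differ slightly by revision):

-- Used memory versus effective allocation limit per service
SELECT HOST, SERVICE_NAME, ROUND(TOTAL_MEMORY_USED_SIZE/1024/1024/1024, 2) AS "USED_GB", ROUND(EFFECTIVE_ALLOCATION_LIMIT/1024/1024/1024, 2) AS "LIMIT_GB" FROM SYS.M_SERVICE_MEMORY;
-- Total memory currently consumed by column-store tables
SELECT ROUND(SUM(MEMORY_SIZE_IN_TOTAL)/1024/1024/1024, 2) AS "COLUMN_TABLES_GB" FROM SYS.M_CS_TABLES;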

 

Used Memory serves several purposes:

- Program code and stack

- Working space and data tables (heap and shared memory): the heap and shared area is used for working space, temporary data, and storing all data tables (row and column store tables).

 

For more information, please refer to the following SAP Note and also the SAP HANA Troubleshooting and Performance Analysis Guide.

 

Useful SAP Note:

- 1999997 - FAQ: SAP HANA Memory

 

HANA Column Unloads:

 

Check column unloads on the Load Graph under the Load tab in the SAP HANA Studio. This graph will give you an idea of the time frame of any high column-unload activity.

STATUS / THRESHOLD

RED: >= 100,000 column unloads

YELLOW: >= 1,001 and < 100,000 column unloads

GREEN: <= 1,000 column unloads

 

Column store unloads indicate that the memory requirements exceed the memory currently available in the system. In a healthy situation, the executed code may request a reasonable amount of memory and require SAP HANA to free up rarely used memory resources. However, a high number of table unloads will have an impact on performance, as the tables need to be fetched again from disk.
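To see which tables were unloaded and why, you can query the unload monitoring view; a simple sketch:

-- Count recent column unloads by reason ('LOW MEMORY' points to memory pressure)
SELECT REASON, COUNT(*) AS "NUM_UNLOADS" FROM SYS.M_CS_UNLOADS WHERE UNLOAD_TIME >= ADD_DAYS(NOW(), -1) GROUP BY REASON;
-- Most recently unloaded tables
SELECT TOP 10 UNLOAD_TIME, SCHEMA_NAME, TABLE_NAME, REASON FROM SYS.M_CS_UNLOADS ORDER BY UNLOAD_TIME DESC;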

There are a couple of things to look for.

 

- If the unloads happen on the statistics server, it might be that the memory allocated for the statistics server is not sufficient; most of the time this is accompanied by out-of-memory errors. If this is the case, refer to SAP Note 1929538 - HANA Statistics Server - Out of Memory. On the other hand, if the unload reason is 'Unused resource', you should increase the parameter global.ini [memoryobjects] unused_retention_period (see the example after this list).

 

- If the unloads happen on the indexserver and the reason for the unloads is low memory, it could be any of the following:

1) The system is not properly sized

2) The table distribution is not optimized

3) Temporary memory shortage due to expensive SQL or mass activity
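Coming back to the 'Unused resource' case mentioned above, a sketch of how that parameter can be raised via SQL (the value is an example only, in seconds, not a recommendation):

-- Increase the unused resource retention period
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('memoryobjects', 'unused_retention_period') = '7200' WITH RECONFIGURE;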

 

For more detailed information on this, please refer to SAP Note 1977207.


1977207 - How to handle HANA Alert 55: Columnstore unloads


License Information:

The view M_LICENSE can show the date that the HANA license will expire. You can also check the HANA license information from HANA Studio, right click the HANA system > Properties > License. If the license expires, the HANA system will be in a lockdown state; therefore, it is important to make sure the license is renewed before it expires.

 

select system_id, install_no, to_date(expiration_date), permanent, valid, product_name, product_limit, product_usage FROM "SYS"."M_LICENSE"
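Building on the same view, a small variation (a sketch, not from the original blog) shows how many days remain until expiration:

-- Days left until the license expires (a negative value means it has already expired)
SELECT SYSTEM_ID, TO_DATE(EXPIRATION_DATE) AS "EXPIRATION_DATE", DAYS_BETWEEN(CURRENT_DATE, TO_DATE(EXPIRATION_DATE)) AS "DAYS_LEFT" FROM "SYS"."M_LICENSE";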

 

HANA database supports two kinds of license keys:

1) Temporary license key:

      - It is valid for 90 days.

      - It comes with a new SAP HANA database. During these 90 days, you should request and apply a permanent license key.

2) Permanent license key:

     - It is valid until the predefined expiration date.

     - Before a permanent license key expires, you should request and apply a new permanent license key.

 

For more information and the steps to request a license, please refer to SAP Note 1899480:

 

- 1899480 - How to handle HANA Alert 31: 'License expiry'

I just reached the final credits of In-Memory Data Management (2014) - Implications on Enterprise Systems and I’d like to share my thoughts for each session.

 

I won’t explain the content for each week (you can find a very good explanation here), but I’ll give my impressions of what I liked or learned. It’s totally personal; you may find other topics more interesting.

 

Let’s go:

 

Week 1

Lecture 1 - History of Enterprise Computing

When you start to hear a senior man with white hair talking about tape storage you may think: “What am I doing? I just bought my very modern smartphone and I’m wasting my time listening to this man talk about storing information on… tapes?!” It’s not like that in this case. I always like to listen to Plattner; it’s like a Jedi Master teaching. This introduction is very important for understanding the motivation for and birth of the in-memory database.

 

Lecture 2 - Enterprise Application Characteristics

In my ABAP classes I always teach about OLAP/OLTP and the paradigm of having separate machines with different tuning for each. Here I learned a different story.

 

Lecture 3 - Changes in Hardware

How cheap memory, fast networks and affordable servers enable in-memory computing.

 

Lecture 4 - Dictionary Encoding

Here is one of the key points of SAP HANA: columnar storage. Plattner explains columnar storage and starts to talk about compression.

 

Lecture 5 - Architecture Blueprint of SanssouciDB

A very quick explanation about an academic and experimental database.

 

Week 2

Lecture 1 - Compression

It’s another key point for SAP HANA. You will learn about compression techniques and yes, you will start to do some math to compare them.

 

Lecture 2 - Data Layout

More detail about row vs. column data storage. Pros and cons for each approach and a hybrid possibility.

 

Lecture 3 - Row v. Column Layout (Excursion)

Here we have more of Professor Plattner giving more information about data layout.

 

Lecture 4 - Partitioning

As a geek, I had only heard about partitioning when I wanted to install two operating systems on the same machine. Here I learned a very powerful technique that helps parallelism reach higher levels.

 

Lecture 5 - Insert

Insert command. Under the hood.

 

Lecture 6 - Update

Lots of things to modify, re-order and re-write.

 

Lecture 7 - Delete

Not deleted, just left behind.

 

Lecture 8 - Insert-Only

Worry about the future without forgetting the past.

 

Week 3

Lecture 1 - Select

Projection, Cartesian Product and Selectivity. All the beautiful theory about retrieving data.

 

Lecture 2 - Tuple Reconstruction

Retrieving a tuple in a row database: piece of cake. Retrieving a tuple in a column database: pain in the …

 

Lecture 3 - Scan Performance

Full table scan: row versus column layout. Show me the numbers!

 

Lecture 4 - Materialization Strategies

Materialization: when the attribute vector and dictionary mean something. Here you will learn two strategies for materialization during a query: early and late materialization.

 

Lecture 5 - Differential Buffer

A special buffer to help speed up write operations. Do you remember the insert-only paradigm? It’s about “worry about the future”.

 

Lecture 6 - Merge

When the differential buffer becomes part of the main partition. Do you remember the insert-only paradigm? It’s about “without forgetting the past”.

 

Lecture 7 - Join

Once you learn that retrieving a tuple in a column layout is a pain, you can imagine what doing a join is like. Here you will learn why.

 

Week 4

Lecture 1 - Parallel Data Processing

A very good lesson about parallel data processing. The lecture and reading material cover hardware and software aspects of parallelism. The highlight is MapReduce. I highly recommend you dig deep into it.

 

Lecture 2 - Indices

Presenting the indices of indices: inverted indices. “Using this approach, we reduce the data volume read by a CPU from the main memory by providing a data structure that does not require the scan of the entire attribute vector.” (from the reading material, chapter 18).

 

Lecture 3 - Aggregate Functions

Coming from the old-school ABAP generation, aggregate functions still cause some itching in my ears. However, with the push-down concept everything changed. Can an old dog still learn new tricks?

 

Lecture 4 - Aggregate Cache

In the past everything was simple: storage on disk, cache in memory. Today, storage in memory and cache in... memory too!? Why do I need a cache with an in-memory database? Caching some pre-digested data: that is the aggregate cache.

 

Lecture 5 - Enterprise Simulations

Answering a query insanely fast is only part of the game. Now, enterprise simulations are possible. Change some variables and see the result. Ok, it’s not that simple, but it’s awesome anyway!

 

Lecture 6 - Enterprise Simulations on Co-processors (Excursion)

The awesomeness of enterprise simulations with co-processors. Those born before the internet might remember the 387 co-processor; it’s “almost” the same. In this presentation we see how co-processors can help with highly intensive calculation processing.

 

Week 5

Lecture 1 - Logging

If you think that logging is just for checking what happened in the past, or for checking who was responsible for changing the value that caused the biggest incident in production yesterday, think twice. Logging has a very important role in the recovery process.

 

Lecture 2 - Recovery

The first thing everyone wonders when they learn about in-memory databases is: “What if the power goes down? Will all my database data be wiped out?” Here you learn that it’s true, but you also learn how an in-memory database overcomes that.

 

Lecture 3 - Replication

I remember a very simplistic definition of the ACID concept: “All or nothing in”. In this lecture we see the “all in” concept applied to in-memory databases: how to guarantee ACID in a database stored in RAM.

 

Lecture 4 - Read-only Replication Demo (Excursion)

Replication in action.

 

Lecture 5 - Hot-Standby

It’s a very hot topic (sorry… I won’t do it again). Hot-standby works together with replication in order to guarantee ACID. It’s a good opportunity for you to see why we can say that SAP HANA is a very beautiful piece of engineering.

 

Lecture 6 - Hot-Standby Demo (Excursion)

Hot-standby in action.

 

Lecture 7 - Workload Management and Scheduling

SAP HANA is all about speed, including user response. Professor Plattner explains the importance of having a very responsive system. Here is a quote that summarizes it: “We have to answer the user at the same speed as Excel; otherwise the user will download the data to Excel and work there”.

 

Lecture 8 - Implications on Application Development

What are the implications for those special people who develop applications for users? Code push-down (move business logic to the database) and stored procedures; yes, we’re still talking about ABAP. These are the biggest paradigm shifts for ABAP developers.

 

Week 6

Lecture 1 - Database-Driven Data Aging

Carsten Meyer explains new ideas about archiving and old data.

 

Lecture 2 - Actual and Historical Partitions

Cold data is not about aging, it’s about usage. ’Nuff said, Professor Plattner.

 

Lecture 3 - Genome Analysis

In-memory has huge implications beyond enterprise systems. Let me bring in an excerpt from the “High Performance In-Memory Genome Data Analysis” reading material that can help demystify HANA as a luxury: “Nowadays, a range of time-consuming tasks has to be accomplished before researchers and clinicians can work with analysis results, e.g., to gain new insights”.

 

Lecture 4 - Showcase: Virtual Patient Explorer (Excursion)

Medical and patient stuff with lots and lots of information.

 

Lecture 5 - Showcase: Medical Research Insights (Excursion)

More medical and patient stuff with lots and lots of information.

 

Lecture 6 - Point-of-Sales Explorer

How the in-memory SAP HANA DB helps sales analysis. Three tables and 8 billion rows. Featuring the Professor commenting on SAP HANA performance: “freaking unbelievable! People are scared!”.

 

Lecture 7 - What’s in it for Enterprises (Excursion)

More benefits of using SAP HANA for the enterprise. Decisions can be made on a real-time basis.

 

Lecture 8 - The Enterprise Cloud (Excursion)

Bernd Leukert, a member of the Executive Board of SAP, talks about how running a business in the cloud is much more than uploading your files to Dropbox.

 

As I said, these were my impressions of each section. I really enjoyed this training, and it’s helping me a lot to understand other SAP HANA trainings.

 

I consider it the cornerstone for anyone who decides to work with SAP HANA.


Before my tenure at SAP I worked in the sales group of a business intelligence (BI) startup, and I used to pinch-hit for our under-staffed training department. This meant that when they were in a bind, I’d occasionally jump in to do on-site training sessions with a new customer deploying our BI software.

 

While I enjoyed showing the solutions without the pressure of having to close a sales deal, I always found the database section, where I did a relational database 101 overview and connected our software, to be quite tedious.

 

Jump forward to today’s hyper-connected world, where everything is digitized, fueling new data-driven business models and there’s a lot more to be excited about.

 

It’s not the individual advancements in data processing technology that I’m jazzed about… it’s what happens when you combine the data from devices, sensors and machines, creating inventive scenarios and adding unique business value that I really appreciate – especially given the expanding data challenges organizations face.

 

The changing world of data.jpg
As an example, if a software provider or enterprise customer strings together a sensor with an embedded or remote database, adds real-time event processing software and a data warehouse with in-memory computing, and then tosses in predictive analytics for good measure – they have a great recipe for:
- A smart vending machine that can deliver user recommendations and transaction history, or tell the candy supplier when a refill is needed. 
- Intelligent plant equipment that captures its own usage information and provides proactive repair warnings based on historical failure data.
- Real-time fleet management systems that calculate the optimal distribution of work to maximize efficiency - distributing work to fleet assets in real time.

 

Add Cloud and mobile and of course all that exciting data management utility and information is available anywhere, anytime with lower TCO…

 

DM Manufacturing.jpg

There seems to be an infinite number of operational situations, processes and business models where end-to-end data management creates new services and revenue streams delivering customer value.

 

And that’s exciting… 

 

If you’re interested in exploring more use cases like:
- Real-time problem solving
- Smart metering
- Real-time promotions
- Pattern and customer profitability analysis
…then feel free to navigate The Changing World of Data Solution Map,  our Data Management Solution Brief, or reach out to our OEM team. Many partners are already embedding SAP data management solutions with their offerings to reduce their time to market, differentiate their solution and open up new revenue opportunities.

 

Get the latest updates on SAP OEM by following @SAPOEM on Twitter. For more details on SAP OEM Partnership and to know about SAP OEM platforms and solutions, visit us www.sap.com/partners/oem

From the statisticsserver trace below, look at the memory consumption for the statisticsserver highlighted in red. Pay attention to the PAL (process allocation limit), AB (allocated bytes) and U (used) values. When the U value is close to, equal to, or bigger than the PAL value, this indicates that an out-of-memory situation occurred.

 

Symptoms:

 

[27787]{-1}[-1/-1] 2014-09-25 16:10:22.205322 e Memory ReportMemoryProblems.cpp(00733) : OUT OF MEMORY occurred.

Failed to allocate 32816 byte.

Current callstack:

1: 0x00007f2d0a1c99dc in MemoryManager::PoolAllocator::allocateNoThrowImpl(unsigned long, void const*)+0x2f8 at PoolAllocator.cpp:1069 (libhdbbasis.so)

2: 0x00007f2d0a24b900 in ltt::allocator::allocateNoThrow(unsigned long)+0x20 at memory.cpp:73 (libhdbbasis.so)

3: 0x00007f2cf78060dd in __alloc_dir+0x69 (libc.so.6)

4: 0x00007f2d0a247790 in System::UX::opendir(char const*)+0x20 at SystemCallsUNIX.cpp:126 (libhdbbasis.so)

5: 0x00007f2d0a1016dc in FileAccess::DirectoryEntry::findFirst()+0x18 at SimpleFile.cpp:511 (libhdbbasis.so)

6: 0x00007f2d0a1025da in FileAccess::DirectoryEntry::DirectoryEntry(char const*)+0xf6 at SimpleFile.cpp:98 (libhdbbasis.so)

7: 0x00007f2d0a04872f in Diagnose::TraceSegmentCompressorThread::run(void*&)+0x26b at TraceSegment.cpp:150 (libhdbbasis.so)

8: 0x00007f2d0a0c0dcb in Execution::Thread::staticMainImp(void**)+0x627 at Thread.cpp:475 (libhdbbasis.so)

9: 0x00007f2d0a0c0f6d in Execution::Thread::staticMain(void*)+0x39 at Thread.cpp:543 (libhdbbasis.so)

Memory consumption information of last failing ProvideMemory, PM-INX=103393:

Memory consumption information of last failing ProvideMemory, PM-INX=103351:

IPMM short info:

GLOBAL_ALLOCATION_LIMIT (GAL) = 200257591012b (186.50gb), SHARED_MEMORY = 17511289776b (16.30gb), CODE_SIZE = 6850695168b (6.37gb)

PID=27562 (hdbnameserver), PAL=190433938636, AB=2844114944, UA=0, U=1599465786, FSL=0

PID=27674 (hdbcompileserve), PAL=190433938636, AB=752832512, UA=0, U=372699315, FSL=0

PID=27671 (hdbpreprocessor), PAL=190433938636, AB=760999936, UA=0, U=337014040, FSL=0

PID=27746 (hdbstatisticsse), PAL=10579663257, AB=10512535552, UA=0, U=9137040196, FSL=0

PID=27749 (hdbxsengine), PAL=190433938636, AB=3937583104, UA=0, U=2352228788, FSL=0

PID=27743 (hdbindexserver), PAL=190433938636, AB=155156312064, UA=0, U=125053733102, FSL=10200547328

Total allocated memory= 198326363056b (184.70gb)

Total used memory     = 163214166171b (152gb)

Sum AB                = 173964378112

Sum Used              = 138852181227

Heap memory fragmentation: 17% (this value may be high if defragmentation does not help solving the current memory request)

Top allocators (ordered descending by inclusive_size_in_use).

1: / 9137040196b (8.50gb)

2: Pool 8130722166b (7.57gb)

3: Pool/StatisticsServer 3777958248b (3.51gb)

4: Pool/StatisticsServer/ThreadManager                                     3603328480b (3.35gb)

5: Pool/StatisticsServer/ThreadManager/Stats::Thread_3                     3567170192b (3.32gb)

6: Pool/RowEngine 1504441432b (1.40gb)

7: AllocateOnlyAllocator-unlimited 887088552b (845.99mb)

8: Pool/AttributeEngine-IndexVector-Single                                 755380040b (720.38mb)

9: AllocateOnlyAllocator-unlimited/FLA-UL<3145728,1>/MemoryMapLevel2Blocks 660602880b (630mb)

10: AllocateOnlyAllocator-unlimited/FLA-UL<3145728,1>                       660602880b (630mb)

11: Pool/RowEngine/RSTempPage 609157120b (580.93mb)

12: Pool/NameIdMapping                                                      569285760b (542.91mb)

13: Pool/NameIdMapping/RoDict 569285696b (542.91mb)

14: Pool/RowEngine/LockTable 536873728b (512mb)

15: Pool/malloc                                                             429013452b (409.13mb)

16: Pool/AttributeEngine 253066781b (241.34mb)

17: Pool/RowEngine/Internal 203948032b (194.50mb)

18: Pool/malloc/libhdbcs.so 179098372b (170.80mb)

19: Pool/StatisticsServer/LastValuesHolder                                  167034760b (159.29mb)

20: Pool/AttributeEngine/Delta 157460489b (150.16mb)

Top allocators (ordered descending by exclusive_size_in_use).

1: Pool/StatisticsServer/ThreadManager/Stats::Thread_3                     3567170192b (3.32gb)

2: Pool/AttributeEngine-IndexVector-Single 755380040b (720.38mb)

3: AllocateOnlyAllocator-unlimited/FLA-UL<3145728,1>/MemoryMapLevel2Blocks 660602880b (630mb)

4: Pool/RowEngine/RSTempPage 609157120b (580.93mb)

5: Pool/NameIdMapping/RoDict 569285696b (542.91mb)

6: Pool/RowEngine/LockTable 536873728b (512mb)

7: Pool/RowEngine/Internal                                                 203948032b (194.50mb)

8: Pool/malloc/libhdbcs.so 179098372b (170.80mb)

9: Pool/StatisticsServer/LastValuesHolder                                  167034760b (159.29mb)

10: StackAllocator                                                          116301824b (110.91mb)

11: Pool/AttributeEngine/Delta/LeafNodes                                    95624552b (91.19mb)

12: Pool/malloc/libhdbexpression.so 93728264b (89.38mb)

13: Pool/AttributeEngine-IndexVector-Sp-Rle                                 89520328b (85.37mb)

14: AllocateOnlyAllocator-unlimited/ReserveForUndoAndCleanupExec            84029440b (80.13mb)

15: AllocateOnlyAllocator-unlimited/ReserveForOnlineCleanup                 84029440b (80.13mb)

16: Pool/RowEngine/CpbTree 68672000b (65.49mb)

17: Pool/RowEngine/SQLPlan 63050832b (60.12mb)

18: Pool/AttributeEngine-IndexVector-SingleIndex                            57784312b (55.10mb)

19: Pool/AttributeEngine-IndexVector-Sp-Indirect                            56010376b (53.41mb)

20: Pool/malloc/libhdbcsstore.so 55532240b (52.95mb)

[28814]{-1}[-1/-1] 2014-09-25 16:09:19.284623 e Mergedog Mergedog.cpp(00198) : catch ltt::exception in mergedog watch thread run(

): exception  1: no.1000002  (ptime/common/pcc/pcc_MonitorAlloc.h:59)

    Allocation failed

exception throw location:

 

 

You can refer to the 2 solutions below if the HANA system is not ready to switch to the embedded statisticsserver for any reason.

Solution A)

 

1) If the statistics server is down and inaccessible, you need to kill the hdbstatisticsserver PID at the OS level. The statisticsserver will be restarted immediately by the hdb daemon.

 

2) Check memory consumed by statisticsserver:

 

 

3) Check whether the statistics server deletes old data: go to Catalog -> _SYS_STATISTICS -> Tables, randomly check tables starting with GLOBAL* and HOST*, and sort by snapshot_id in ascending order. Ensure the oldest date matches the retention period.

 

Alternatively, you can run the command: select min(snapshot_id) from _SYS_STATISTICS.<TABLE>
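For example, using one of the tables mentioned below (HOST_WORKLOAD), a concrete check could be:

-- Oldest retained snapshot; it should be no older than the configured retention period
SELECT MIN(SNAPSHOT_ID) FROM "_SYS_STATISTICS"."HOST_WORKLOAD";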

 

Eg:


 

 

 

4) Check the retention period of each table in Configuration -> Statisticsserver -> statisticsserver_sqlcommands


eg:

30 days for HOST_WORKLOAD

 

5) If old data is kept for more than 30 days (or we want to delete old data by shortening the retention period), follow SAP Note 1929538 - HANA Statistics Server - Out of Memory -> Option 1:


Create the procedure using the file attached to note 1929538 and run: call set_retention_days(20);

 

6) Once done, you’ll see data older than 20 days get deleted:

 

Memory consumption for statisticsserver reduced:


Also, the min snapshot_id gets updated, in line with the new 20-day retention period:



7) You can reset the retention period to the default value anytime by calling call set_retention_days(30); or by restoring every SQL command to its default in statisticsserver_sqlcommands.


Solution B)

i) Follow SAP Note 1929538 - HANA Statistics Server - Out of Memory and increase the allocationlimit for the statisticsserver. This can be done only when the statisticsserver is up and accessible; otherwise, you need to kill and restart it.
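A sketch of the parameter change described in that note (the value below is a placeholder in MB; size it according to the note and your system):

-- Raise the standalone statistics server allocation limit (placeholder value)
ALTER SYSTEM ALTER CONFIGURATION ('statisticsserver.ini', 'SYSTEM') SET ('memorymanager', 'allocationlimit') = '20480' WITH RECONFIGURE;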

 

 

 

The script HANA_Histories_RetentionTime_Rev70+ from SAP Note 1969700 - SQL Statement Collection for SAP HANA provides a good overview of retention times.


My 2 cents' worth: for any statisticsserver OOM error, always check the memory usage of the statisticsserver to ensure obsolete data gets deleted after the retention period, instead of blindly increasing the allocation limit for the statisticsserver.


Additionally, you can refer to SAP Note 2084747 - Disabling memory intensive data collections of standalone SAP HANA statisticsserver to disable data collections that consume a lot of memory.


Hope it helps,


Thanks,

Nicholas Chang






Overview

This blog describes how to use the SAP Crypto Library to enable SAML SSO from SAP BI4 to the SAP HANA DB. If you want to use OpenSSL instead, please check the other SCN blog for details.

 

Turn on SSL using SAP Crypto Library

 

1.     Install SAP Crypto library

The SAP Crypto Library can be downloaded from the SAP Service Marketplace. Browse to http://service.sap.com/swdc, then expand Support Packages and Patches > Browse our Download Catalog > SAP Cryptographic Software > SAPCRYPTOLIB > SAPCRYPTOLIB 5.5.5 > Linux on x86_64 64bit.

 

The new CommonCryptoLib (SAPCRYPTOLIB) Version 8.4.30 (or higher) is fully compatible with previous versions of SAPCRYPTOLIB, but adds features of SAP Single Sign-On 2.0 Secure Login Library. It can be downloaded in this location:

expand Support Packages and Patches > Browse our Download Catalog > Additional Components > SAPCRYPTOLIB > COMMONCRYPTOLIB 8


Use SAPCAR to extract sapgenpse and libsapcrypto.so to /usr/sap/<SID>/SYS/global/security/lib/

Add the directory containing the SAP Crypto libraries to your library path:

  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/sap/<SAPSID>/SYS/global/security/lib

 

2.     Create the SSL key pair and certificate request files

  • Copy sapgenpse to the $SECUDIR directory. Then run sapgenpse to generate the sapsrv.pse file and the SAPSSL.req file:

  ./sapgenpse get_pse -p sapsrv.pse -r SAPSSL.req "CN=<FQDN of the host>"

 

  • Send the Certificate Request to a Certificate Authority to be signed. Browse to http://service.sap.com/trust, and expand SAP Trust Center Services in Detail, and click SSL Test Server Certificates, and then click the ‘Test it Now!’ button. Paste the content from the SAPSSL.req file to the text box, and click Continue.
    1.png
    SAP returns the signed certificate as text, copy this text and paste it into a file on the HANA server: 
    /usr/sap/<sid>/HDB<instance_nr>/<hostname>/sec/SAPSSL.cer
  • Download the  SAP SSL Test Server CA Certificate from the http://service.sap.com/trust site:
    6.png


  • Import the Signed Certificate using sapgenpse
    ./sapgenpse import_own_cert -c SAPSSL.cer -p sapsrv.pse -r SAPServerCA.cer
3. Check HANA settings
Indexserver.ini->[Communication]->sslcryptoprovider = sapcrypto
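If you prefer SQL over editing the ini file in the Studio, the same setting can be applied like this (a sketch; choose the configuration layer that suits your landscape):

-- Set the SSL crypto provider to the SAP Crypto Library
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM') SET ('communication', 'sslcryptoprovider') = 'sapcrypto' WITH RECONFIGURE;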

 

 

4. Restart HANA, and test if SSL works from HANA Studio


Click on the "Connect using SSL" option in the properties of the connection.  Once done, a lock will appear in the connection in HANA Studio
2.png

Create Certificate file for BO instance.

 

  1. Create HANA Authentication connection
    Log onto BO CMC > Applications > HANA Authentication and click New. After providing the HANA hostname, port, and IDP name, click the Generate button, then click OK; you will see an entry added for HANA authentication
    10-22-2014 10-07-46 AM.png
  2. Copy the content of the generated certificate and paste it to a file on your HANA server:

    /usr/sap/<sid>/HDB<instance_nr>/<hostname>/sec/sapid.cer
  3. Add the certification to the pse file:

./sapgenpse maintain_pk -p sapsrv.pse -a sapid.cer

3.png

4. You may need to Restart HANA to make the new pse file take effect.

 

SAML configuration in HANA

 

  1. Create SAML provider in HANA


You can import the SAML identity provider from the certificate file (sapid.cer) which you created in the last step, under Security -> Open Security Console -> SAML Identity Providers. Make sure you have chosen the SAP Cryptographic Library.

5.png

 

2. Create a HANA user TESTUSER with SAML authentication.

Check the SAML option, click the Configure link, then add the identity provider created in the last step ('HANA_BI_PROVIDER') for the external user 'Administrator'.

4.png

 

 

Test SAML authentication

 


Go to BO CMC > Applications > HANA Authentication, edit the entry created in the previous step, and click the "Test Connection" button.

7.png

 

Troubleshooting

If the connection test is not successful, please change the trace level of the following to DEBUG:


indexserver.ini - authentication, xssamlproviderconfig


The index server trace will provide more information on why the authentication failed.

 

Reference

 

How to Configure SSL for SAP HANA XSEngine using SAPCrypto

Configuring SAML with SAP HANA and SAP BusinessObjects 4.1 - Part 1

Use SAML to Enable SSO for your SAP HANA XS App

For the Keynote at SAP TechEd/d-code 2014, we built out a quarter trillion row model in a single scale-up HANA system. You can read the high level Unleashing Lightening with SAP HANA overview and watch the video.

 

I thought that people might be interested in how the demo was built, and the logistical and technical challenges of loading such a large data model.

 

Building the SGI UV300H

 

The first challenge we had was finding a system big enough for 30TB of flat files in a short time. The SGI UV300H is a scale-up HANA appliance, made up from 4-socket building blocks. The SGI folks therefore had to string 8 of these together using their NUMAlink connectors and attach 6 NetApp direct attached storage arrays for a total of 30TB of storage.

 

Today, only 4- and 8-socket SGI systems are certified and the 32-socket system is undergoing certification. The nature of the UV300H means that there is non-uniform data access speed. On a 4-socket Intel system you have either local (100ns) or remote (300ns) - you can read Memory Latencies on Intel® Xeon® Processor E5-4600 and E7-4800 product families for more details.

 

With the NUMAlink system there is also a hop via NUMAlink to the remote chassis, which increases the memory latency to 500ns. Whilst that is blindingly fast by any standard, it increases the non-uniformity of RAM access on HANA. For SAP HANA SPS09, SAP optimized HANA for the UV300H by improving average memory locality.

 

However HANA SPS09 wasn't available, so we ran on stock SAP HANA SPS08 Revision 83. It's tough to say how big a penalty this cost us, but on a theoretical 7.5bn aggregations/sec, we got closer to 5bn, so I'm guessing SPS09 would provide a 25-50% hike in performance under certain circumstances.

 

But to be clear, this is the same HANA software that you run on any HANA server, like AWS Developer Edition. There was no customization involved.

 

Downloading 30TB of flat files

 

Here was our next challenge. I did the math, and realized this was going to take longer than the time available, so I put a call into Verizon FIOS to see if they could help. They came out the next day and installed a new fiberoptic endpoint which could service up to 300/300Mbit internet. With my laptop hard-wired into the router, we could get a constant 30MByte/sec download from the Your.Org Wikimedia Mirror. Thanks guys!

 

Once these were on USB hard disks, we shipped them to SGI Labs, which cost another 4 days, due to the Columbus Day holiday.

 

From there, we found we could load into HANA faster than we could copy the files onto the server (USB 2.0).

 

Building the HANA Model

 

Thankfully, I have a few smaller HANA systems in my labs, so I tested the configuration on a 4S/512GB system with 22bn rows, and on a scale-out 4x4S/512GB system with 100bn rows. There were a few things that we found that would later be of value.

 

First, partitioning by time (month) is useful, because you can load a month at a time, and let HANA merge and compress the last month whilst you load the next month. This saves the constant re-merging that happens if you don't partition by time. A secondary partition by title is useful, because it ensures partition pruning during data access, which means that much less RAM is scanned for a specific query. This led to a RANGE(MONTH), HASH(TITLE) two-level partition strategy, which is very typical of data of this type.
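As a sketch of what such a two-level layout can look like in DDL (the table name here is a placeholder and the columns follow the PAGECOUNTS example used in the companion "build your own" post; the exact level order and ranges on the keynote system may have differed):

-- Two-level partitioning: hash on TITLE, then monthly date ranges (placeholder table name)
CREATE COLUMN TABLE "WIKI"."PAGECOUNTS_PART" (
    "WIKIDATE" DATE,
    "TITLE" VARCHAR(2048),
    "PAGEVIEWS" BIGINT
) PARTITION BY HASH ("TITLE") PARTITIONS 16,
  RANGE ("WIKIDATE") (PARTITION '2014-10-01' <= VALUES < '2014-11-01', PARTITION OTHERS);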

 

Second, the amount of data we had meant that it was going to be most practical to logically partition the data into tables by year. This wasn't strictly necessary, but it meant that if something went wrong with one table, it wouldn't require a full load. This decision was vindicated because user error meant I wiped out one table the night before the Keynote, and it was easily possible to reload that year.

 

Third, a secondary index on TITLE was used. This was based on research by Lars Breddemann and Werner Steyn Further Playing with SAP HANA which led us to understand that when a small amount of data is selected from a large table, an index on the filter predicate column is beneficial. Therefore if the SQL query is SELECT DATE, SUM(PAGEVIEWS) FROM PAGECOUNTS WHERE TITLE = 'Ebola' GROUP BY DATE, then a secondary index on TITLE will increase performance.
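In DDL terms, such a secondary index is a one-liner (the index name is just an example):

-- Secondary index on the filter predicate column used by the title lookups
CREATE INDEX "IDX_PAGECOUNTS_TITLE" ON "WIKI"."PAGECOUNTS" ("TITLE");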

 

Fourth, we built a simple HANA model to UNION back in all the tables in a Calculation View, and join it to the M_TIME_DIMENSION system table so we could get efficient time aggregation in OData and ensure query pruning.

 

Optimizing SAP HANA for the UV300H

 

By this time, we had 30TB of flat files on the /hana/shared folder of the UV300H and had to get them loaded. We realized there was a challenge, which is the Wikipedia files come in space delimited, with no quotes around text, and the date is in the filename, not a column. We didn't have Data Services or another ETL product, and the fastest way to get data into HANA is using the bulk loader.

 

So, I wrote a script which uncompresses each file into a memory pipe, reformats it in awk to contain the timestamp, converts it to CSV with quotes, writes it out to a RAM disk, runs it through the bulk loader and deletes the RAM disk file. Each hour takes around 20 seconds to process, and I ran 12 threads in parallel, plus an additional 40 threads for the bulk loader process.
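The bulk-load step itself, once a reformatted CSV slice is sitting on the RAM disk, is plain SQL; here is a sketch with a placeholder file path and the thread/batch settings as assumptions to be tuned (the per-year target tables on the keynote system had their own names):

-- Bulk-load one reformatted hourly file from the RAM disk into the target table
IMPORT FROM CSV FILE '/ramdisk/pagecounts-2014100100.csv' INTO "WIKI"."PAGECOUNTS"
WITH RECORD DELIMITED BY '\n' FIELD DELIMITED BY ',' OPTIONALLY ENCLOSED BY '"' THREADS 40 BATCH 200000;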

 

What we realized at this point was that SAP HANA SPS08 wasn't optimized for the amount of power that the UV300H had, so I tweaked the settings to be more aggressive, particularly with mergedog, which only uses 2 CPU cores by default. We enabled the integrated statistics server, installed PAL and the script server.

 

In addition, I found that it was necessary not to be too aggressive, because the log volume is only 500GB, and you can easily fill this up between 5 minute savepoints if you get too aggressive with loading (remember you have to buffer enough logs until the savepoint is complete). I suspect the certified 32-socket system will have a 1 or 2TB log volume for this reason.

 

Other than that, we pretty much found that it just worked. Here's a screenshot of using all the 960 vCores in the UV300H during some early query testing. I'm sure glad I didn't pay for the power bill!

Screen Shot 2014-10-14 at 3.41.44 PM.png

Building the Web App in HANA XS

 

We put together the Web App in HANA XS using SAP UI5 controls and OData services to access the underlying data model. More on this in a later blog when Brenton OCallaghan is going to describe how it was built.

 

What's critical about this is that the OData service, which is accessed directly by the browser, runs in-memory and has direct access to the calculation scenario generated by the SAP HANA Calculation View. This means that the response time in the browser is very little more than that of a SQL query run in a console on the server itself.

 

There were really no special considerations required to use HANA XS with a model of this size - it worked exactly the same as for any other HANA model. One thing we did to ensure we didn't cause problems was to restrict the HANA models so you couldn't return very large data volumes by using Input Parameters. This means you can't return 250bn rows in a browser!

 

Final Words

 

I've said this in the other blog, but whilst there were huge logistical challenges in building a model like this in 10 days, HANA made it possible. The fact that HANA self-optimizes the whole model for compression and query performance and requires no tuning is a huge benefit. Once we had built a simple data model, we were able to directly load all the data overnight.

 

One thing worth noting is that because of the direct attached storage model in the UV300H, we found we can load around 200GB/minute into RAM (once the data has been loaded and merged once). That means we can load the entire 6TB model on this system in around 30 minutes, which is the fastest load speed I've ever seen on a HANA system.

 

Anyhow, the purpose of this blog was to open the kimono on specifically how we built this demo, and to show that there was no special optimization to do so. This, despite the fact that the UV300H 32-socket edition certification is still in progress and the HANA optimizations for it weren't available to us. If you have any questions on it then please go ahead and ask them, I'd be very happy to answer.

 

And remember if you'd like to Build your own Wikipedia Keynote Part 1 - Build and load data then please follow that series - we'll be building out the whole demo in 4 parts.

This blog focuses on how you can connect to social media, for example Twitter, and work on text analysis or predictive analysis with the social media data. Use cases could be the Football World Cup, the EPL, Cricket IPL T20 and many predictive apps.

 

Twitter has become one of the most widely used platforms for trending data using # (hash tags).

 

I came across a use case where we worked on predictive analysis with tweets. Shared here are some of the key points on how we go about getting connectivity between HANA and Twitter, and how we read these tweets, store them in the HANA DB and take them further for predictive analysis.

In this example I have used SAP HANA XSJS. You can use the Java APIs as well.


With HANA XSJS I tried two solutions: 1) UI request -> XSJS -> Twitter -> response back to the UI

2) Use of XS jobs for getting tweets, plus a separate service call for UI rendering.


Ok now let’s go ahead and see how it’s done.

 

With Twitter’s new authentication mechanism, the steps below are necessary for a successful connection.

 

HttpDest would look like this:

 

description = "Twitter";
host = "api.twitter.com";
port = 443;
useProxy = false;
proxyHost = "proxy";
proxyPort = 8080;
authType = none;
useSSL = true;

Another important step is to set up the TRUST STORE from the XS Admin tool:

 

Outbound httpS with HANA XS (part 2) - set up the trust relation

 

Twitter offers applications to issue authenticated requests on behalf of the application itself (as opposed to a specific user).

 

We need to create a Twitter application (Manage Apps) in https://dev.twitter.com/. The Settings tab / OAuth tool gives us the Consumer Key and Consumer Secret, which are very important for setting up the request authorization header.

 

OAuth keys.JPG

The critical step is to get the bearer token ready:

  1. URL encode the consumer key to RFC 1738 - xvz1evFS4wEEPTGEFPHBog
  2. The consumer secret to RFC 1738 - L8qq9PZyRg6ieKGEKhZolGC0vJWLw8iEJ88DR
  3. Concatenate the encoded consumer key, a colon character “:”, and the encoded consumer secret into a single string.

        xvz1evFS4wEEPTGEFPHBog:L8qq9PZyRg6ieKGEKhZolGC0vJWLw8iEJ88DR

   4. Base64 encode the string from the previous step -  V2RlM0d0VFUFRHRUZQSEJvZzpMOHFxOVBaeVJnNmll==


There you have the BASIC token. We need to get a BEARER token from the BASIC token by issuing a POST:

https://dev.twitter.com/oauth/reference/post/oauth2/token

 

Response:

HTTP/1.1 200 OK

Status: 200 OK

Content-Type: application/json; charset=utf-8

Content-Encoding: gzip

Content-Length: 140

{"token_type":"bearer","access_token":"AAAA%2FAAA%3DAAAAAAAA"}

 

The BEARER token is what you use in your XSJS service to talk to Twitter. To check your Twitter URL request format you can use the Twitter Developer Console.


Console.JPG

Response:

 

Reponse.JPG

 

XSJS code snippet :

 

var dest = $.net.http.readDestination("Playground", "twitter");
var client = new $.net.http.Client();
var url_suffix = "/1.1/search/tweets.json?q=#SAPHANA&count=1";
var req = new $.web.WebRequest($.net.http.GET, url_suffix); //MAIN URL
req.headers.set("Authorization","Bearer AAAAAAAA");
client.request(req, dest);



 

If all is set up correctly, we get a response that gives you the tweet text in the statuses array.

 

var response = client.getResponse();
var body = response.body.asString();
myTweets = JSON.parse(body);



myTweets.statuses[index].text ===> Tweet data

 

Once you have the tweets array, you can loop over it, process the tweets, and store them in the SAP HANA DB for further predictive analysis.

 

// insert_hana_tweet is assumed to hold the INSERT statement with four parameter markers
var conn = $.db.getConnection();
var pstmt = conn.prepareStatement(insert_hana_tweet);
pstmt.setString(1, myTweets.statuses[index].id_str);
pstmt.setString(2, myTweets.statuses[index].text);
pstmt.setString(3, myTweets.statuses[index].created_at);
pstmt.setString(4, myTweets.statuses[index].user.screen_name);
pstmt.execute();
conn.commit();
pstmt.close();
conn.close();




In my use case I have used "SEARCH" tweets, but you have many other options, e.g. reading messages, retweets, followers and so on.

The request parameters can be used based on your use case eg. Most recent tweet (result_type=recent), number of tweets to fetch (count=10), fetch all tweets after a specific tweet id(since_id).

 

I have the XS service in an XS job which checks for tweets and stores them in the HANA DB.

 

{
"description": "Insert Tweets",
    "action": "Playground:job_twitter.xsjs::collectTweet",
"schedules": [
        {
            "description": "Tweets",
"xscron": "* * * * * * *"
        }
    ]
}



 

Have a look at the developer guide to understand the various options available for scheduling... XSCRON parameter is used for setting the time and duration of the job.

Talk to Twitter and keep trending! Hope this post was helpful!

 

Avinash Raju

SAP HANA Consultant

www.exa-ag.com

So by now you may have seen the Wikipedia page counts model that we built for the keynote. I'll blog later on about the logistical challenges of getting 30TB of flat files and building a system to load it in 10 days, but since SAP TechEd & d-code is a conference with a load of developers, I'm going to spend some time showing you how to build your own.

 

The beauty of the SAP HANA platform is that you can build this whole model inside the HANA platform using one tool - HANA Studio

 

Background

 

The background is that Wikipedia publishes a list of Page view statistics for Wikimedia projects, which are consolidated page views per hour by title and by project. These go back to 2007 and are now a total of nearly 250bn rows. It's a fascinating dataset because the cardinality is really rough on databases, there are over 4m articles in the English version alone and well over that including all projects.

 

The total dataset is around 30TB of flat files, which translates into around 6TB of HANA database memory required. The SAP HANA Developer Edition is a 60GB Amazon EC2 cloud system, so you can comfortably use around 30GB of RAM for data, which is around 0.5% of the overall dataset. That means we can comfortably fit around 3 weeks of data. This should be enough for you to have some fun!

 

So how do you get started? Trust me, its very easy to do!

 

Step 1 - get SAP HANA Developer Edition

 

N.B. This is free of charge from SAP, but you will have to pay Amazon EC2 fees. Be mindful of this and turn off the system when you're not using it.

 

It takes a few minutes to setup, because you have to configure your AWS account to receive the HANA Developer Edition AMI, but Craig Cmehil and Thomas Grassl have done a great job of making this easy, so please go ahead and configure the HANA Developer Edition!

 

You can of course use an on-premise version or any other HANA instance, though our scripts do assume that you have internet access, so if your system doesn't, you will have to adapt them. That's part of the fun, right?

 

Step 2 - Create and build the model

 

For the purposes of this exercise, this couldn't be easier as there's just one database table. Note that we use a HASH partition on TITLE. In the big model, we actually use a multilevel partition with a range on date as well, but you won't need this for just 3 weeks. The HASH partition is really handy as we are mostly searching for a specific title, so we can be sure that we'll only hit 1/16th of the data for a scan. This won't hurt performance.

 

Also note that there's a 2bn row limit to partition sizes in HANA, and we don't want to get near to that (I recommend 200-300m rows max as a target). HASH partitioning is neat, because it evenly distributes values between partitions.

 

Also note that we use a generated always statement for date. Most of the time we're not interested in timestamp, and it's very expensive to process the attribute vector of timestamps when you only need the date. Materializing the date allows for a minimum of 24x more efficient time series processing.

 

CREATE USER WIKI PASSWORD "Initial123";

DROP TABLE "WIKI"."PAGECOUNTS";

CREATE COLUMN TABLE "WIKI"."PAGECOUNTS" (

    "WIKITIME" TIMESTAMP,

    "WIKIDATE" DATE GENERATED ALWAYS AS to_date("WIKITIME"),

    "PROJECT" VARCHAR(25),

    "TITLE" VARCHAR(2048),

    "PAGEVIEWS" BIGINT,

    "SIZE" BIGINT) PARTITION BY HASH(TITLE) PARTITIONS 16;

 

Step 3 - Download and load the data

 

The friendly folks at Your.org maintain an excellent mirror of the Wikipedia Page Views data. There are a few challenges with this data, from a HANA perspective.

 

First, it comes in hourly files of around 100MB, which means you have to process a lot of files. So, we wrote a batch script that allows processing of a lot of files (we used this script on all 70,000 files in a modified form to allow for much more parallel processing than your AWS developer instance can cope with!).

 

Second, they are gzipped, and we don't want to unzip the whole lot as that would be huge and takes a lot of time. So the script unzips them to a RAM disk location for speed of processing.

 

Third, the files are space delimited and don't contain the date and time in them, to save space. For efficient batch loading into HANA without an ETL tool like Data Services, we reformat the file before writing to RAM disk, to contain the timestamp as the first column, and be CSV formatted with quotes around the titles.

 

Anyhow, the script is attached as hanaloader.sh. You need to copy this script to your AWS system and run it as the HANA user. Sit back and relax for an hour whilst it loads. The script is uploaded as a txt file so please remember to rename as .sh

 

Please follow these instructions to run the script:

 

-- login to the server as root and run the following:


     mkdir /vol/vol_HDB/sysfiles/wiki

     chown hdbadm:sapsys /vol/vol_HDB/sysfiles/wiki

     chmod u=wrx /vol/vol_HDB/sysfiles/wiki

     su - hdbadm

     cd /vol/vol_HDB/sysfiles/wiki


-- place wikiload.sh in this folder


-- edit wikidownload.sh as described in the comments in the file


-- Once ready run as follows:

     ./wikidownload.sh 2014 10

 

Step 4 - Install HANA Studio

 

Whilst this is loading, go ahead and get SAP Development Tools for Eclipse installed. If you have access to SAP Software Downloads, you could alternatively use HANA Studio. Make sure you are on at least Revision 80, because otherwise the developer tools won't work.

 

Step 5 - Testing

 

Well now you have a database table populated with 3 weeks of Wikipedia page views. You can test a few SQL scripts to make sure it works, for example:

 

SELECT WIKIDATE, SUM(PAGEVIEWS) FROM WIKI.PAGECOUNTS GROUP BY WIKIDATE;

SELECT WIKIDATE, SUM(PAGEVIEWS) FROM WIKI.PAGECOUNTS WHERE TITLE = 'SAP' GROUP BY WIKIDATE;

 

Note how when you filter, performance dramatically improves. This is the way that HANA works - it's far faster to scan (3bn scans/sec/core) than it is to aggregate (16m aggs/sec/core). That's one of the keys to HANA's performance.

 

Next Steps

 

This is just the first of a multi-part series. Here's what we're going to build next:

 

Part 2: Building the OLAP model. We use the HANA Developer Perspective to build a virtual model that allows efficient processing and ad-hoc reporting in Lumira.

Part 3: Predictive Analysis. We build a predictive model that allows on the fly prediction of future page views.

Part 4: Web App. We expose the model via OData and build a simple web app on the top using SAP UI5.

 

I just want to say a big thanks to Werner Steyn, Lars Breddemann, Brenton OCallaghan and Lloyd Palfrey for their help with putting all this together.

 

Keep tuned for next steps!

*This is a repost due to request*


Hi All,


My name is Man-Ted Chan and I’m from the SAP HANA product support  team. Today’s blog will be about the new SAP HANA Statistics Server. We will review some background information on it, how to implement it, and what to look for to verify it was successful.

 

What is the Statistics Server?

 

The statistics server assists customers by monitoring their SAP HANA system, collecting historical performance data and warning them of system alerts (such as resource exhaustion). The historical data is stored in the _SYS_STATISTICS schema; for more information on these tables, please view the statistical views reference page on help.sap.com/hana_appliance

 

What is the NEW Statistics Server?

 

The new Statistics Server is also known as the embedded Statistics Server or Statistics Service. Prior to SP7 the Statistics Server was a separate server process - like an extra Index Server with monitoring services on top of it. The new Statistics Server is now embedded in the Index Server. The advantage of this is to simplify the SAP HANA architecture and assist us in avoiding out of memory issues of the Statistics Server, as it was defaulted to use only 5% of the total memory.

 

In SP7 and SP8 the old Statistics Server is still implemented and shipped to customers, but customers can migrate to the new statistics service if they would like by following SAP note 1917938.

 

How to Implement the New Statistics Server?

 

The following screen caps will show how to implement the new Statistics Server. I also make note of what your system looks like before and after you perform this implementation (the steps to perform the migration are listed in SAP note 1917938 as well).

 

In the SAP HANA Studio, view the landscape and performance tab of your system and you should see the following:

1.png2.png

Prior to migrating to the new statistics server, please take a backup of your system. Once that is done, please do the following:

Go to the Configuration tab and expand nameserver.ini-> statisticsserver->active

3.png

Double click on the value ‘false’ and enter the new value ‘true’ into the following popup:

4.png

After pressing ‘Save’ the Configuration tab will now show the following:

5.png
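If you prefer to script this change rather than click through the Configuration tab, the same switch can be flipped with SQL. The statement below is a hedged equivalent of the screenshot steps above - please verify it against SAP note 1917938 for your revision before running it:

-- Set nameserver.ini -> statisticsserver -> active = true at the SYSTEM
-- layer and apply the change online
ALTER SYSTEM ALTER CONFIGURATION ('nameserver.ini', 'SYSTEM') SET ('statisticsserver', 'active') = 'true' WITH RECONFIGURE;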

Once this is done, check the 'Landscape' and 'Performance' tabs.

6.png

7.png

As we can see, the Statistics Server is now gone. Do not restart your system during this migration. To check the status of the migration, please run the following:

SELECT * FROM _SYS_STATISTICS.STATISTICS_PROPERTIES where key = 'internal.installation.state'

The results will show the status and the time of the deployment.

Key: internal.installation.state

Value: Done (okay) since 2014-09-20 02:55:34.0360000

 

Do not restart your SAP HANA system until the migration is completed


Trace Files

If you run into issues implementing the new Statistics Server, we will need to look into the SAP HANA trace files. The traces that we can check during the implementation of the new Statistics Server are the following:

  • Statistics Server trace
  • Name Server trace
  • Index Server trace

If the deployment does not work, review the trace files to pinpoint where an error occurred. Below are trace snippets of a successful deployment of the embedded Statistics Service.

Statistics Server Trace

In the Statistics Server trace we will see the statistics server shutting down:

[27504]{-1}[-1/-1] 2014-09-20 02:55:37.669772 i Logger BackupHandlerImpl.cpp(00321) : Shutting down log backup, 0 log backup(s) pending
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340345 i Service_Shutdown TrexService.cpp(05797) : Disabling signal handler
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340364 i Service_Shutdown TrexService.cpp(05809) : Stopping self watchdog
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340460 i Service_Shutdown TrexService.cpp(05821) : Stopping request dispatcher
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340466 i Service_Shutdown TrexService.cpp(05828) : Stopping responder
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.341478 i Service_Shutdown TrexService.cpp(05835) : Stopping channel waiter
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.341500 i Service_Shutdown TrexService.cpp(05840) : Shutting service down
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.350884 i Service_Shutdown TrexService.cpp(05845) : Stopping threads
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.354348 i Service_Shutdown TrexService.cpp(05850) : Stopping communication
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356233 i Service_Shutdown TrexService.cpp(05857) : Deleting console
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356240 i Service_Shutdown TrexService.cpp(05865) : Deleting self watchdog
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356260 i Service_Shutdown TrexService.cpp(05873) : Deleting request dispatcher
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356278 i Service_Shutdown TrexService.cpp(05881) : Deleting responder
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356302 i Service_Shutdown TrexService.cpp(05889) : Deleting service
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356444 i Service_Shutdown TrexService.cpp(05896) : Deleting threads
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356449 i Service_Shutdown TrexService.cpp(05902) : Deleting pools
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356454 i Service_Shutdown TrexService.cpp(05912) : Deleting configuration
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356458 i Service_Shutdown TrexService.cpp(05919) : Removing pidfile
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356515 i Service_Shutdown TrexService.cpp(05954) : System down


Name Server Trace

 

In the Name Server trace you will see it being notified that the Statistics Server is shutting down and the topology getting updated. An error that you might encounter in the Name Server trace is the following:

STATS_CTRL       NameServerControllerThread.cpp(00251) : error installing

Please review SAP note 2006652 to assist you in resolving this. Below is a successful topology update:

NameServerControllerThread.cpp(00486) : found old StatisticsServer: mo-517c85da0:30005, volume: 2, will remove it

[27065]{-1}[-1/-1] 2014-09-20 02:55:34.050358 i STATS_CTRL NameServerControllerThread.cpp(00489) : forcing log backup...

[27065]{-1}[-1/-1] 2014-09-20 02:55:34.051287 i STATS_CTRL NameServerControllerThread.cpp(00494) : log backup done. Reply: [OK]

--

[OK]

--

[27065]{-1}[-1/-1] 2014-09-20 02:55:34.051292 i STATS_CTRL NameServerControllerThread.cpp(00497) : stopping hdbstatisticsserver...

[27065]{-1}[-1/-1] 2014-09-20 02:55:34.054522 i STATS_CTRL NameServerControllerThread.cpp(00522) : waiting 5 seconds for stop...

[27426]{-1}[-1/-1] 2014-09-20 02:55:34.323824 i Service_Shutdown TREXNameServer.cpp(03854) : setStopping(statisticsserver@mo-517c85da0:30005)

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.054777 i STATS_CTRL       NameServerControllerThread.cpp(00527) : hdbstatisticsserver stopped

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.054796 i STATS_CTRL NameServerControllerThread.cpp(00530) : remove service from topology...

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.056706 i STATS_CTRL       NameServerControllerThread.cpp(00534) : service removed from topology

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.056711 i STATS_CTRL NameServerControllerThread.cpp(00536) : remove volume 2 from topology...

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.058031 i STATS_CTRL NameServerControllerThread.cpp(00540) : volume removed from topology

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.058038 i STATS_CTRL NameServerControllerThread.cpp(00542) : mark volume 2 as forbidden...

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.059263 i STATS_CTRL NameServerControllerThread.cpp(00544) : volume marked as forbidden

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.059269 i STATS_CTRL NameServerControllerThread.cpp(00546) : old StatisticsServer successfully removed

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.060823 i STATS_CTRL NameServerControllerThread.cpp(00468) : removing old section from statisticsserver.ini: statisticsserver_general

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.072798 i STATS_CTRL NameServerControllerThread.cpp(00473) : making sure old StatisticsServer is inactive statisticsserver.ini: statisticsserver_general, active=false

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.083604 i STATS_CTRL NameServerControllerThread.cpp(00251) : installation done

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.083620 i STATS_CTRL NameServerControllerThread.cpp(00298) : starting controller

 

Index Server Trace

 

The statistics service is a set of tables and SQL procedures, so if you check the index server trace you will see the deployment of these SQL procedures, and any error that occurs during the SQL execution will also appear there.

Here is an example of a successful deployment:

upsert _SYS_STATISTICS.statistics_schedule (id, status, intervallength, minimalintervallength, retention_days_current, retention_days_default) values (6000, 'Idle', 300, 0, 0, 0) with primary key;

END;

[27340]{-1}[-1/-1] 2014-09-20 02:55:29.802118 i TraceContext     TraceContext.cpp(00718) : UserName=

[27340]{-1}[-1/-1] 2014-09-20 02:55:29.802110 i STATS_WORKER     ConfigurableInstaller.cpp(00168) : creating procedure for 6000: CREATE PROCEDURE _SYS_STATISTICS.Special_Function_Email_Management (IN snapshot_id timestamp, OUT was_cancelled integer) LANGUAGE SQLSCRIPT SQL SECURITY INVOKER AS

-- snapshot_id [IN]: snapshot id

-- was_cancelled [OUT]: indicator whether the specialfunction has been cancelled

l_server string;

 

How do I check if it is running?

 

If you suspect that your new Statistics Service is not running, you can check under the Performance -> Threads tab

8.png

 

Or you can run the following query:

select * from "PUBLIC"."M_SERVICE_THREADS" where thread_type like '%ControllerThread (StatisticsServer)%'

 

How Do I Revert Back?

 

If for some reason you need to go back to the original Statistics Server, you will not be able to just change the value of nameserver.ini -> statisticsserver -> active back to false; you will have to perform a recovery to a point in time before you performed the migration.

Purpose


This blog will focus on the basic setup of Smart Data Access (SDA) and then outline some problems that customers have encountered. Some of the issues outlined in the troubleshooting section come directly from customer incidents.

 

There is already a lot of information on Smart Data Access which this blog does not aim to replace.  Throughout the blog, I will reference links to other documentation that can cover the topics in more detail.

 


What is Smart Data Access (SDA)?


SDA allows customers to access data virtually from remote sources such as Hadoop, Oracle, Teradata, SQL Server, and more. Once a remote connection to a data source is set up, we can connect to its tables virtually and query against them or use them in data models as if the data resided in a SAP HANA database.

 

This means customers do not have to migrate or copy their data from other databases into a SAP HANA database.

 

How to Setup SDA?


Smart Data Access was introduced in SAP HANA SP6, so if you intend to use SDA, be on at least this support package.

 

Prior to connecting to a remote database you will need to configure an ODBC connection from the server to the remote database. For assistance on how to install the database drivers for SAP HANA Smart Data Access, please refer to SAP note 1868702 and the SAP HANA Learning Academy videos


https://www.youtube.com/watch?v=BomjFbJ25vo&index=16&list=PLkzo92owKnVx_X9Qp-jonm3FCmo41Fkzm


Documentation on SDA can be found in the SAP HANA Admin guide chapter 6


http://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf

 

 

Changes to SDA?


For information on what changes to SDA have occurred with each revision of SAP HANA, please feel free to review the following:

 

SP6 -> SP 7 delta

 

http://www.saphana.com/servlet/JiveServlet/previewBody/4296-102-7-9005/HANA_SPS07_NEW_SDA.pdf

 

SP7 -> SP 8 delta

 

http://www.saphana.com/docs/DOC-4681

 

What Remote Sources Are Supported (As of SP8)?

 

Hadoop

Teradata

SAP HANA

Oracle 12c

Sybase ASE

Sybase IQ

DB2 (SP8)

Microsoft SQL Server 2012

Apache Spark (SP8)

 

** Please note that you could connect to other databases via ODBC, but we cannot guarantee that it will work. **

 

 

How To Add a Remote Source

 

 

 

Once you have configured your ODBC files for the external data source of your choosing, you can set up a connection to that source in Studio by doing the following (we are using Oracle in our example):

 

  1. Expand your system -> Provisioning
    addpng.png
  2. Right click on the Remote Sources folder and select New Remote Source…
    addremote.png
  3. The main window pane will request you to enter the connection information addadapter.gif
  4. Click on the Adapter Name drop down and select the appropriate adapter (for this example we will select Oracle). The available adapters and supported versions are:
      1. ASE – Adaptive Service Enterprise: version 15.7 ESD#4
      2. Teradata – version 13 and 14
      3. IQ – Version 15.4 ESD#3 and 16.0
      4. HANA – HANA revision 60 and up
      5. HADOOP – HDP 1.3 support added SP7
      6. Generic ODBC – Used to connect to other databases that support the ODBC protocol; however, we do not guarantee that it will work
      7. Oracle – Oracle 12c support added in SP7
      8. MSSQL – Microsoft SQL Server ver11 support added in SP7
      9. Netezza – Netezza version 7 added in SP8
      10. DB2– DB2 UDB version 10.1 added in SP8
  5. Fill in the connection properties and credentials ORACLE.gif
  6. Press the execute button to save this connection
    executed.gif
    ** As an alternative, you can create a remote source through SQL using the following command: CREATE REMOTE SOURCE <Name>
    ADAPTER "odbc" CONFIGURATION FILE 'property_orcl.ini' CONFIGURATION 'DSN=<DSNNAME>' WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=<USERNAME>;password=<Password>';
  7. Press the Test Connection button to verify the connection to the source is successful
    test.gif
  8. Under the Remote Source you will now see your connection
    created.gif

 

 

How To Access The Virtual Tables

 

  1. After adding your New Remote Source, expand it  to see the users and the tables

    virtualtable.gif
  2. Right click on the table you would like to access and select ‘Add as Virtual Table’
    addvirt.gif
  3. You will then choose the alias name and the schema to which you would like to add this virtual table virt2.gif
  4. After hitting create you will get a confirmation message


    success.gif
  5. Now you can check the schema you have chosen
    virtdone.gif
  6. If you select ‘Open Definition’
    def.gif
  7. You will see under the ‘Type’ it says ‘Virtual’

typevirt.gif
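As with creating the remote source earlier, virtual tables can also be created with SQL rather than through the Studio wizard. Below is a hedged sketch - all object names (ORACLE_SRC, SCOTT, ORDERS, MYSCHEMA) are illustrative:

-- Create a virtual table V_ORDERS in schema MYSCHEMA pointing at the remote
-- table SCOTT.ORDERS in remote source ORACLE_SRC; "<NULL>" stands in for
-- sources that have no database level
CREATE VIRTUAL TABLE "MYSCHEMA"."V_ORDERS" AT "ORACLE_SRC"."<NULL>"."SCOTT"."ORDERS";

-- Query it like any local table
SELECT COUNT(*) FROM "MYSCHEMA"."V_ORDERS";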

 

 

 

 

Reported Problems

Description: After Restarting the HANA database, object privileges are lost

Resolution: This is resolved in Revision 75.  Documented in note: 2077504 - Smart Data
Access object privileges dropped from role after restarting HANA

 

Description: [08S01][unixODBC][SAP AG][LIBODBCHDBSO][HDBODBC] Communication link failure;-10709 Connect failed (no

reachable host left)

Resolution: This issue was caused by firewall blocking connectivity

 

Description: Issue connecting to Teradata
/opt/teradata/client/13.10/odbc_64/bin> ./tdxodbc: /usr/sap/odbc_drivers/unixODBC-2.3.2/DriverManager/.libs/libodbc.so: no version information available

(required by ./tdxodbc)

Resolution: tdxodbc is loading the incorrect libodbc.so library.  The $LD_LIBRARY_PATH environment variable should have the Teradata libraries appear before the UNIX ODBC Driver Manager libraries

 

Description: Numeric overflow when executing a query on a virtual table.
Error will look like:

com.sap.dataexplorer.core.dataprovider.DataProviderException:
Error: [314]: numeric overflow: 5.00 at

com.sap.ndb.studio.bi.datapreview.handler.DataPreviewQueryExecutor.executeQuery(DataPreviewQueryExecutor.java:192)at

com.sap.dataexplorer.ui.profiling.ProfilingComposite$12.run

Resolution: This issue is resolved in Revision 74.

 

Description: The GROUP BY & aggregation is not pushed down to the remote SDA source when using a graphical calculation view.
Remote Source: This incident was reported with IQ
Resolution: HANA will not be able to push down aggregate functions to IQ via SDA until HANA SPS9


Description: After configuring an Oracle Remote Source, the "Adapter Name" will switch to MSSQL (Generic ODBC)
Resolution: This is simply a display issue. It will still use the correct Oracle adapters. This issue has been reported to development, but is not currently scheduled to be fixed.

 

 

Description: After changing parameters of the SDA connection, the virtual tables disappear:
Remote Source: IQ 16.03
Solution: Edit the remote source parameters using "ALTER REMOTE SOURCE <remote_source_name>"
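For reference, a hedged sketch of what such an ALTER REMOTE SOURCE statement can look like; it mirrors the CREATE REMOTE SOURCE syntax shown earlier, and the adapter name, ini file, DSN and credentials below are placeholders:

-- Adjust the connection properties of an existing remote source
-- instead of dropping and recreating it
ALTER REMOTE SOURCE "MY_IQ_SOURCE" ADAPTER "iqodbc" CONFIGURATION FILE 'property_iq.ini' CONFIGURATION 'DSN=MYIQDSN' WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=MYUSER;password=MyPassword';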


Description: SQL Server nvarchar(max) fields don't work with Smart Data Access.  Error:

SAP DBTech JDBC: [338]: zero-length columns are not allowed: LONGTEXT: line 1 col 209 (at pos 208)

Resolution: Planned to be fixed in SP09

 

** MORE issues will be updated at the bottom **

 

Authors

 

Man-Ted Chan

Jimmy Yang

 

 

a) OS Physical Memory – The total amount of memory available on the physical host. However, the total memory allocation limit for SAP HANA usage is approximately 93-95% of this (when no global allocation limit is specified).


b) Virtual Memory – Memory reserved for SAP HANA processes from the Linux OS; this entire reserved memory footprint of a program is referred to as Virtual Memory. SAP HANA virtual memory is the maximum amount that the process has been allocated, including its reservation for code, stack, data, and the memory pool under program control. SAP HANA Virtual Memory grows dynamically when more memory is needed (e.g. table growth, temporary computations, etc.). When the current pool memory cannot satisfy a request, the memory manager requests more memory from the OS for this pool, up to its predefined memory allocation limit.


c) Resident Memory – the physical memory actually in operational use by a process.


d) Pool Memory – When SAP HANA starts, a significant amount of memory is requested from the OS for this memory pool, which stores all the in-memory data and system tables, thread stacks, temporary computations and other structures needed for managing the HANA database.


d1) Only part of the pool memory is used initially. When more memory is required for table growth or temporary computations, the SAP HANA memory manager obtains it from this pool. When the pool cannot satisfy the request, the memory manager will increase the pool size by requesting more memory from the OS, up to the pre-defined allocation limit. Once computations are completed or a table is dropped, the freed memory is returned to the memory manager, which recycles it back into the pool.


e) SAP HANA Used Memory – The total amount of memory currently used by SAP HANA processes, including the currently allocated Pool Memory. The value drops when memory is freed after each temporary computation and increases when more memory is needed and requested from the pool.
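To see these figures on a live system you can query the monitoring views directly. The statement below is a minimal sketch, assuming the standard M_HOST_RESOURCE_UTILIZATION view (column names may differ slightly between revisions); M_SERVICE_MEMORY gives a similar per-service breakdown.

-- Physical memory vs. SAP HANA used/allocated memory per host, in GB
SELECT HOST,
       ROUND(USED_PHYSICAL_MEMORY/1024/1024/1024, 2) AS "OS USED (GB)",
       ROUND(FREE_PHYSICAL_MEMORY/1024/1024/1024, 2) AS "OS FREE (GB)",
       ROUND(INSTANCE_TOTAL_MEMORY_USED_SIZE/1024/1024/1024, 2) AS "HANA USED (GB)",
       ROUND(INSTANCE_TOTAL_MEMORY_ALLOCATED_SIZE/1024/1024/1024, 2) AS "HANA POOL (GB)",
       ROUND(ALLOCATION_LIMIT/1024/1024/1024, 2) AS "ALLOCATION LIMIT (GB)"
FROM "SYS"."M_HOST_RESOURCE_UTILIZATION";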


I hope the above provides a clearer picture of SAP HANA memory on Linux. If time permits, I'll post more next on how to analyze the current memory usage of SAP HANA (code, stack, shared and heap memory) based on the analysis I have done.



You may have noticed that a number of people on SCN have received the SAP HANA Distinguished Engineer Badge! A shiny red star with a RAM chip in the middle!

HANADistinguishedEngineer75png1464f6c666d.png

And... you may be wondering how to get it. The SAP HANA Distinguished Engineer Program looks to recognize those individuals who are positively contributing to the SAP HANA Ecosystem. The badge is achieved via a HANA Distinguished Engineer Nominations process, and the HANA Distinguished Engineer Council meets periodically to review nominations. In addition, we sometimes scour SCN and other places to proactively nominate worthy individuals.

 

What does it take to be a HDE?

 

It's pretty simple. We have a SAP HANA Distinguished Engineer FAQ which describes this in more detail, but there are basically two things.

 

1) Be a HANA practitioner. You need to be working with customers on projects and have real-world experience.

 

2) Share your knowledge. You need to consistently share quality technical content with the public.

 

Everything else is open to interpretation - some HANA product managers work with customers, which is awesome. Some people only share content behind company firewalls, which we don't recognize as public content. Some create more "high level" and "marketing" content, which we recognize as valuable to the community, but we don't recognize that content for this program.

 

What is the purpose of the HDE program?

 

The HDE program looks to further adoption of the SAP HANA platform by encouraging a thriving community of practitioners and recognizing those who would be an asset to any customer project.

 

Why is the community aspect so important?

 

It's part of the core beliefs of the people who setup the program that the best way to help tech is to create a thriving community of content writers and sharing. It's the same reason why we are a huge supporter of the OpenSAP folks.

 

Also note that the HDE program is created by the community, for the benefit of customers. It's sponsored by SAP, and we are very thankful to have Saiprashanth Reddy Venumbaka and Craig Cmehil help lead it, but SAP don't own it.

 

Who can't be a HDE?

 

We get a lot of submissions from people who are really valuable to the ecosystem - trainers, sales, pre-sales, marketing. All that content is really important, but every HDE is someone who customers would want on your project team, so whilst we feel really bad when those individuals are nominated, they can't be HDEs.


We also get a lot of submissions from lots of awesome consultants who don't share technical content publicly. If you don't share content publicly, you can't be a HDE.

 

We added a "**** Please note that if there isn't public material linked here, the candidate will not be considered ****" to the SAP HANA Distinguished Engineer Nomination Form but that didn't stop some people from nominating themselves without it!

 

Wow, that's an intimidating list of people!

 

Look, you couldn't have a program like the HDE program without people like Lars Breddemann and Thomas Jung! But, you don't have to be a rock star to be a HDE, just a regular person delivering projects and sharing quality content. That said, we definitely screen the actual content that people produce; if it's in any way negative to the community (or technically inaccurate, or just copies of documentation), we'll pass.


There's actually one individual that the HDE council has invited twice and has declined twice (you know who you are!), because they don't think they have sufficient real world experience.

 

What about diversity?

 

This year, the popularity of SAP HANA has thankfully meant that the HDE program has grown past American, German and British consultants. We have HDEs from Poland, Czech Republic, Argentina, Netherlands, Sweden, Ireland, Canada, India, Brazil and China, which is really cool. But we are ashamed to say that there are no women. Let us know if you can help with that.

 

Does being a HDE help with career progression? What's in it for me?

 

That's a very tricky question because it is very difficult to benchmark. HANA is a very hot technology and experienced resources are always in demand, and the HDE brand is definitely intended to be a mark of good quality resources, but it's up to individual employers to recognize this. Other programs like Microsoft's MVP program are considered to be positive to careers, so it does stand to reason.

 

As for what's in it for you, sharing concepts makes a consultant more rounded and a better communicator. The resume has been replaced by LinkedIn and many employers look for individuals with a brand and referenceability. HDEs get opportunities to speak at events, webinars, to write books and other activities. If you don't see that as good for your career then that's cool, the program just isn't for you.

 

So how do you get that badge?

 

There are four simple steps!

 

1) Sign up to SCN. The home of the HDE program is SCN, so you do need a SCN ID to get the badge!

 

2) Get yourself on a HANA project. You're going to need that real world experience!

 

3) Share what you learnt. Everyone shares in their own way and we don't prescribe a particular way. It can be speaking, writing blogs, forum activity, webcasts, podcasts. Whatever you like. You can be active on SCN, Slideshare, Twitter, Stack Overflow or anywhere else you choose, but remember the content has to be public. That training session you delivered to your peers in Walldorf doesn't count!

 

4) Nominate yourself, or wait for someone else to nominate you. HDEs are chosen on merit, so it's just fine to nominate yourself, we don't mind.

I am carefully following the latest HANA (and other unimportant stuff) surveys and SAP representatives’ reaction to it. You may want to do some reading on what happened before reading my blog. If so then here are the useful sources:

  1. ASUG survey that started the whole thing: ASUG Member Survey Reveals Successes, Challenges of SAP HANA Adoption
  2. Blog by Jelena Perfiljeva on the ASUG survey topic and the stir around it: Y U No Love HANA?
  3. Blog by Steve Lucas on the ASUG survey: Thoughts on the ASUG Survey on SAP HANA
  4. Hasso Plattner himself on the ASUG survey results: The Benefits of the Business Suite on HANA
  5. DSAG survey: Demand for innovation potential to be demonstrated
  6. Dennis Howlett on the DSAG survey: Analyzing the DSAG survey on SAP Business Suite on HANA

 

If you made it here, let me share some thoughts with you. I must warn you that I don’t have much real life exposure to HANA and my thoughts are primarily based on what I consume from various sources like SCN or SAP official marketing. Combined with the customer surveys my information sources are very well mixed and hilariously contradict with one another.

 

The SolMan story

But to the point. Namely to the title of this article. When you read about HANA and customer being not so hot about it, what does that remind you about? It reminds me about the Solution manager. Note that I am a Security and Authorizations consultant working primarily with the basis component (which is the foundation of all ABAP based systems). I get to work with Solution manager a lot. I don’t claim to be a SolMan expert but I have enough of them around me to be reasonably well informed about what they do and how is the market for them as well as get to hear feedback from the customers.

I don’t want to discuss some recent confusion and disappointment among customers about changes in the SolMan functionality. I believe the SolMan team on the SAP side is a team of seasoned engineers and they know what they’re doing. What I want to concentrate on is the perception of SolMan as a symbol of the SAP basis and infrastructure as a whole. And that is very much the same bucket where HANA ends up as well.

Every customer that runs an ABAP based system must run the basis component (BC is the good old name), which means database, user administration, roles and profiles, performance, custom development etc. Every customer must run this and have an internal team (often combined with an external one) to run the systems. SolMan is something that wise people see as a central hub for many (if not most) of these things and if you deploy and use the SolMan wisely, it offers huge benefits. You run jobs in your systems? I am pretty sure you do. Boom here comes the SolMan central monitoring. You do custom development? Whoosh here comes SolMan’s CHARM, CTS+, CCLM etc. Seriously for many basis things (for security things less so) SolMan offers some way how to run everything centrally which in my opinion provides some nice benefits.

But how come I see so many customers not investing in centralized basis operations via SolMan? How come SolMan is axed in the first wave if the budget is cut? How come so many people are trained on MM-Purchasing (random business example) but SolMan experience and understanding of the big picture is so rare?

In my opinion the problem is the following. The companies have a fixed budget they spend on IT. Part of the fixed budget is a fixed budget for SAP. The budget is fixed. Non-inflatable. No magic. Fixed. The SLA between the shared services centre, the competence centre or how you call the team or organization that provides the SAP services (and runs the systems) says that the functionality that keeps the business running must perform well, be secure and available, patched, people trained etc. It is a necessary evil for the rest of the company to have this IT basement and their budget, but that has limits. The budget is fixed and the outside perspective (and priority setting) is on what keeps the company running and making money. Tell me, haven’t you ever joked about “being just a cost center”? We don’t make money, we just keep the servers going.

So back to the SolMan. To leverage the SolMan powers you need trained and knowledgeable people. Such people don’t come cheap. Even more so every year we are older because you have these shiny start-up hubs all around the world, you have cool companies worth billions (like Facebook and Google with free food and a laundry service) and they push prices (of the smart heads) up as well as the number of available smart heads down. Anyway you know what I mean. More costs on people, on their training, on making them happy. Then you need hardware to run the SolMan on, you need to pay for the license, you need external support time to time, more patching, more auditing etc.

And what is the value? What is the benefit? Once you bend the company’s (IT team’s) processes around the SolMan (and win the motivation of the basis folks for the SolMan, all of them!) then you can see some (substantial?) savings (in the hopefully not-so-distant future). And all that only if people commit to use the new features and the size of your organization makes it easier to reach the point of break-even during this lifetime still.

So let’s briefly summarize:

  1. Initial investment in the people, hardware and software
  2. Investment into the change management process that would readjust your people’s mind-sets and processes around the new tool.
  3. Savings are waiting for you in the future, some of them are rather theoretical and others will only arrive into your budget pool if everyone joins the effort.
  4. The good news is that SolMan is around for long enough that you have the knowledge spread around pretty well so hiring someone for your team to run the SolMan is not like hiring a Tesla engineer.
  5. In my opinion good news is that by slowly consolidating some of your process on the SolMan now and others later gives you the possibility to pay a series of small prices and get a series of smaller returns over the time.
  6. Last but not least SolMan does not do that many things that you can’t do without it. Can you name any such things? You can only do them on SolMan? Not with Excel or lots of clicking in the local system?

SolMan is being underestimated. Underused. Underappreciated.

 

The SolMan syndrome

Now back to HANA. Did you try to replace SolMan with HANA in some of the comments above? Just try it. How come so many people are trained on MM-Purchasing (random business example) but HANA experience and understanding of the big picture is so rare? To leverage the HANA powers you need trained and knowledgeable people. Once you bend the company’s (IT team’s) processes around HANA… Last but not least, HANA does not do that many things that you can’t do without it, with Excel or lots of clicking in the system… I know you can see where I am coming from, right?

At this point I am taking a ten minutes break to re-read the Hasso Plattner’s blog…

We can immediately filter some of his points out as they are irrelevant for customers (or maybe it is better to say they are irrelevant for me, and I am a trained SAP engineer: I do this for a living, the future of my family depends at least partly on the success of SAP, I engage on SCN and I talk to SAP engineers… I think I have shown more dedication and loyalty than most customers).

Anyway I don’t want to argue here with Mr. Plattner as I respect him very much so I will paint my own picture here and you, dear reader, can choose what is closer to your everyday reality.

The two most important things about HANA are:

  1. The cost that the customer must pay for the new ride
  2. The benefit received for that cost

I don’t run a HANA system myself and only a few of my customers do (and they all run BW on HANA regardless of what other HANA options are). So I don’t have any idea about the costs (other than some mentions on the SCN). But I assume these costs are not low. They can’t be for cutting edge innovation (…see more kittens dying?).

We could go on about costs here, but you are a smart person, dear reader, you can get a rough picture about the costs yourself. It is also a bit unfair to complain about costs. In my opinion if the benefit outweighs the costs, it is worth it no matter what the cost is. So let’s concentrate on the value and especially the obstacles in reaping the value and benefits.

As I see it there are two benefits: speed and simplification.

Well, let’s start with speed. Let’s assume I can pick random customers of mine and HANA would boost their business through the roof (since the costs go through the roof as well, because I need to pay for the HANA show, the benefit also has to go through the roof just to break even).

Let’s try … a car manufacturer for example (random example, ok?). I have a production line that builds cars. This production line is built very efficiently. This production must never stop (ok, rarely stop in a controlled and planned manner). If I want to improve my earnings or savings, what do I do? I take a screw that I use 50 times in every car and make it 2% cheaper (replace screw with any other part with a value, if you’re from the car manufacturing business; screw is just an example, ok?). How can speed of my IT speed up my business?

Readjusting the production line based on some HANA invention seems to be out of the question – time consuming, expensive etc. (correct me if I am wrong, I welcome a discussion).

Would I change my supply chain based on the HANA fast data? How? I have dozens of main suppliers I depend on, they each have dozens of their suppliers they depend on. I have my supply chain diversified to reduce the risk of my production line going down because I am out of screws. I don’t see HANA helping me with my supply chain. I have long term contracts with my suppliers (which are not easy to change) and I have Just-in-time (JIT) delivery to be super-efficient. Still no signs of HANA here.

Can I improve my distribution channels based on HANA? Maybe I can ship some cars to a country XYZ because I can see a tendency of the demand to go a bit up there. Normal mortals that order a new car either pay (or are given a voucher) for a speedy delivery (anything under 3 months or so) or they just wait for those three months. Does sending a couple more cars (that can’t be customized and must be sold as I built them) improve my numbers?

I am not selling the cars I am producing. How can HANA sell my cars? Maybe I am late to the market with the model. Or it is too expensive compared to my competitors. I can either see it (base it on numbers) or not. But if I get the results of such analysis a day faster (assuming HANA cut the time of a long running job from a day to 8 minutes), how does it matter? What is a day in a life cycle of a car model?

 

On speed and people

Speed. That sounds cool right? Car manufacturers sell fast cars for a premium. People like fast cars. Do you like fast cars, dear reader? I would certainly try a couple of them on a German autobahn.

But do people like speed? I don’t think so. Speed means deadlines. It means thinking fast, acting fast. Sometimes it means making mistakes. It means facing risks. It means stress. It means swimming into the unknown with the pace that leaves less than our usual time for re-adjustment. That makes us uncomfortable. Discomfort. I don’t like that. Here is my comfort zone. I don’t want to go …there. I want to stay here. Inertia. Action and reaction.

Sorry for the emotional detour. What I am trying to say is that processes are run by people. They don’t run in machines. No matter how fast one report is (whether it runs on HANA or not) there are people that work with the machine, that provide inputs, collect outputs etc. There is a threshold when system performance becomes a pain. See a website that takes 30 seconds to load. That is annoying right? But if that report that you only run once a week for your team meeting to discuss it takes 12 second or 14, does it matter? Or let’s say that report takes 2 minutes to run. If you could push that down to 2 seconds, would you run the report more often? If you ran the report more often, would there be a benefit for you, your boss or your company in you doing it?

You can’t change people. At least not easily. For many people – the normal mortals and coincidentally users of a SAP system – the IT thing and the whole SAP system is a black-box. That means that when your secretary types in your travel expenses, she will not do it faster because this system runs on HANA. She does not know about HANA. She does not care either. Let’s say you work in the company’s IT and your boss decides your budget (reality, right?). Your boss is not an IT engineer (no matter if it is a lady or a gentleman) or even if so, it is not a HANA fanatic. Probably not even a SAP fanatic. How do you sell such a person your most recent HANA ambition?

If you have been in the business for long enough, you must have heard the expression “bend the company around SAP”. Let’s put aside the fact that SAP brings some great industry best practices and such bending can bring a lot of good into a company. People don’t like this. They will change their ways if the stimulus is strong enough (less work?) or the pressure is big enough (you must do it or you go). See the iPhone. I don’t like Apple in general, but I can see how the iPhone had this strong stimulus when it was introduced (it was idiot-proof to use, colourful, the entry barrier very low since it is idiot-proof etc.) and that is why it became a huge success. Is this the case with HANA? No. Huge adoption barriers and unclear benefit (for a normal mortal, an iPhone type of user). People will stand their ground. You want to bend your company around the new opportunities and reach for new horizons? Well, you must fire the people or wait for them to die (meaning their career at your company…).

IT is a black box for them. It must just work. They don’t care if you run SolMan. They don’t care if you run HANA (unless there is a problem with a vital threshold – like the one with the web page response – how many companies have such hard thresholds? Retailers maybe. Who else?). Technology shift that brings light-fast speed is not the killer trick.

 

...then it must be the simplification!

Then it must be simplification. Hm, ok. What could that mean? Mr. Plattner drops some hints. Simplified ERP? All my systems (CRM, SRM, ERP) put together? No BW system (because it is not needed)?

That sounds very very cool. If you’re the marketing guy and you buy what you sell. Reality check?

It does not sound like you take what you have (current ERP, current CRM, all the systems that you are currently running), push a button and voila… a sERP system. I still remember that OSS support ping pong when the landscape optimizer product (or whatever the official name was) was introduced. So I don’t see how I would easily put my current systems together into one.

I am a developer. I can see the loads of code that live in my system. Tons and tons of code where the quality varies and the “Date create” varies from 199X to yesterday. I have customers with systems full of non-Unicode programs. How could one turn this around into a sERP easily? As a developer myself I know that it is probably easier (from the development process organization point of view, quality standpoint etc.) and also right (because of the new software design trends etc.) to start over. Oh. But that means several things.

That means that SAP will probably start over with what they have. Either partially or completely. That means new bugs. New support ping pongs. New products aspiring for maturity which will take years and loads of frustration.

That also means I will have to implement or re-implement what I have. More consulting. More money needed. And spent. More dependency on externals that learn fast enough to keep themselves up-to-date with what SAP produces.

What about my custom code? If I have this simplified ERP thing now, it has a different data model. Different APIs. Different programs. I may not need my custom programs anymore. Or I may end up with a need for more. Gosh. More assessments. More upgrades. More development. More audits. More.

That was my company. But things also get personal. It is my job that we are talking about here, and what happens to my job when there is no CRM, no SRM etc.? Unless I am overlooking something, BW comes first here. If BW is not needed anymore because everything is real-time and I am a BW specialist, what will I do for a living? I am not needed anymore. I am obsolete. A dinosaur. A fossil. How many customers are out there running BW? What happens with those people if BW is not needed anymore? Will that happen fast? Or over a ten-year period so they can adjust themselves and still keep their families fed and happy and safe?

When I hear about simplification in other areas these days, people translate it into job cutting. Simplified, lean, that means people will get fired. Not every company is so smart to understand that by automating or simplifying things you can give more advanced, more innovative type of work to people that don’t have to perform repetitious tasks anymore. That would boost their motivation. They would push the horizons further. They would have fun doing it (not necessarily everyone, but ok). Some companies lay people off instead.

I know, dear SAP, that you mean well. But you need to explain that better. You need to give people evidence, roadmaps (with meat on the bones), set expectations right, explain how we get from point A to B so that everyone is still on board. Remember SEALs? We leave no man (customer) behind. Tell us how you plan to do it. Dispel fear, confusion.

I know simplification is good. I like the Einstein quote (if it was Einstein, I hope so): “If you can’t explain it simply, you don’t understand it well enough”. I don’t think that is the case with SAP. SAP invented the ERP as we know it (my opinion). The data model and the processes and the customizing, the know-how collected and invented hand in hand with millions of customers, all that is super impressive. I am sure SAP will know how to simplify because they know the business well enough (I am just a bit afraid that the individuals that will be responsible for the simplification process will not deliver on a consistent quality level, but that is a different story).

Back to the SolMan beginning. I didn't mean to criticize HANA. I didn't mean to criticize SolMan either. Both products are great. But the way they're sold and the way they are perceived is in my opinion very similar. You don't need them. They improve something you already have. Challenging.

But all will be well one day. For HANA. For customers. For SAP. But it is not that easy how SAP marketing sees it. You still have room for improvement there and please consider if it is not a good idea to fill that room with guidance, with numbers, with evidence, with fighting with your customers hip-to-hip. It is not you and them. It is us.

 

p. s.: Are you a normal mortal and want to read more on HANA? Consider Owen Pettiford’s Impact of Platform Changes: How SAP HANA Impacts an SAP Landscape. I quite liked it, although it is just an overview.

Hay Bouten

Keeping an eye on SAP HANA

Posted by Hay Bouten Oct 17, 2014

During my SAP HANA classes I often get the question “How should I monitor the SAP HANA systems?” I think that is a good question and in this blog I want to explain how you can monitor your SAP HANA systems.

 

The monitoring options for SAP HANA systems

 

As an SAP System Administrator you need to keep your SAP HANA systems up and running. So what would be a good way of doing this? There are two options, and depending on your needs and available infrastructure you should decide which option suits you best.

 

  • You can monitor your SAP HANA systems using SAP HANA Studio
  • You can monitor your SAP HANA systems using SAP Solution Manager

 

 

Monitoring SAP HANA systems using SAP HANA Studio

 

This solution is best for customers that have no SAP ABAP footprint. The monitoring can be done using the monitoring tools included in SAP HANA Studio. Using SAP HANA Studio you can monitor the following areas:

 

  • Overall System Status using the System Monitor view
  • Detailed System Status using the Default Administration view
  • Memory usage per system or service using the Memory Overview view
  • Resource usage using the Resource Utilization view
  • System Alerts using the Alert view
  • Disk Storage using the Volumes view
  • Overall view using the SAP HANA Monitor Dashboard

 

Overall System Status using the System Monitor view

 

In the System Monitor you will find a status overview of all the SAP HANA systems connected to your SAP HANA Studio. As you can see in the screenshot below, it shows the system status including some important metrics. With this monitor you can quickly see if all systems are functioning within normal specifications.

 

SAP_HANA_System_Overview.png

 

Detailed System Status using the Default Administration view

 

In the Detailed System Status view you get a detailed overview of the most important metrics of the selected SAP HANA system. With this information you can see if all the different areas (memory, CPU and disk) are running within the given thresholds. If not, you can follow the links for a deeper investigation.

 

SAP_HANA_Admin_Overview.png

 

Memory usage per system or service using the Memory Overview view

 

I think that memory usage is one of the most important metrics in an SAP HANA system. That is why there are several nice views available to show the overall memory usage and the usage per service. These views also give the option to look at the memory usage in a specified time period. This can be very useful to investigate what happened last night.

 

Memory Overview per System

SAP_HANA_Memory_Overview.png

 

Memory Overview per Service

SAP_HANA_Service_Monitor.png

 

Resource usage using the Resource Utilization view

 

The resource monitor lets me look at CPU, memory and storage metrics in a combined view. It also shows me the graph for a specified time period. Using this monitor gives me the opportunity to find the root cause of a problem.

 

SAP_HANA_Resource_Utilization.png

 

System Alerts using the Alert view

 

The System Alerts view shows all the alerts that have been triggered in the system. I can specify my own threshold values, and for the alerts that I think are important I can set up email notification. With this email notification I get informed about important alerts even when I'm looking after other systems.

 

SAP_HANA_Alert_View.png
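The same alerts can also be read with SQL. Below is a minimal sketch, assuming the _SYS_STATISTICS.STATISTICS_CURRENT_ALERTS view of the embedded statistics service (view and column names can vary slightly between revisions):

-- Currently open alerts, most severe first
SELECT * FROM "_SYS_STATISTICS"."STATISTICS_CURRENT_ALERTS" ORDER BY ALERT_RATING DESC;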

 

Disk Storage using the Volumes view

 

Even in an in-memory database disk storage is important, so using the Volumes view you can keep track of the filling level and the I/O performance.

 

SAP_HANA_Volumes_View.png

 

The SAP HANA Monitoring Dashboard

 

As of SAP HANA SPS08 there is also an Administration Dashboard available that shows the most important metrics in a SAP Fiori dashboard.

 

SAP_HANA_Dashboard.png

 

I have also recorded a video showing all the SAP HANA Studio monitoring features.

 

 

Monitoring SAP HANA systems using SAP Solution Manager

 

This solution is best for customers that use SAP Solution Manager already for managing and monitoring SAP systems. SAP HANA is added to SAP Solution Manager as a Managed System and from there you can setup Technical System monitoring. The Maintenance Optimizer (MOPZ) and DBACOCKPIT are also fully operational.

 

Using the transactions SM_WORKCENTER and DBACOCKPIT gives you as many monitoring capabilities as (maybe even more than) what is possible in SAP HANA Studio.

 

I have created a few videos to demonstrate the transactions SM_WORKCENTER, the Maintenance Optimizer (MOPZ) and DBACOCKPIT.

 

Monitoring SAP HANA using transactions SM_WORKCENTER


This video shows how you can monitor a SAP HANA system using SAP Solution Manager 7.1 SP12

 

Monitoring SAP HANA using transaction DBACOCKPIT

 

This video shows the transaction DBACOCKPIT in a SAP Solution Manager 7.1 SP12 connected to a SAP HANA system.

 

Requesting an SAP HANA Support Package Stack using MOPZ

 

This video shows how you can use SAP Solution Manager 7.1 SP12 to request and download a Support Package Stack for SAP HANA.

 

If you want to learn more on SAP HANA and SAP Solution Manager then visit the SAP Education website. There is a curriculum for SAP HANA Administration and SAP Solution Manager.

 

You can also visit the SAP Learning Hub and have a look in my SAP Learning Room "SAP HANA Administration and Operations".

 

SAP_HANA_Learning_Room.png

 

Have fun watching over SAP HANA

Shrink your Tables with SAP HANA SP08

 

Abani Pattanayak, SAP HANA COE (Delivery)

Jako Blagoev, SAP HANA COE (AGS)

 

 

Introduction:

 

Yes, with HANA SP08 you can significantly reduce the size of your FACT tables. Depending on the size of the primary key and the cardinality of the dataset, you can get significant (up to 40%) savings in static memory usage.

 

This memory saving is relative to HANA SP07 or earlier revisions.

 

So what's the catch?


There is no catch.


The saving is based on how the primary key of the table is stored in HANA. Check the biggest FACT table in your SP07 or SP06 HANA database: the size of the primary key will be around 30 - 40% of the total size of the table.
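You can verify this on your own system with the column-store monitoring views. The two statements below are a hedged sketch, assuming the internal concatenated key column is named $trexexternalkey$ as usual; MY_FACT_TABLE is just an example table name:

-- Memory consumed by the internal primary key column of a table, in MB
SELECT SCHEMA_NAME, TABLE_NAME, ROUND(SUM(MEMORY_SIZE_IN_TOTAL)/1024/1024, 2) AS "PK SIZE (MB)"
FROM "SYS"."M_CS_ALL_COLUMNS"
WHERE COLUMN_NAME = '$trexexternalkey$' AND TABLE_NAME = 'MY_FACT_TABLE'
GROUP BY SCHEMA_NAME, TABLE_NAME;

-- Total in-memory size of the same table, in MB, for comparison
SELECT SCHEMA_NAME, TABLE_NAME, ROUND(SUM(MEMORY_SIZE_IN_TOTAL)/1024/1024, 2) AS "TABLE SIZE (MB)"
FROM "SYS"."M_CS_TABLES"
WHERE TABLE_NAME = 'MY_FACT_TABLE'
GROUP BY SCHEMA_NAME, TABLE_NAME;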

 

With HANA SP08, we can eliminate this 30 - 40% of memory taken by the primary key, and there is no negative impact on query performance.

 

 

Show me the Money (What's the trick)?


You need to recreate the primary key of the table with INVERTED HASH option.

 

CREATE COLUMN TABLE "SAPSR3"."MY_FACT_TABLE"(

        "STORID" NVARCHAR(10),

        "ORDERID" NVARCHAR(15),

        "SEQ" NVARCHAR(10),

        "CALMONTH" NVARCHAR(6),

        "CALDAY" NVARCHAR(8),

        "COUNTRY" NVARCHAR(3),

        "REGION" NVARCHAR(3),

..

..

        PRIMARY KEY INVERTED HASH ("STORID",

        "ORDERID",

        "SEQ",

        "CALMONTH",

        "CALDAY",

        "COUNTRY",

        "REGION"))

WITH PARAMETERS ('PARTITION_SPEC' = 'HASH 8 STORID')

;

 

You can use the ALTER TABLE command to drop and recreate the primary key of the table, as sketched below.
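This is a hedged sketch of what that can look like, using the table and columns from the example above; please verify the exact INVERTED HASH clause for ALTER TABLE against the SQL reference of your revision before running it on a production table:

-- Drop the existing primary key, then recreate it as INVERTED HASH
ALTER TABLE "SAPSR3"."MY_FACT_TABLE" DROP PRIMARY KEY;

ALTER TABLE "SAPSR3"."MY_FACT_TABLE" ADD CONSTRAINT "MY_FACT_TABLE_PK" PRIMARY KEY INVERTED HASH ("STORID", "ORDERID", "SEQ", "CALMONTH", "CALDAY", "COUNTRY", "REGION");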

 

However, if you have a scale-out system or a really BIG fact table with billions of records, we'd highly recommend creating a NEW table with an INVERTED HASH primary key and then copying the data over to the new table. Then rename the tables.

 

Result

 

The following is the result of updating the primary key in a customer project. As you see below, the saving in static memory is around 531 GB out of 1980 GB.

So overall, there is a saving of at least 2 nodes (0.5 TB each) in a 9-node scale-out system.

 

The best part of this exercise: there is no negative impact on query performance.

 

Note: I'd suggest you review your existing system and evaluate if you can take advantage of this feature.

 

Shrink Table.PNG
