cancel
Showing results for 
Search instead for 
Did you mean: 

SQL Anywhere 17.0.0.1358 constantly crashes (assertion fails 200505, 101412)

Former Member
0 Kudos

For following week me and my team had problem with SQL Anywhere 17 (17.0.0.1211 and 17.0.0.1358) constant assertion crashes.

Database was completely reloaded and engine updated to the latest version, but problem still exists. It usually it happens 4-5 times a day.

Here are the errors, they randomly alternate.

SQLANYs_...: *** ERROR *** Assertion failed:  200505 (17.0.0.1211)

Checksum failure on page 64845

SQLANYs_...:: *** ERROR *** Assertion failed: 101412 (17.0.0.1211)

Page number on page does not match page requested

What could be the culprit here? Could this be the fault of hardware or DBMS engine failure? We tried request logging, but catching a faulty query in our case is close to impossible. Furthermore, after assertion, DMP file appeared, which pointed to a page failure on TEMP DbSpace. So far, we have no solution how to reproduce the error.

Accepted Solutions (0)

Answers (5)

Answers (5)

Former Member
0 Kudos

Tried redirecting TEMP file to a different media (SSD drive), still crashes.

Yes, it does generate file called assert.dmp.

Inside this file:

Invalid page at page:     31a9 in file 15

... and then page dump info...

I think file 15 refers to temporary file, because in sysdbspace view, temporary DbSpace is number 15.

For example: 15;316;temporary;1;(NULL).

Former Member
0 Kudos

You are correct.  That is a temporary file page.

Further, of possible interest, is the number of pages 'in' is 0x31a9 == 12,713  or 49.66 MBs (with a 4KB page size) and that may mean you are hitting above your cache size.  Maybe increasing -ch can help relieve this?

Have you contacted support yet?  Your assert.dmp file is not a process dump. 
Dumps for crashes can be identified by running
    `dbsupport -lc`
and if submitted by running

    `dbsupport -ls`
and submitted by running
      `dbsupport -sa`

Do contact support once you have identified/submitted those.

Former Member
0 Kudos

We've tried unloading database few times, running database on different hardware configurations, enabled request logging of SQL statements, updated DBMS version. However, nothing helped so far.

Environment parameters:

OS: MS Server 2012 R2

DMBS: SQL Anywhere 17.0.0.1359

From what we could understand, error seems to be not deterministic. Request logging information showed that these assertions can be triggered by different SQL statements.

It seems to us, this is a core problem. As end users of SQL Anywhere we cannot work around this issue, because we don't know exactly where and at which specific conditions does it crash.

In addition to this, in order to reproduce the error, database service has to be restarted.

Here is the example that could possibly trigger assertion failed crash:

call ComplexProcedure(...); // Fills global temporary table with approx. 1.5 million rows

select * into tmp_crash from global_temp_sample_table where varchar_field like '500001%' and other_varchar_field like '900000%' and date <= now(); <- Assertion is triggered here

select * from tmp_crash; // <- Here the actual error (or connection closed error) message appears

In addition to this, we noticed that assertion is only triggered if there are LIKE conditions varchar_field like '500001%' and other_varchar_field like '900000%'. All these columns are INDEXED (varchar_field, other_varchar_field). If indexes are dropped, database doesn't crash.

Can you give us any advice how to remedy this issue? Or should we just report this to SAP directly?

Former Member
0 Kudos

Since it is crashing, the database server should be generating dumps.   Do try submitting those dumps using `dbsupport -sa`.  With those in hand you should be working directly with support to investigate thie further.  It does sound like you are close to creating a test case to show this issue happening.  That could go a long way to determining the fault.

Is there still an indication that the temporary files are involved? {You have not provided any evidence of that yet} If so, have you tried moving the location where the SQL Anywhere temp files are stored;using the SATMP environment variable or the -dt switch. It is possible there is nothing wrong with the main database files or the media they are stored on. That could explain why validation and rebuilding the database is not finding any issues.

former_member186998
Contributor
0 Kudos

Please refer to the following note.

1959030 - How To Salvage Data When There are Corrupt Pages in the Database

Thanks,

Atsushi

former_member188493
Contributor
0 Kudos

Bad memory often presents itself as bogus disk errors... since all disk data must pass through RAM before any software can calculate checksums, it's easy to see how this can happen.

Bad memory errors can become actual disk errors, if data corruption goes undetected until after the data has been written to disk... so the disk errors may not be bogus any longer, but the disk was not the cause.

So.... always check the RAM in cases like this.

Former Member
0 Kudos

Both of those assertions are known to be related to hardware failures. In fact, the 200505 checksum assertion was specifically added to detect media failures after pages have been written to the media. Further, the 101412 assertion is one of the top 1 or 2 assertions I usually see when there has been a media failure (ie corrupt database); when checksums are not in use of course. 

So hardware failures would be a leading candidate cause here.  If you are seeing the same page numbers come up than I would expect a bad write to media or a bad spot on the media is involved.

There is always a possibility that htis is something different but I would start looking at the more probable media aspects of these.

Also, once a bad write has happen every time you cycle around to the same broken page you will get an assertion and that may explain the frequency of the occurrance.

     {How did you determine the page in question is in the Temporary dbspace?

      If you have been working with Product Support on this issue I would urge you
      to continue to do so; so we can investigate this more deeply.}

IFF it is always the temporary file and media that is involved, then, I would suggest seeing if setting
the %SATMP% or %TEMP% or %TMP% to force the server to run on another device and check to see if that changes the behaviour any.  If it does then the media itself is truly questionable and where you should focus most of your efforts.

If there is corruption in the main dbspaces then you probably should validate the database and possibly rebuild it to see if you have captured bad pages permanently inside the permanent database files.  Maybe switching to a different machine is possible, and if it is that could help to isolate the failing component.