MobiLink 16 - Invalid sync sequence ID for remote ID

Former Member

Hello!

Previously I asked questions on forums.sybase.com, and this is my first post here, so hopefully I am in the right place.

I am seeing a strange situation in a Production environment that I can't reproduce in Dev environments. Some mobile users are reporting that their synchronization stops working, and the following error appears in the MobiLink logs (user-specific information suppressed):

I. 2015-03-31 17:43:18. <15> Request from "UL 16.0.2041" for: remote ID: 3, user name: XXX, version: XXX

I. 2015-03-31 17:43:18. <15> The sync sequence ID in the consolidated database: 8532358e03454d7db35f8c29093b2aad; the remote previous sequence ID: 0c5d767ebaed4f82b60373b992d81d87, and the current sequence ID: 72fbaada560b4f86a6d69b09ff2edfd9

E. 2015-03-31 17:43:18. <15> [-10400] Invalid sync sequence ID for remote ID '3'

I. 2015-03-31 17:43:19. <15> Synchronization failed

As far as I know, this kind of problem would occur only if an old version of the remote database was somehow restored on the device - this is the only way I can reproduce it. However, my Service Desk confirmed that they (or the users themselves) are not messing with the database file in any way.

Is there anything I could do to pin down this problem? What else could let these sequence IDs get out of sync?

Accepted Solutions (1)


jeff_albion
Employee

Hi Andre,

Do you know what happened with remote ID 3 prior to this synchronization?

The error basically means that the consolidated database and the remote UltraLite database are out of sync - either the remote database was changed via a backup copy being restored (to be 0c5d7...1d87, instead of 85323...2aad) or the consolidated database was changed from a restored backup (from 0c5d7...1d87 to 85323...2aad).

The other explanation is that it might also happen if you have the same remote ID synchronizing to two MobiLink servers and the remote isn't 'cancelled' on the other server prior to it being seen again. What does your MobiLink infrastructure look like, and how many servers are you using?
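To compare the three IDs across many log entries, the message quoted in the original post can be parsed mechanically. The following is a minimal Python sketch, not part of MobiLink: the function name `classify_sequence_ids` and the state labels are made up for illustration, and the regular expression assumes the message text looks exactly like the log line shown in this thread. The labels only describe which IDs are equal; they do not claim to reproduce the check the server performs.

```python
import re

# Pattern for the informational message quoted in the original post.
# Assumption: other builds may word this message differently.
SEQ_LINE = re.compile(
    r"sync sequence ID in the consolidated database: (?P<consolidated>[0-9a-f]{32}); "
    r"the remote previous sequence ID: (?P<previous>[0-9a-f]{32}), "
    r"and the current sequence ID: (?P<current>[0-9a-f]{32})"
)


def classify_sequence_ids(log_line):
    """Return the three IDs plus a descriptive label, or None if the line does not match."""
    match = SEQ_LINE.search(log_line)
    if match is None:
        return None
    ids = match.groupdict()
    if ids["consolidated"] == ids["current"]:
        ids["state"] = "consolidated matches remote current"
    elif ids["consolidated"] == ids["previous"]:
        ids["state"] = "consolidated matches remote previous"
    else:
        ids["state"] = "consolidated matches neither remote ID"
    return ids


sample = (
    'I. 2015-03-31 17:43:18. <15> The sync sequence ID in the consolidated database: '
    '8532358e03454d7db35f8c29093b2aad; the remote previous sequence ID: '
    '0c5d767ebaed4f82b60373b992d81d87, and the current sequence ID: '
    '72fbaada560b4f86a6d69b09ff2edfd9'
)
print(classify_sequence_ids(sample)["state"])
```

For the log line from the original post, the consolidated ID matches neither of the two IDs reported by the remote.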

Regards,

Jeff Albion

SAP Active Global Support

former_member188493
Contributor

What is this mysterious "sequence ID"? Where is it stored, how is it generated, and what is it used for?

There is a consolidated database BINARY ( 16 ) column in ml_database called seq_id but it seems to be NULL. No corresponding column seems to exist in the SYS tables on the remote side.

If there is a tracking mechanism for synchronizations, surely it should be fully documented.

Former Member

Hello Jeff,

We do have two MobiLink servers behind a load balancer. I don't have the exact load-balancing rules at hand, but looking at the logs of both servers, I don't see the same user syncing at the same time on both servers.


What I do see, though, is that before the sequence ID error, the user experiences some network instability and the sync drops, generating the following log:

07:54:49 Request from "UL 16.0.1823" for: remote ID: 131, user name: XX, version: XX
07:54:49 The current synchronization is using a connection with connection ID 'SPID 107'
07:54:49 The authenticate_parameters script returned 1000
07:54:49 COMMIT Transaction: Authenticate user
07:54:50 COMMIT Transaction: Begin synchronization
07:54:51 COMMIT Transaction: Upload
07:54:52 COMMIT Transaction: Prepare for download
07:54:56 Sending the download to the remote database
07:54:56 COMMIT Transaction: Download
07:55:27 COMMIT Transaction: End synchronization
07:59:01 [-10279] Connection was dropped due to lack of network activity
07:59:01 Synchronization complete

Then, right after, the mobile app retry logic kicks in, and another sync is requested, which fails with the following reason:

07:55:04 Request from "UL 16.0.1823" for: remote ID: 131, user name: XX, version: XX
07:55:04 The current synchronization is using a connection with connection ID 'SPID 95'
07:55:04 [-10002] Consolidated database server or ODBC error:  ODBC: [Microsoft][SQL Server Native Client 11.0][SQL Server]Lock request time out period exceeded. (ODBC State = 42000, Native error code = 1222)
07:55:04 [-10002] Consolidated database server or ODBC error:  ODBC: [Microsoft][SQL Server Native Client 11.0][SQL Server]The cursor was not declared. (ODBC State = 42000, Native error code = 16945)
07:55:04 [-10343] The remote database identified by remote ID '131' is already synchronizing or the database connection is unusable: unable to access the lock for that remote ID
07:59:08 [-10279] Connection was dropped due to lack of network activity
07:59:08 Synchronization failed

Thereafter the sequence ID error appears, and this user cannot sync anymore until they delete the remote database and start again.

jeff_albion
Employee

Hi Breck,

These are internal UltraLite progress offsets (similar to SQL Anywhere progress offsets: DocCommentXchange). In previous versions, the progress offsets were simple integers, but they were changed to GUIDs in version 16 to avoid ambiguity in interpreting the integer when using multiple MobiLink servers.


"There is a consolidated database BINARY ( 16 ) column in ml_database called seq_id but it seems to be NULL."

It shouldn't be for an UltraLite remote - here is what I see after a successful synchronization:


>> rid,remote_id,script_ldt,seq_id,seq_uploaded,sync_key,description

1,'cf54a0d9-49de-4ae3-bd9d-4e899305ebee','1900-01-01 00:00:00.000',0xf83327fccaa84c138dacaa249d050981,1,'f32c1ee20c5f4d25881701576c6bf273',

Regards,

Jeff Albion

SAP Active Global Support

Former Member

It would help to have more details about the inner workings, since the lack of them leaves us guessing too much.

For example, through empirical testing, I can say that on the server side the sequence ID is changed when the upload transaction commits. So, if the upload succeeds but the download fails, the server sequence ID is still incremented.

I also assume that between the upload and the download (more precisely, right after the server commits the upload), the server sends the new sequence ID to the client so that the local database is updated as well.


What I don't know is what happens if the server cannot contact the client after the upload commit. Would it revert the sequence ID increment in that case? While writing this, I can think of ways to test it empirically, but it would save me some time to hear it straight from the devs.

jeff_albion
Employee

Hi Andre,

Is this over TCP/IP or HTTP? It sounds like there's some state-tracking issue going awry if this is happening over a failed synchronization attempt.

Are you positive these are non-overlapping requests? The times suggest otherwise...
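The overlap can be checked directly from the first and last timestamps of each log excerpt. This is a minimal Python sketch under stated assumptions: the helper names `parse_time` and `find_overlaps` are hypothetical, the sync windows are taken from the two excerpts for remote ID 131 posted above, and all timestamps are assumed to fall on the same day.

```python
from datetime import datetime


def parse_time(hhmmss):
    """Parse a log timestamp such as '07:54:49' (date is ignored)."""
    return datetime.strptime(hhmmss, "%H:%M:%S")


def find_overlaps(syncs):
    """Return pairs of labels whose (start, end) windows overlap.

    syncs: list of (label, start, end) tuples for a single remote ID,
    with start and end given as 'HH:MM:SS' strings.
    """
    spans = sorted((parse_time(start), parse_time(end), label)
                   for label, start, end in syncs)
    overlaps = []
    for i, (_, end_a, label_a) in enumerate(spans):
        for start_b, _, label_b in spans[i + 1:]:
            # spans are sorted by start time, so b starts no earlier than a
            if start_b < end_a:
                overlaps.append((label_a, label_b))
    return overlaps


# First and last timestamps of the two excerpts for remote ID 131.
syncs_remote_131 = [
    ("SPID 107", "07:54:49", "07:59:01"),
    ("SPID 95", "07:55:04", "07:59:08"),
]
print(find_overlaps(syncs_remote_131))
```

With those two windows the sketch reports one overlapping pair, SPID 107 and SPID 95: the second request arrives at 07:55:04, while the first connection is not dropped until 07:59:01.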

Can you open an incident for this? We would likely need to gather additional network diagnostics to try and figure out what's going wrong in these specific circumstances.

Regards,

Jeff Albion

SAP Active Global Support

Former Member

It's over HTTP.

Yes, there are overlapping requests due to our mobile app retry logic, which waits a few seconds after a failed sync before trying again. In this particular case, our mobile app waits only two seconds before a retry, which may be too short. Do you think avoiding these overlaps could be a solution?


I will ask here internally for the opening of an incident, since I don't have the marketplace login for the company.

jeff_albion
Employee

Hi Andre,


"Do you think avoiding these overlaps could be a solution?"

Yes. We have seen issues in previous versions with a similar overlap problem, and our recommendation has always been to "back off" the sync timeout and the application retry logic to something less aggressive, which resolves the problem.

We would still be very interested, however, in understanding the root cause of why it's happening in the first place, as there is logic in the MobiLink server to try to avoid this very situation.

Do you know if your load balancer is caching HTTP requests at all?

Regards,

Jeff Albion

SAP Active Global Support

Former Member

Jeff,

I don't think there is caching, but to be certain I am consulting our Ops.

In the meantime, I wondered about the server timeout when there is a loss of connection, since it means our mobile app sync retry logic must wait at least this time before starting another sync. Looking at the logs it seems to be ~4 minutes. Is it configurable? Is there any downside from reducing it?

jeff_albion
Employee

Hi Andre,


"since it means our mobile app sync retry logic must wait at least this time before starting another sync."


Yes, precisely. We would recommend at least the timeout value plus a small "fudge factor" to ensure previous synchronizations are cleared from the system before attempting to synchronize again.
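As a rough illustration of that recommendation, a retry schedule could start at the timeout plus a margin and back off from there. This is only a Python sketch: the function name `retry_delays`, the 10-second fudge factor, the doubling factor, and the number of attempts are all assumptions chosen for the example, not values from the MobiLink documentation.

```python
def retry_delays(timeout_s, fudge_s=10.0, attempts=3, factor=2.0):
    """Return the wait (in seconds) before each retry attempt.

    The first wait is the network timeout plus a fudge factor, so the
    previous synchronization has a chance to be cleared on the server;
    each later wait grows by `factor` to back off further.
    """
    if timeout_s <= 0 or fudge_s < 0 or attempts < 0 or factor < 1:
        raise ValueError("timeout_s > 0, fudge_s >= 0, attempts >= 0, factor >= 1 required")
    base = timeout_s + fudge_s
    return [base * factor ** n for n in range(attempts)]


# With the ~4 minute (240 s) timeout observed in the logs:
print(retry_delays(240))
```

With a 240-second timeout this gives waits of 250, 500 and 1000 seconds, compared with the two seconds the app was originally using.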


"Looking at the logs it seems to be ~4 minutes. Is it configurable?"


Yes. See "timeout" in the MobiLink client network protocol options.


"Is there any downside from reducing it?"


With a lower timeout, keep-alive messages will need to be sent more often and you may miss being able to continue active synchronizations if a temporary network problem occurs. As the documentation notes, we generally don't recommend setting this value below 30 seconds.


Regards,


Jeff Albion

SAP Active Global Support

Former Member

So I can conclude that only the client controls the timeout, not the server? Therefore one server may handle multiple clients, each one with a different timeout. Is my understanding correct?

jeff_albion
Employee

Hi Andre,

Yes, this is correct. The timeout is set per-client by the client.

Jeff Albion

SAP Active Global Support

Former Member

All right then. We opened an incident as you asked, but we also took immediate action to reconfigure our retry timeouts, and we will measure the results over the next few weeks.

Jeff, thanks for the quick, clear and precise answers. Awesome job!

Answers (1)


Former Member

Hello again.

I have a new development on this subject. Even after making the changes suggested in this thread, the "sequence ID error" continued to happen. We then began to suspect that something in our app could trigger this behavior: we routinely cancel syncs after starting them. We do it by setting the stop flag described in this documentation. The problem is that the server does not seem to acknowledge this interruption, and keeps waiting for communication from the client for 240 seconds before giving up.

I don't know if we are facing a bug here. I would find it reasonable for the client to inform the server that the synchronization is being interrupted, instead of letting it time out by itself.

In our case the client starts a new sync right after canceling the previous one, but the server is still waiting for the timeout, so overlapping syncs occur, which eventually leads to the sequence ID error described here.
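One client-side workaround, pending a root-cause answer, would be to hold back the next sync until the server's wait for the cancelled one should have expired. The following Python sketch only illustrates that idea: the `SyncGate` class and its method names are hypothetical, it is not part of any UltraLite or MobiLink API, the 10-second margin is an assumed value, and the thread does not confirm that this avoids the error.

```python
import time


class SyncGate:
    """Blocks new syncs for a quiet period after a sync was cancelled or dropped."""

    def __init__(self, timeout_s, fudge_s=10.0, clock=time.monotonic):
        # Quiet period: the timeout the server waits, plus a safety margin.
        self._quiet_period = timeout_s + fudge_s
        self._clock = clock  # injectable so the gate can be tested without sleeping
        self._blocked_until = None

    def sync_abandoned(self):
        """Call when a sync ends without a clean finish (stop flag set, network drop)."""
        self._blocked_until = self._clock() + self._quiet_period

    def seconds_until_allowed(self):
        """Seconds left before a new sync may start (0.0 if allowed now)."""
        if self._blocked_until is None:
            return 0.0
        return max(0.0, self._blocked_until - self._clock())

    def may_start_sync(self):
        return self.seconds_until_allowed() == 0.0


# Usage with the 240 s wait observed on the server:
gate = SyncGate(timeout_s=240)
gate.sync_abandoned()          # app set the stop flag mid-sync
print(gate.may_start_sync())   # False until the quiet period has passed
```

The trade-off is that a user who leaves and re-enters the blocking screen may wait several minutes for the next sync, so this is a stopgap rather than a fix for the server not noticing the cancellation.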

chris_keating
Advisor

Can you provide more detail on what conditions trigger the cancelling of a sync? Is the cancellation something that can happen at any time within the sync progress logic?

I will start experimenting to see if I can reproduce this behavior using this new information.


Former Member

Hello Chris.

Our app cancels the sync whenever a certain condition is met (for example, the user enters a certain screen), and then starts the sync again when the condition is lifted. Therefore this stop flag can be set at any moment during the synchronization.

Is this enough info or do you need any specific details?