on 05-11-2016 9:02 AM
Hi
(ASE 15.7 sp60 running on windows 2008)
We are in the process of aggregating many similar databases to one big database on a new server.
Because this is time consuming, we are planing to do this in steps like this:
1. Create the new database, tables etc (but not index, triggers, constraints)
2. Bcp in some of the databases (using fast bcp)
3. create the indexes, triggers, constraints
4. Let the users in and use the new database for some days
5. Drop indexes, triggers and constraints)
6. Bcp in more databases (using fast bcp)
7. create the indexes, triggers, constraints
8. Let the users in and use the new database for some days
(continue until all databases have been bcp-ed into the new database)
Now, the problem.
When we test this, we get good speed on the first bcp-jobs (2.)
Typical we see 1000+ rows / second
BUT after creating indexes etc, and dropping them we get much slower speeds for the rest of the bcp-jobs (6.)
On the same table, we now get 80 rows / second.
What are we doing wrong?
How can I find out more about what is taking so long?
Regards
Pål
Hi again
Thanks Andrew, Mark and Brett for trying to help.
We have now recieved help from Ruairi Prendiville. He found that if we increased the size of Network packages from 2048 to 65024 and the number of pre-allocated extents from 2 to 32, we got very good speeds again. We also changed our device usage from one 700GB device (located on SAN with extreem IO flashdisks), to chunks of 50GB devicefiles (located on the same SAN). I don't know if the device change made mutch diffence, but we are now happy with the speed of bcp in again.
Regards
Pål
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Today we have two more findings.
1. The decrease in speed seems to come only if we create and drop clustered indexes, and then bcp-in to a table on the same device. If we bcp-in to a table on another device, speed is not suffering so mutch.
2. A collection of sysmonwatis.
SELECT msw.WaitTime,mwei.Description FROM monSysWaits msw inner join monWaitEventInfo mwei on msw.waitEventId = mwei.WaitEventId order by msw.WaitTime des
After starting some threads with bcp-in I ran this command, and after a few minutes I ran it again. This is the differences of the two.
Description | WaitTime |
---|---|
xact coord: pause during idle loop | 720 |
hk: pause for some time | 276 |
waiting for incoming network data | 1028 |
until an engine has been offlined wait | 180 |
distributed xact co-ordinator idle loop | 180 |
checkpoint process idle loop | 162 |
Wait until heartbeat or check interval expires | 12 |
waiting for client connection request | 236 |
waiting on run queue after sleep | 417 |
waiting on run queue after yield | 188 |
waiting for last i/o on MASS to complete | 59 |
waiting for network send to complete | 41 |
wait for i/o to finish after writing last log page | 20 |
wait to acquire latch | 19 |
waiting for buf write to complete before writing | 12 |
waiting for a lock | 15 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
I suspect the decrease in performance is due to the way ASE allocates new extents to objects when doing fast bcp or creating indexes.
The allocation process uses the Global Allocation Map (GAM) structure that is stored in the sysgams system table. Sysgams isn’t a normal table. It is essentially a large bitmap with one bit per allocation unit (block of 256 pages, further broken down into 32 extents). If the bit is on, all 32 extents on the allocation unit are allocated, if the bit is off, at least one extent is free.
There are two types of extents, I call them “short” and “long”. Each allocation unit contains 32 extents. The first page of each allocation unit contains the allocation page itself, and holds information on which objects each of the 32 extents on the allocation unit are allocated too, which pages are actually in use, and which of OAM page is tracking each extent.
Because the first page is used for the allocation page, the first extent is “short”, containing only 7 pages for table data. The remaining 31 extents are all “long”, containing 8 pages for table data.
Because of internal optimizations, “Fast” bcp and Create Index can only allocate long extents. In a database that has recently experienced a lot of BCP and/or Index creation, the result can be a large number of allocation units that have all 31 long extents in use but the short extent free.
As you bcp into ASE and fill the current extent, ASE goes looking for a new free extent. It looks first on the same allocation unit the current extent was on. Then it scans the OAM looking for an allocation unit with a free extent (an OAM hint causes the search here to start where the previous search ended). If the OAM search comes up empty, the GAM is used to find an allocation unit with a free extent. In this state there are many allocation units that the GAM shows as having at least one free extent, however upon reading in the allocation page ASE finds it has only the unusable free extent free, so the next allocation unit that isn’t full is read in and the process is repeated until it finally reaches an allocation unit that has a free long extent. The GAM search always starts at the beginning of the database, and big bcp jobs need to allocate a lot of new extents, so the same all-but-full allocation pages keep getting read over and over as more extents are needed.
Typically this problem is not seen because other types of DML activity (normal inserts, expanding updates, etc.) are perfectly happy to use the short extent and as soon as they do these all-but-full allocation units become full, the GAM bit is cleared, and allocation doesn’t look at that allocation unit again (until something deallocates an extent). But the problem is seen at times when there are heavy consumers of the long extents.
A workaround for this issue is to temporarily drop the default segment from the device fragments early in the database that have already become full – the GAM search is limited to allocation units that are part of the object’s segment.
Alternatively, create a new segment on an unused device, use sp_placeobject to cause new allocations for that object to be made on the new segment. However, this is more likely to run out of space then the previous method of dropping the default segment on full devices.
You can test for this being the issue by issuing a number of DBCC STACKTRACE commands on the spids doing the bcp in, if most of the stacks show functions like pggam_getfree_lowhigh() and pg__allocate_extents() this is probably the issue.
-bret
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hm. That's an interesting note. There are a couple of things that puzzle me in it.
Hi
We see perhaps a small decrease in speed when creating indexes. But not at all the same decrease as bcp-in.
Also, if we never create the clustered indexes, speed is fine all the way with all databases. It is only when clustered indexes is beeing created and dropped inbetween bcp-in commands we get the slow speeds.
We have done some testing with sp_placeobject.
We find that when we place the table (and the Text/Image Index for that table !!) to another segment with another device, we get mutch better performance.
But the performance is still not as good as before we created and dropped the clustered indexes.
Why is this?
Shouldn't be happening. You can collect monSysWaits for both and compare. I do a lot of BCP of late (fast & slow). Speed should be consistent.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
hi again
I have been away, so sorry for late reply.
What we have found since last time is that it looks like this has something to do with clustered indexes. After we drop and create the clustered indexes, bcp in starts slowing down.
So my question is: what kind of side effect does it have to create and drop clustered indexes?
It looks like ASE is still trying to put the rows in "right" order even though the clustered index is dropped...
Regards
Pål
User | Count |
---|---|
81 | |
10 | |
10 | |
8 | |
7 | |
7 | |
7 | |
6 | |
5 | |
5 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.