on 10-08-2015 5:23 PM
Hi community,
I just came accross this document on the Web:
I helped me understanding how data are stored but I didn't get all the points.
Have you ever read it? and do you have any comments about the content?
Based on it, I'd like to 'see by myself' how data are physically stored depending on the objects ( APL / DOL tables ; CI / NCI ; LOB...).
Could you tell me the command I should use if I want to see, for a 'structure' (table / index / lob...) the allocation unit / oam / page-chain?
i found 'dbcc listoam' but I believe there are others.
Thanks in advance.
Simon
Unfortunately, that document is full of a lot of inaccuracies as well..... Segments really make sense when using different classes of storage - e.g. if you know for example that you have tiered storage and can reserve tiers without worrying about migration - then segmentes are useful for specifying locations for partitions or indexes - especially text indexes. However, unless you can control the underlying storage locations, using segments can be an exercise in frustration - especially if the storage system automatically moves data between storage tiers.
One of many fallacies in the paper was that storage fragmentation (and by inference the need for segments) is driven by the fact that table and index storage is intermingled. This is actually false. When any object is defined for storage, we preallocate a set number of extents for precisely that object/index - by default (and minimum I think) this is 2. However, users often find that larger is better as it can reduce hits on the OAM during high txn rates. An APF can be of *any* size - assuming you have a corresponding cache of that size (e.g. 16K), and a large IO will also be the size of any cache pool larger than page size. However, a large IO is best thought of as an extent IO - as we will only read an extent at a time and no larger chunks and all the large IO is done within the boundaries of an extent - meaning a large IO will never span across more than one extent. Given an extent allocation scheme and the number of preallocated extents, this means that fragementation of multiple tables or multiple indexes has no relevance to whether large IO is useful or not. It *DOES* have an impact on IO concurrency and device contention - but that largely depends on OS, volume manger, storage subsystem, etc. This is another area where segments can help as they can distribute writes to different OS devices and thus bypass IO concurrency limits at the device/volume level.
And anything mentioning DOL (DPL or DRL) is highly suspect in that paper. Reading through it, I can only conclude that while the author is familiar with the basics of row forwarding, that is about all. Description of placement indexes (of which a clustered index is precisely that) is more than slightly inaccurate as a DOL table with a clustered index will NEVER operate as a heap unless the clustered index is on monotonic key (such as dates - e.g. txn dates) or on really small tables when the placement rules all result in the same allocation unit or extent due to the small size.
My recommendation would be to start with the ASE P&T Physical Database Tuning doc and then play around with dbcc page() with some tests to see what is happening. However, temper ideals with a cold dose of reality. Yes, city - a required field for address - may be required and therefore tempting according to some to make it char(30) - the reality is that varchar(30) is likely better - even if not null as you can save quite a bit of storage and reduce IOs over 100's of millions of rows. And if a column might have an unknown value - then a "null" column is perfectly acceptable vs. being silly and forcing a not null and putting in spaces just for an unknown value.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi all,
I did a little test and the results are not in line with what I expected -- unless I did something.
The test below was executed on "Adaptive Server Enterprise/15.0.3/EBF 17686 ESD#1.1 RELSE/P/Sun_svr4/OS 5.8/ase1503/2681/64-bit/FBO/Thu Aug 20 14:20:57 2009"
I have a table named TABLE_A
select object_id('TABLE_A') --> 1737782963
select * from syspartitions where name like 'TABLE_A%'
name | indid | id | partitionid | segment | status | datoampage | indoampage | firstpage | rootpage | data_partitionid | crdate | cdataptnname |
TABLE_A_1737782963 | 0 | 1737782963 | 1737782963 | 1 | 2 | 72680 | 0 | 72681 | 101682 | 0 | 2015-05-03 15:44:16 | (null) |
dbcc listoam(5,1737782963,0)
Code: 0 --- -----------------------------------------------------------------------------
Code: 0 --- Partition id: 1737782963 indid: 0 prevpg: 72680 nextpg: 72680
Code: 0 --- OAM pg cnt: 1 Entry cnt: 3
Code: 0 --- Rows: 126 Rows Per pg: 0
Code: 0 --- Used pgs: 19 Unused pgs: 5
Code: 0 --- Attribute entries: 15
Code: 0 --- OAM status bits set: (0x8000 (PG_OAMPG), 0x0008 (PG_OAMATTRIB), 0x0004 (PG_OAMSORT))
Code: 0 --- LAST SCANNED OAM PAGE: 0
Code: 0 --- ALLOCATION HINTS :
Code: 0 --- 72680 0 0 0
Code: 0 --- 0 0 0 0
Code: 0 --- 0 0 0 0
Code: 0 --- 0 0 0
Code: 0 --- BCP LAST PAGE: 72681
Code: 0 --- OAM pg # 1: 72680 has the following 3 entries (allocpg:used/unused):
Code: 0 ---
Code: 0 --- [ 0] 72448: 8/ 0 73984: 8/ 0 101632: 3/ 5
Code: 0 ---
Code: 0 --- ---- End of OAM chain for partition 1737782963 ----
From this output my understanding is:
dbcc page(5,72448,1,0)
Code: 0 ---
Code: 0 --- Page read from disk.
Code: 0 ---
Code: 0 --- BUFFER:
Code: 0 --- Buffer header for buffer 0x1a86ea000 (Mass head)
Code: 0 --- page=0x1a86e9000 bdnew=0x0 bdold=0x0 bhash=0x0
Code: 0 --- bmass_next=0x0 bmass_prev=0x0 bdbid=5
Code: 0 --- bvirtpg= [ 0x1a86ea0b8 vpgdevno=8 vpvpn=144896 vdisk=0x15e8e4090 ]
Code: 0 --- bmass_head=0x1a86ea000 bmass_tail=0x1a86ea000
Code: 0 --- bcache_desc=0x1a86f2ae8 (cache name='default data cache')
Code: 0 --- bpool_desc=0x0 bdbtable=0x0
Code: 0 --- Mass bkeep=0 Mass bpawaited=0 Mass btripsleft=0 Mass btripsleft_orig=0
Code: 0 --- bmass_size=4096 (4K pool) bunref_cnt=0
Code: 0 --- bmass_stat=0x0800 (0x00000800 (MASS_NOTHASHED))
Code: 0 --- bbuf_stat=0x0 (0x00000000)
Code: 0 --- Buffer blpageno=72448 bpg_size=4k Mass blpageno=72448 (Buffer slot #: 0)
Code: 0 --- bxls_pin=0x0 bxls_next=0x0 bspid=0
Code: 0 --- bxls_flushseq=0 bxls_pinseq=0 bcurrxdes=0x0
Code: 0 --- Latch and the wait queue:
Code: 0 --- Latch (address: 0x1a86ea030)
Code: 0 --- latchmode: 0x0 (FREE_LATCH)
Code: 0 --- latchowner: 0
Code: 0 --- latchnoofowners: 0
Code: 0 --- latchwaitq: 0x0 latchwaitqt: 0x0
Code: 0 ---
Code: 0 --- Latch wait queue:
Code: 0 ---
Code: 0 --- PAGE HEADER:
Code: 0 --- Page header for page 0x1a86e9000
Code: 0 --- pageno=72448 dealloc_count=5636 ptnid=99 allocation_page dbid=5 timestamp=0000 00000001, segmap=0x00000003 (0x00000002 (SEG_DEFAULT), 0x00000001 (SYS_SEGMENT))
Code: 0 --- page status bits: 0x300 (0x0200 (PG_AP_PTNID_UPGD_CMPLT), 0x0100 (PG_AP_HAS_PTNID))
Code: 0 ---
Code: 0 --- Allocation Page Data:
Code: 0 --- pallocreserved:
Code: 0 --- 1a86e9020 ( 0): 00000000 00000000 00000000 00000000 ................
Code: 0 --- 1a86e9030 ( 16):
Code: 0 ---
Code: 0 --- pextents[]:
Code: 0 --- no. start page --ptnid--- rsv fwd spre --objid--- alloc deall indid status
Code: 0 --- 0: 72448 803099587 0x00 0x00 0x00 803099587 0x03 0x00 2 0x01
Code: 0 --- 1: 72456 1857347296 0x00 0x00 0x00 1857347296 0x03 0x00 0 0x00
Here I got lost with the output of dbcc page. The extent starting at page 72448 holds information related to object 803099587 and it looks to be an index as indid is 2.
I expected objectid to be 1737782963 and indid 0.
Am I misunderstanding anything?
Thanks all and have a good week-end.
Simon
What you are seeing here is an allocation page. Every 256 pages is an allocation unit and the first page of the allocation unit is the allocation page itself (object id 99), which holds 32 extent records that record which object the 32 extents in the allocation unit are allocated to, and which pages are in use.
The first extent is called a "short extent" because it only has 7 pages available to hold object days, the first page of the first extent is used up by this allocation page.
If the page number is evenly divisible by 256, it is an allocation page.
Goof afternoon,
After a long break working on 'non-ASE' topic, I now have a bit of time to continue on this topic.
I'm curious to know how the linkage is done in an extent when you read data from a table?
Here's the example I ran:
I have a table with one field (numeric(10,6)) and containing 1000 rows
1> sp_help SIMON
2> go
Name Owner Object_type Object_status Create_date
----- ----- ----------- -------------------- -------------------
SIMON dbo user table keep first text page Apr 1 2016 11:31AM
(1 row affected)
Column_name Type Length Prec Scale Nulls Not_compressed Default_name Rule_name Access_Rule_name Computed_Column_object Identity
----------- ------- ------ ---- ----- ----- -------------- ------------ --------- ---------------- ---------------------- ----------
numero numeric 6 10 6 0 0 NULL NULL NULL NULL 0
Object does not have any indexes.
No defined keys for this object.
name type partition_type partitions partition_keys
----- ---------- -------------- ---------- --------------
SIMON base table roundrobin 1 NULL
partition_name partition_id compression_level pages row_count segment create_date
--------------- ------------ ----------------- ----- --------- ------- -------------------
SIMON_925968947 925968947 none 4 1000 default Apr 1 2016 11:31AM
1> select count(*) from SIMON
2> go
-----------
1000
Then I checked table syspartitions to know where the first datapage is
1> select datoampage,firstpage,rootpage from syspartitions where name='SIMON_925968947'
2> go
datoampage firstpage rootpage
------------- ------------- -------------
16472 16473 16473
Then I execute dbcc page on page 16473 and I was expecting to see a link to the next page in field nextpg but this is not the case
Page found in Cache: default data cache. Cachelet: 2
BUFFER:
Buffer header for buffer 0x7fffd7612a000
page=0x7fffd76129000 bdnew=0x7fffe7a776f40 bdold=0x7fffe7a776f40 bhash=0x7fffe7aaabad0
bmass_next=0x7fffe7d120e48 bmass_prev=0x7fffe7aaabad0 bdbid=2
bvirtpg=[ 0x7fffd7612a0c8 vpgdevno=7 vpvpn=29874 vdisk=0x7fffc8ae15d78 ]
bmass_head=0x7fffe7a776f40 bmass_tail=0x7fffe7a776f40
bcache_desc=0x7fffd7612ff38 (cid=0 cache name='default data cache')
bpool_desc=0x7fffd76139d00 bdbtable=0x7fffd0082ce80
Mass bkeep=0 Mass bpawaited=0 Mass btripsleft=0 Mass btripsleft_orig=0
bmass_size=4096 (4K pool) bunref_cnt=0
bmass_stat=0x6001010(0x0000000004000000 (MASS_DONT_DISCARD), 0x0000000002000000 (MASS_WRITE_BY_HK), 0x0000000000001000 (MASS_HASHED), 0x0000000000000010 (MASS_INWASH))
bbuf_stat=0x1 (0x00000001 (BUF_PG_REF))
Buffer blpageno=16473 bpg_size=4k Mass blpageno=16473 Mass blocallockkeep=0 bxls_pin=0x0 bxls_next=0x0 bspid=6
bxls_flushseq=0 bxls_pinseq=0 bcurrxdes=0x7fffd1260d710
Latch and the wait queue:
Latch (address: 0x7fffd7612a038)
latchmode: 0x0 (FREE_LATCH)
latchstatus: 0x0 latchowner: -1
latchnoofowners: 0
latchwaitq: 0x0 latchwaitqt: 0x0
latchlastfileindex=0 (nofilename) latchlastlineno=0
Latch wait queue:
PAGE HEADER:
Page header for page 0x7fffd76129000
pageno=16473 nextpg=0 prevpg=0 ptnid=925968947 timestamp=0000 04fccfea
nextrno=289 level=0 indid=0 freeoff=3512 minlen=12
page status bits: 0x881 (0x0800 (PG_XHEADER), 0x0080 (PG_FIXED), 0x0001 (PG_DATA))
second set of page status bits (dol_stat2): 0x02 (0x02 (PG2_DOL_DATAPG))
dol_pprivstat status bits: 0x01 (0x01 (DOLPG_PRIV_COMMITTED))
pagesize=4096, page version=1, ncfs=0, ndeleted=0, insert free space=4
pagetailts=0xcfea
So I'm wondering how ASE knows it has to read the others pages to retrieve all the information in case I do a 'select * from SIMON'?
Thanks for your comments.
Simon
Hi Simon,
For a table using the DOL lock scheme, table scans are done using an OAM scan. The OAM for the data pages contain an entry for each allocation unit that the data has extents allocated on. Each allocation unit has an allocation page which contains 32 extent records that record which object + index the extent is allocated to and a bit-map showing which of the 8 pages in the extent are currently in use (i.e. have at least one row on it). The scan uses those bitmaps to determine which pages need to be read to retrieve all the information.
If you create the table with the APL lock scheme, I think you will see the page linkages you are expecting.
Cheers,
-bret
Straight out the the internals course materials Bret. Cool! Jean-Pierre
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
That doc is the fine work of Derek Asirvadem. His email is in the doc if you dare to ask any questions. Derek's stuff is always a fascinating read. Mixed in with what is some really good info, seems to be a gratuitous, relentless rant against DOL tables and "Placement Indexes". Aside from that, some really good stuff in that doc. I'm curious what "points" did you not quite get? Was it one of these?:
- "During the discussion of logical or physical DataStructures, non-technical terms such as 'table','base table' and 'object-index pair' are too ambiguous to be meaningful: those who use them are committed to your continued confusion."
- "The use of the term "clustered" Index in relation to DOL tables is therefore incorrect, confusing, and fraudulent."
- "Sites that use such tables generally do not use Segments, and thus all DataStructures in the entire database is fragmented across the single default Segment. Florists call this "striped", and wonder why it is slow; engineers call it retarded."
- "Based on the naïve belief that Evangelists Preach the Gospel, while ignoring the fact that Evangelism is a marketing concept, and in substitution of genuine knowledge and technical examination: Myth that the DOL Placement Index (unfortunately addressed via the "clustered" syntax), is the same as the Clustered Index."
or my personal favorite:
- "A man and a woman are meant to be married; together they achieve more than each achieves separately. Implementing APL tables without a Clustered Index, is analogous to a divorced couple. Likewise, there is no fidelity in non-unique Clustered Indices "
Ah yes. That's all pure Derek.
Go ahead, email him your questions. Let us know how that goes for you
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Simon,
Thanks for the link - a really good detailed description.
Not sure how keen you are to see the actually underlying representation of data but you could ...
create a database using an underlying file (not a raw device)
create a table in this database
populate the table with a few rows with a string you can search for
shutdown the database
do an hex/octal dump on the database and search for the data you've inserted
It'll be difficult to get much out of this for the indexing location - but it might help.
I've only ever used it to understand how data rows are stored on disk but am sure its possible if you put in enough effort and time.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Mike,
My idea is just to do some tests and retrieve those physical information so I can 'better' understand the document.
The purpose behind all this is that today all my tables are created on a default segment and I want to to test and show to the developer/dba that we should really think of creating separated segments as explained in order to speed up the operations.
Best
Simon
User | Count |
---|---|
90 | |
10 | |
10 | |
10 | |
7 | |
7 | |
6 | |
5 | |
4 | |
3 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.