A new file system feature called "Concurrent I/O" (CIO) was introduced in the Enhanced Journaling File System (JFS2) in AIX 5L™ version 5.2.0.10, also known as maintenance level 01 (announced May 27, 2003). This new feature improves performance for many environments, particularly commercial relational databases.
If you are using Oracle 10g on JFS2 filesystems with the parameter FILESYSTEMIO_OPTIONS set to "SETALL", as recommended by SAP (SAP Note 830576), CIO is already used automatically.
If you want to know more about asynchronous I/O, direct I/O or concurrent I/O, please check the links in the references; they include a white paper published by IBM.
In this blog post I will focus only on the filesystems for the online redo log files. The filesystems origlogA, origlogB, mirrlogA and mirrlogB are the important ones in a standard SAP environment.
The logical volumes and their filesystems in my test environment:
shell> lsfs -cq /oracle/<SID>/origlogA
#MountPoint:Device:Vfs:Nodename:Type:Size:Options:AutoMount:Acct
/oracle/<SID>/origlogA:/dev/lvorigA_<SID>:jfs2::<SID>:3932160:cio,rw:no:no
(lv size 3932160:fs size 3932160:block size 4096:sparse files yes:inline log no:inline log size 0:EAformat v1:Quota no:DMAPI no:VIX no)
shell> lsfs -cq /oracle/<SID>/origlogB
#MountPoint:Device:Vfs:Nodename:Type:Size:Options:AutoMount:Acct
/oracle/<SID>/origlogB:/dev/lvorigB_<SID>:jfs2::<SID>:3932160:cio,rw:no:no
(lv size 3932160:fs size 3932160:block size 4096:sparse files yes:inline log no:inline log size 0:EAformat v1:Quota no:DMAPI no:VIX no)
shell> lsfs -cq /oracle/<SID>/mirrlogA
#MountPoint:Device:Vfs:Nodename:Type:Size:Options:AutoMount:Acct
/oracle/<SID>/mirrlogA:/dev/lvmirrA_<SID>:jfs2::<SID>:3932160:cio,rw:no:no
(lv size 3932160:fs size 3932160:block size 4096:sparse files yes:inline log no:inline log size 0:EAformat v1:Quota no:DMAPI no:VIX no)
shell> lsfs -cq /oracle/<SID>/mirrlogB
#MountPoint:Device:Vfs:Nodename:Type:Size:Options:AutoMount:Acct
/oracle/<SID>/mirrlogB:/dev/lvmirrB_<SID>:jfs2::<SID>:3932160:cio,rw:no:no
(lv size 3932160:fs size 3932160:block size 4096:sparse files yes:inline log no:inline log size 0:EAformat v1:Quota no:DMAPI no:VIX no)
As you can see, the filesystems were created with the default block size of 4096 bytes and are mounted with the CIO option.
Let's cross-check this against Oracle's access method:
SQL> show parameter filesystemio_options
NAME TYPE VALUE
------------------------------------ ----------- ---------
filesystemio_options string SETALL
shell> lsof +fg /oracle/<SID>/origlogB
COMMAND PID USER FD TYPE FILE-FLAG DEVICE SIZE/OFF NODE NAME
oracle 1224804 <SID>adm 20u VREG R,W,CIO,DSYN,LG;CX 37,7 943718912 4 /oracle/<SID>/origlogB (/dev/lvorigB_<SID>)
oracle 1224804 <SID>adm 24u VREG R,W,CIO,DSYN,LG;CX 37,7 943718912 5 /oracle/<SID>/origlogB (/dev/lvorigB_<SID>)
The cross-check between the Oracle setting and the access flags on the online redo log files (note CIO in the FILE-FLAG column) fits.
So what can be "wrong" with the configuration above? To understand the problem you need to know that Oracle writes redo information in 512-byte blocks. If you are using direct I/O or concurrent I/O, the JFS2 block size must match the requested I/O size to avoid demoted I/O. IBM describes demoted I/O as a "return to normal I/O after a direct I/O failure".
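You can spot the mismatch directly in the `lsfs -q` output shown above. The following sketch does this mechanically: `check_blksize` is a hypothetical helper (not an AIX tool) that parses the block-size field from an `lsfs -q` line and warns when it is larger than Oracle's 512-byte redo writes:

```shell
# Hypothetical helper: read lsfs -q output on stdin and flag filesystems
# whose JFS2 agblksize exceeds Oracle's 512-byte redo write size.
check_blksize() {
  awk -F'block size ' '/block size/ {
    split($2, a, ":");                # a[1] holds the agblksize value
    if (a[1] + 0 > 512)
      print "WARNING: agblksize " a[1] " > 512 - demoted I/O possible";
    else
      print "OK: agblksize " a[1];
  }'
}

# Example with the 4096-byte filesystem from the listings above:
echo "(lv size 3932160:fs size 3932160:block size 4096:sparse files yes)" | check_blksize
# → WARNING: agblksize 4096 > 512 - demoted I/O possible
```

This only reads the `lsfs` text, so it is safe to run against all Oracle mount points in one go, e.g. `for fs in origlogA origlogB mirrlogA mirrlogB; do lsfs -q /oracle/<SID>/$fs | check_blksize; done`.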
So let's check whether our system has some demoted I/Os:
shell> trace -aj 59B,59C
shell> trcstop
shell> trcrpt -o demoted_io.check
shell> grep demoted demoted_io.check
59B 0.001330218 0.015755 JFS2 IO dio demoted: vp = F10001006279B7F8, mode = 0001, bad = 0002, rc = 0000, rc2 = 0000
59B 0.001411175 0.018402 JFS2 IO dio demoted: vp = F1000100627AB7F8, mode = 0001, bad = 0002, rc = 0000, rc2 = 0000
59B 0.064204179 0.008152 JFS2 IO dio demoted: vp = F10001006279B7F8, mode = 0001, bad = 0002, rc = 0000, rc2 = 0000
...
...
59B 0.985171921 0.001468 JFS2 IO dio demoted: vp = F1000100627AB7F8, mode = 0001, bad = 0002, rc = 0000, rc2 = 0000
59B 1.005359694 0.030008 JFS2 IO dio demoted: vp = F1000100627AB7F8, mode = 0001, bad = 0002, rc = 0000, rc2 = 0000
59B 1.017411856 0.011505 JFS2 IO dio demoted: vp = F10001006279B7F8, mode = 0001, bad = 0002, rc = 0000, rc2 = 0000
This trace covers roughly 3 seconds, and it already shows a number of demoted I/O calls on the system.
To avoid demoted I/O calls you need to create the JFS2 filesystems with a block size of 512 bytes. Keep in mind that this is only necessary for the online redo log filesystems. On my test system I shut down the database, moved the online redo logs to another filesystem, deleted and recreated the redo log filesystems, and moved the online redo log files back. Of course you can also do this online by adding additional redo log groups.
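A sketch of how one of these filesystems can be recreated with `crfs` and the `agblksize` attribute (the volume group name `datavg` and the size are placeholders for my environment; adjust them to yours and verify the flags against your AIX level before running anything):

```shell
# Unmount and remove the old 4096-byte filesystem
# (-r also removes the mount point)
umount /oracle/<SID>/origlogA
rmfs -r /oracle/<SID>/origlogA

# Recreate it with a 512-byte block size and the CIO mount option;
# -g datavg and -a size are placeholders for this example
crfs -v jfs2 -g datavg -m /oracle/<SID>/origlogA \
     -a size=3932160 -a agblksize=512 -a options=cio -A no
mount /oracle/<SID>/origlogA
```

Repeat the same steps for origlogB, mirrlogA and mirrlogB before moving the online redo log files back.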
The newly created logical volumes and their filesystems in my test environment:
shell> lsfs -cq /oracle/<SID>/origlogA
#MountPoint:Device:Vfs:Nodename:Type:Size:Options:AutoMount:Acct
/oracle/<SID>/origlogA:/dev/lvorigA_<SID>:jfs2::<SID>:3932160:cio,rw:no:no
(lv size 3932160:fs size 3932160:block size 512:sparse files yes:inline log no:inline log size 0:EAformat v1:Quota no:DMAPI no:VIX no)
shell> lsfs -cq /oracle/<SID>/origlogB
#MountPoint:Device:Vfs:Nodename:Type:Size:Options:AutoMount:Acct
/oracle/<SID>/origlogB:/dev/lvorigB_<SID>:jfs2::<SID>:3932160:cio,rw:no:no
(lv size 3932160:fs size 3932160:block size 512:sparse files yes:inline log no:inline log size 0:EAformat v1:Quota no:DMAPI no:VIX no)
shell> lsfs -cq /oracle/<SID>/mirrlogA
#MountPoint:Device:Vfs:Nodename:Type:Size:Options:AutoMount:Acct
/oracle/<SID>/mirrlogA:/dev/lvmirrA_<SID>:jfs2::<SID>:3932160:cio,rw:no:no
(lv size 3932160:fs size 3932160:block size 512:sparse files yes:inline log no:inline log size 0:EAformat v1:Quota no:DMAPI no:VIX no)
shell> lsfs -cq /oracle/<SID>/mirrlogB
#MountPoint:Device:Vfs:Nodename:Type:Size:Options:AutoMount:Acct
/oracle/<SID>/mirrlogB:/dev/lvmirrB_<SID>:jfs2::<SID>:3932160:cio,rw:no:no
(lv size 3932160:fs size 3932160:block size 512:sparse files yes:inline log no:inline log size 0:EAformat v1:Quota no:DMAPI no:VIX no)
The results were compared on Oracle Database 10.2.0.2.0 and AIX 5300-06-03-0732 with SAN disks on an IBM DS8000.
The test scenario
I have written a PL/SQL script that performs parallel inserts into 10 different test tables (2,000,000 rows per table) with a commit after every 2 rows per table. This parallel load is much heavier than what you will face in a productive environment, but the simulation makes the difference very visible. I take an AWR snapshot before and after the load simulation; each snapshot covers roughly 5 minutes. I run this scenario three times to get more values to compare.
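A minimal sketch of the kind of load generator used here (the table name, connect string and row count are illustrative placeholders, not my original script). The frequent small commits are what stress LGWR and make the log file sync waits visible:

```shell
# Illustrative only: run a commit-heavy insert loop via sqlplus
# (connect string and object names are placeholders)
sqlplus -s system/password <<'EOF'
CREATE TABLE redo_test (id NUMBER, pad VARCHAR2(100));
BEGIN
  FOR i IN 1 .. 100000 LOOP
    INSERT INTO redo_test VALUES (i, RPAD('x', 100, 'x'));
    IF MOD(i, 2) = 0 THEN
      COMMIT;  -- a commit every 2 rows forces a log file sync each time
    END IF;
  END LOOP;
  COMMIT;
END;
/
EOF
```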
With a JFS2 block size of 4096
Log file sync
| Run | Total waits | Total wait time (s) | Average wait (ms) |
|---|---|---|---|
| First | 2,023 | 335 | 166 |
| Second | 3,555 | 186 | 52 |
| Third | 4,194 | 274 | 65 |
Log file parallel write
| Run | Total waits | Total wait time (s) | Average wait (ms) |
|---|---|---|---|
| First | 8,085 | 253 | 31 |
| Second | 10,272 | 232 | 23 |
| Third | 15,189 | 235 | 15 |
With a JFS2 block size of 512
Log file sync
| Run | Total waits | Total wait time (s) | Average wait (ms) |
|---|---|---|---|
| First | 4,438 | 8 | 2 |
| Second | 12,563 | 25 | 2 |
| Third | 4,171 | 10 | 2 |
Log file parallel write
| Run | Total waits | Total wait time (s) | Average wait (ms) |
|---|---|---|---|
| First | 178,120 | 204 | 1 |
| Second | 175,998 | 203 | 1 |
| Third | 164,397 | 202 | 1 |
The performance of "log file sync" and "log file parallel write" improves drastically: the average waits drop from tens of milliseconds to 1-2 ms.
In a normal environment the variability of log file sync / log file parallel write times is largely eliminated and the values become stable (barring hardware or OS problems).