on 09-30-2013 2:08 PM
I am hitting a strange error when trying to add a new redo logfile group to an Oracle database. I have managed to add new logfile groups to QA copies of this database and other similarly installed Production ECC6 systems. However, on this one Production SAP system I get the following output returned from the command to add the new logfile group:
SQL> ALTER DATABASE ADD LOGFILE GROUP 1 ('/oracle/PRD/origlogA/log_g01m1.dbf', '/oracle/PRD/mirrlogA/log_g01m2.dbf') SIZE 2047M;
ALTER DATABASE ADD LOGFILE GROUP 1 ('/oracle/PRD/origlogA/log_g01m1.dbf', '/oracle/PRD/mirrlogA/log_g01m2.dbf') SIZE 2047M
*
ERROR at line 1:
ORA-19510: failed to set size of 4192256 blocks for file
"/oracle/PRD/origlogA/log_g01m1.dbf" (block size=512)
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 22: Invalid argument
Additional information: 4
I have checked and rechecked that the permissions on the raw devices and their links are correct and have also verified that the size of the raw devices is 2Gb (2048M) and that they are clean with a 'dd' read of them. I can even add one of the files to the database successfully as a new datafile to an existing tablespace. But for some reason I cannot get them added as a redo logfile group to this instance.
Can anyone tell me what might be causing this error?
In particular, what is the signal from the AIX layer telling me?
I'd be grateful of any suggestions!
Kind regards,
Craig Stewart
I would stop the database and then unmount and mount the file systems back to see whether it helps.
Regards
RB
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
From the error, I see possible reasons -
1) Check file system space, if it has sufficient space to add log group.
2) Check command format, You have given - '/oracle/PRD/origlogA/log_g01m1.dbf' and in error it has "/oracle/PRD/origlogA/log_g01m1.dbf"
Shivam
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Shivam,
The raw devices are pre-created on the OS and have been checked as being the correct size so space is not an issue.
The exact same commands worked successfully adding the raw devices as redolog groups on the QA copy of the Production system, so I am confident that this problem is NOT a syntax issue with my 'ALTER DATABASE' command.
Regards, Craig
Hi Craig,
> Can anyone tell me what might be causing this error?
Well, we need some more information. Please provide the following output of the following commands.
shell> lsattr -EH -l <disk_for_new_redolog_group>
shell> lsattr -EH -l <disk_of_a_current_redolog_group>
shell> lsattr -EH -l aio0 (not quite sure about this one as i do not have any AIX 7.1 right here)
Please post also the links (or better said the i-nodes?) for the redo log files.
In addition please provide the truss output by creating the redo log group.
shell> truss -o /tmp/truss.out -p <PID>
> In particular, what is the signal from the AIX layer telling me?
The signals are sometimes misleading and it seems like some part of the additional information section is ripped anyway.
By the way which Oracle version are you currently running (4-th digit version information)?
Regards
Stefan
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hello Stefan,
Thanks for your help with this one.
First, of course, the Oracle version. This database was upgraded to 11.2.0.3 with SBP 201303 in June this year. This is the first time we've tried adding new redo logfile groups since then. However, the QA system which was similarly upgraded has not had the same problem and accepts the same redo logfile groups without any error.
The output of the commands you list are as follows:
01.shell> lsattr -EH -l hdisk22 <<disk_for_new_redolog_group
attribute value description user_settable
PCM PCM/friend/sddpcm PCM True
PR_key_value none Reserve Key True
algorithm load_balance Algorithm True
clr_q no Device CLEARS its Queue on error True
dist_err_pcnt 0 Distributed Error Percentage True
dist_tw_width 50 Distributed Error Sample Time True
flashcpy_tgtvol no Flashcopy Target Lun False
hcheck_interval 60 Health Check Interval True
hcheck_mode nonactive Health Check Mode True
location Location Label True
lun_id 0x4000401b00000000 Logical Unit Number ID False
lun_reset_spt yes Support SCSI LUN reset True
max_coalesce 0x40000 Maximum COALESCE size True
max_transfer 0x40000 Maximum TRANSFER Size True
node_name 0x5005037608ffc3c1 FC Node Name False
pvid 00c05b8bfa6f23af0000000000000000 Physical volume identifier False
q_err yes Use QERR bit True
q_type simple Queuing TYPE True
qfull_dly 2 delay in seconds for SCSI TASK SET FULL True
queue_depth 20 Queue DEPTH True
recoverDEDpath no Recover DED Failed Path True
reserve_policy no_reserve Reserve Policy True
retry_timeout 120 Retry Timeout True
rw_timeout 60 READ/WRITE time out value True
scbsy_dly 20 delay in seconds for SCSI BUSY True
scsi_id 0x350900 SCSI ID False
start_timeout 180 START unit time out value True
timeout_policy fail_path Timeout Policy True
unique_id 200X12YZ151001B07270190003XXXfcp Device Unique Identification False
ww_name 0x50050763081843c1 FC World Wide Name False
02.shell> lsattr -EH -l hdisk28 <<disk_of_a_current_redolog_group
attribute value description user_settable
PCM PCM/friend/sddpcm PCM True
PR_key_value none Reserve Key True
algorithm load_balance Algorithm True
clr_q no Device CLEARS its Queue on error True
dist_err_pcnt 0 Distributed Error Percentage True
dist_tw_width 50 Distributed Error Sample Time True
flashcpy_tgtvol no Flashcopy Target Lun False
hcheck_interval 60 Health Check Interval True
hcheck_mode nonactive Health Check Mode True
location Location Label True
lun_id 0x4000402100000000 Logical Unit Number ID False
lun_reset_spt yes Support SCSI LUN reset True
max_coalesce 0x40000 Maximum COALESCE size True
max_transfer 0x40000 Maximum TRANSFER Size True
node_name 0x5005037608ffc3c1 FC Node Name False
pvid 00c05b8bfa6f26b40000000000000000 Physical volume identifier False
q_err yes Use QERR bit True
q_type simple Queuing TYPE True
qfull_dly 2 delay in seconds for SCSI TASK SET FULL True
queue_depth 20 Queue DEPTH True
recoverDEDpath no Recover DED Failed Path True
reserve_policy no_reserve Reserve Policy True
retry_timeout 120 Retry Timeout True
rw_timeout 60 READ/WRITE time out value True
scbsy_dly 20 delay in seconds for SCSI BUSY True
scsi_id 0x330900 SCSI ID False
start_timeout 180 START unit time out value True
timeout_policy fail_path Timeout Policy True
unique_id 200X12YZ151002107270190003XXXfcp Device Unique Identification False
ww_name 0x50050763081343c1 FC World Wide Name False
03.shell> lsattr -EH -l aio0
As of AIX 7.1 the aio0 device no longer exists and the aio parameters are set in the kernel ioo options:
NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
--------------------------------------------------------------------------------
aio_active 1 1 boolean S
--------------------------------------------------------------------------------
aio_maxreqs 128K 128K 128K 4K 1M numeric D
--------------------------------------------------------------------------------
aio_maxservers 30 30 30 1 20000 numeric D
aio_minservers
--------------------------------------------------------------------------------
aio_minservers 3 3 3 0 20000 numeric D
aio_maxservers
--------------------------------------------------------------------------------
aio_server_inactivity 300 300 300 1 86400 seconds D
--------------------------------------------------------------------------------
posix_aio_active 0 0 boolean S
--------------------------------------------------------------------------------
posix_aio_maxreqs 128K 128K 128K 4K 1M numeric D
--------------------------------------------------------------------------------
posix_aio_maxservers 30 30 30 1 20000 numeric D
aio_minservers
--------------------------------------------------------------------------------
posix_aio_minservers 3 3 3 0 20000 numeric D
aio_maxservers
--------------------------------------------------------------------------------
posix_aio_server_inactivity
300 300 300 1 86400 seconds D
--------------------------------------------------------------------------------
A list of the links for the redo log files:
Raw device links:>
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/origlogA/log_g01m1.dbf -> /dev/rPRD_g01m1
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/origlogA/log_g03m1.dbf -> /dev/rPRD_g03m1
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/origlogA/log_g05m1.dbf -> /dev/rPRD_g05m1
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/origlogA/log_g07m1.dbf -> /dev/rPRD_g07m1
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/origlogB/log_g02m1.dbf -> /dev/rPRD_g02m1
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/origlogB/log_g04m1.dbf -> /dev/rPRD_g04m1
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/origlogB/log_g06m1.dbf -> /dev/rPRD_g06m1
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/origlogB/log_g08m1.dbf -> /dev/rPRD_g08m1
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/mirrlogA/log_g01m2.dbf -> /dev/rPRD_g01m2
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/mirrlogA/log_g03m2.dbf -> /dev/rPRD_g03m2
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/mirrlogA/log_g05m2.dbf -> /dev/rPRD_g05m2
lrwxrwxrwx 1 root system 15 06 Aug 10:07 /oracle/PRD/mirrlogA/log_g07m2.dbf -> /dev/rPRD_g07m2
lrwxrwxrwx 1 root system 15 06 Aug 10:08 /oracle/PRD/mirrlogB/log_g02m2.dbf -> /dev/rPRD_g02m2
lrwxrwxrwx 1 root system 15 06 Aug 10:08 /oracle/PRD/mirrlogB/log_g04m2.dbf -> /dev/rPRD_g04m2
lrwxrwxrwx 1 root system 15 06 Aug 10:08 /oracle/PRD/mirrlogB/log_g06m2.dbf -> /dev/rPRD_g06m2
lrwxrwxrwx 1 root system 15 06 Aug 10:08 /oracle/PRD/mirrlogB/log_g08m2.dbf -> /dev/rPRD_g08m2
Raw devices:>
brw-rw---- 1 oraprd dba 42, 5 06 Aug 10:01 /dev/PRD_g01m1
brw-rw---- 1 oraprd dba 43, 1 06 Aug 10:05 /dev/PRD_g01m2
brw-rw---- 1 oraprd dba 42, 6 06 Aug 10:01 /dev/PRD_g02m1
brw-rw---- 1 oraprd dba 43, 2 06 Aug 10:05 /dev/PRD_g02m2
brw-rw---- 1 oraprd dba 42, 7 06 Aug 10:02 /dev/PRD_g03m1
brw-rw---- 1 oraprd dba 43, 3 06 Aug 10:06 /dev/PRD_g03m2
brw-rw---- 1 oraprd dba 42, 9 06 Aug 10:02 /dev/PRD_g04m1
brw-rw---- 1 oraprd dba 43, 4 06 Aug 10:06 /dev/PRD_g04m2
brw-rw---- 1 oraprd dba 42, 10 06 Aug 10:02 /dev/PRD_g05m1
brw-rw---- 1 oraprd dba 43, 5 06 Aug 10:06 /dev/PRD_g05m2
brw-rw---- 1 oraprd dba 42, 11 06 Aug 10:02 /dev/PRD_g06m1
brw-rw---- 1 oraprd dba 43, 6 06 Aug 10:06 /dev/PRD_g06m2
brw-rw---- 1 oraprd dba 42, 12 06 Aug 10:02 /dev/PRD_g07m1
brw-rw---- 1 oraprd dba 43, 7 06 Aug 10:06 /dev/PRD_g07m2
brw-rw---- 1 oraprd dba 42, 13 06 Aug 10:03 /dev/PRD_g08m1
brw-rw---- 1 oraprd dba 43, 8 06 Aug 10:06 /dev/PRD_g08m2
crw-rw---- 1 oraprd dba 42, 5 09 Sep 11:41 /dev/rPRD_g01m1
crw-rw---- 1 oraprd dba 43, 1 06 Aug 10:05 /dev/rPRD_g01m2
crw-rw---- 1 oraprd dba 42, 6 06 Aug 13:30 /dev/rPRD_g02m1
crw-rw---- 1 oraprd dba 43, 2 06 Aug 10:05 /dev/rPRD_g02m2
crw-rw---- 1 oraprd dba 42, 7 06 Aug 13:32 /dev/rPRD_g03m1
crw-rw---- 1 oraprd dba 43, 3 06 Aug 10:06 /dev/rPRD_g03m2
crw-rw---- 1 oraprd dba 42, 9 06 Aug 13:35 /dev/rPRD_g04m1
crw-rw---- 1 oraprd dba 43, 4 06 Aug 10:06 /dev/rPRD_g04m2
crw-rw---- 1 oraprd dba 42, 10 06 Aug 10:02 /dev/rPRD_g05m1
crw-rw---- 1 oraprd dba 43, 5 06 Aug 10:06 /dev/rPRD_g05m2
crw-rw---- 1 oraprd dba 42, 11 06 Aug 10:02 /dev/rPRD_g06m1
crw-rw---- 1 oraprd dba 43, 6 06 Aug 10:06 /dev/rPRD_g06m2
crw-rw---- 1 oraprd dba 42, 12 06 Aug 10:02 /dev/rPRD_g07m1
crw-rw---- 1 oraprd dba 43, 7 06 Aug 10:06 /dev/rPRD_g07m2
crw-rw---- 1 oraprd dba 42, 13 06 Aug 10:03 /dev/rPRD_g08m1
crw-rw---- 1 oraprd dba 43, 8 06 Aug 10:06 /dev/rPRD_g08m2
Output of the truss command:
truss -o /tmp/truss.out -p 21168278
04.shell> cat /tmp/truss.out
kread(0, 0x0000000000000000, 0) = 124
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
kwrite(9, "01 |\0\006\0\0\0\0\003 ^".., 380) = 380
kread(10, "\011\0\006\0\0\0\0\0\t\0".., 8208) (sleeping...)
kread(10, "\0\v\0\0\f\0\0\001\001\0".., 8208) = 22
kwrite(9, "\0\v\0\0\f\0\0\001\002", 11) = 11
kread(10, "01 p\0\006\0\0\0\0\004\0".., 8208) = 368
kwrite(1, " A L T E R D A T A B A".., 123) = 123
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
kwrite(1, " *\n", 2) = 2
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
lseek(4, 5120, 0) = 5120
kread(4, "\0\r\0 é\0\0\0 V\0 ê\0\0".., 512) = 512
kwrite(1, " E R R O R a t l i n".., 17) = 17
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
kwrite(1, " O R A - 1 9 5 1 0 : f".., 57) = 57
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
kwrite(1, " " / o r a c l e / P R D".., 54) = 54
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
kwrite(1, " O R A - 2 7 0 4 1 : u".., 31) = 31
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
kwrite(1, " I B M A I X R I S C".., 53) = 53
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
kwrite(1, " A d d i t i o n a l i".., 26) = 26
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
kwrite(1, "\n", 1) = 1
kwrite(1, "\n", 1) = 1
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
kwrite(1, " S Q L > ", 5) = 5
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
kfcntl(1, F_GETFL, 0x0000000000000008) = 2
kfcntl(2, F_GETFL, 0x0000000000000008) = 2
kfcntl(1, F_GETFL, 0x0000000013A60043) = 2
kread(0, " A L T E R D A T A B A".., 4096) (sleeping...)
Let me know if this sheds any light on the problem.
Kind regards, Craig Stewart
Hi Craig,
thanks for the output. Just to be absolutely sure - what is the hdisk device for major number 42 and minor number 5?
shell> cd /dev/
shell> ls -la | grep "42, 5"
The most important truss part is missing (open file handling / handle with corresponding options to file "/oracle/PRD/origlogA/log_g01m1.dbf"). When did you start the truss command and was PID 21168278 the corresponding oracle shadow process?
You can also enable an AIX kernel trace (trace -a -j <ID> -o /tmp/ktrace.out and trcstop) on that particular event (trcrpt -j) to get an idea of the PID and its handling (trcrpt).
We need the corresponding file handler option due to the error "IBM AIX RISC System/6000 Error: 22: Invalid argument".
Regards
Stefan
Hello Stefan,
Here's the result from the directory listing for hdisk device major number 42 and minor number 5:
01.shell> cd /dev/
02.shell> ls -la | grep "42, 5"
brw-rw---- 1 oraprd dba 42, 5 06 Aug 10:01 PRD_g01m1
crw-rw---- 1 oraprd dba 42, 5 09 Sep 11:41 rPRD_g01m1
Yes, you're absolutely right, I wasn't truss-ing the correct PID! I have created the correct truss output which I have attached to this thread as truss2.txt. Let me know whether this is more what you are looking for.
I will try and see what the kernel trace generates and post that later.
Kind regards, Craig
Hi Craig,
well i am little bit confused about the disk ouput, because of i would have expected a corresponding hdisk with the same major / minor number (and that the device rPRD_g01m1 is the same device just with another device file name - keyword mknod).
However let's go on with the truss output (the kernel traces are very low-level and not needed in that case here - it was just another opportunity to get the relevant information).
kopen("/oracle/PRD/origlogA/log_g01m1.dbf", O_RDONLY|O_LARGEFILE) = 12
kread(12, " A I X L V C B\0\0 r a".., 512) = 512
close(12) = 0
statx("/oracle/PRD/origlogA/log_g01m1.dbf", 0x0FFFFFFFFFFB2608, 176, 010) = 0
kopen("/oracle/PRD/origlogA/log_g01m1.dbf", O_RDONLY) = 12
kioctl(12, 65281, 0x0FFFFFFFFFFB22A8, 0x0000000000000000) = 0
kioctl(12, 30214, 0x0FFFFFFFFFFB22A0, 0x0000000000000000) = 0
close(12)
So in general the access to the raw device works and the disk already got some content in the first 512 bytes like "AIX LVCB ....". The following statx calls are just because of the sym links.
However the following part looks not that good at all.
kopen64x("/oracle/PRD/origlogA/log_g01m1.dbf", 0400400000200, 02023134270, , 1152921504606846944) Err#22 EINVAL
...
kopen64x("/oracle/PRD/origlogA/log_g01m1.dbf", 0400400000202, O_RDWR, , 1152921504606846944) Err#22 EINVAL
...
kopen64x("/oracle/PRD/origlogA/log_g01m1.dbf", 0400400000202, 02023134270, , 1152921504606846944) Err#22 EINVAL
...
This is a known issue (> 11.2.0.2) when FILESYSTEMIO_OPTIONS is set to "SETALL" (SAP default). For further information please check MOS bug note #13478884 or MOS note #1477941.1. This also affects the usual restore and it may fail (by the way have you ever tested it since the Oracle database upgrade?)
Solution: Unset filesystemio_options or set it to ASYNCH and restart the instance.
Regards
Stefan
User | Count |
---|---|
90 | |
10 | |
10 | |
10 | |
7 | |
7 | |
6 | |
5 | |
4 | |
3 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.