Dialog instances not coming up, shmget, No space left on device

Former Member
0 Kudos

Hello All,

I have a peculiar problem.

I recently performed a hardware migration through installation: CI + 2 dialog instances.

After the migration, everything was fine; both the CI and the app servers came up without issues.

I then performed a kernel upgrade. Before bringing up all the instances, I started only the CI and imported the profiles of all active servers, which covered only the CI because the dialog instances were down.

After that, I tried to start the dialog servers and found that they were unable to come up.

Please find the errors from dev_w0 below.

dev_w0

SHM_PRES_BUF (addr: 0x2bb9dc5000, size: 20000000)

I *** ERROR => shmget(11509,1073741824,992) (28: No space left on device) [shmux.c 1556]

M *** ERROR => ThShMCreate: ShmCreate SHM_ROLL_AREA_KEY failed [thxxhead.c 2577]

M *** ERROR => ThIPCInit: ThShMCreate [thxxhead.c 2074]

M ***LOG R19=> ThInit, ThIPCInit ( TSKH-IPC-000001) [thxxhead.c 1523]

M in_ThErrHandle: 1

M *** ERROR => ThInit: ThIPCInit (step 1, th_errno 17, action 3, level 1) [thxxhead.c 10468]

M
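For readers decoding the failing call in the trace above: the three arguments of shmget are the key, the size in bytes, and the flags, and errno 28 is ENOSPC. The sketch below only does that arithmetic. The IPC_CREAT value is the Linux constant written out by hand (an assumption, since Python's standard library does not export it), and the helper name decode_shmget is made up for illustration.

```python
import errno
import os

IPC_CREAT = 0o1000  # Linux value from <sys/ipc.h>; hard-coded here as an assumption


def decode_shmget(key, size, flags):
    """Split the arguments of a logged shmget(key, size, flags) call."""
    return {
        "key": key,
        "size_gib": size / 2**30,
        "creates_segment": bool(flags & IPC_CREAT),
        "mode": oct(flags & 0o777),
    }


# Values copied from the dev_w0 line: shmget(11509,1073741824,992)
call = decode_shmget(11509, 1073741824, 992)
print(call)
print(errno.errorcode[28], "-", os.strerror(28))
```

So the work process asked for a new 1 GiB segment. On Linux, shmget(2) documents ENOSPC as meaning that either all shared memory IDs are taken (SHMMNI) or the new segment would push the system past its system-wide shared memory limit (SHMALL); it does not refer to disk or swap space.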

I found many things related to this error and checked the entries in the sysctl.conf file, which are fine.

Also, the swap space looks fine to me, as shown below:

/sbin/swapon -s

Filename            Type        Size      Used    Priority
/dev/cciss/c0d0p2   partition   20972848  898500  -1
/var/swapfile0      file        10485752  0       -2
/var/swapfile1      file        10485752  0       -3
/var/swapfile2      file        10485752  0       -4

Please help me with this.

Edited by: Juan Reyes on Sep 8, 2011 12:17 PM

Accepted Solutions (1)

JPReyes
Active Contributor
0 Kudos

Seems like your shared memory segment is running out of space...

I would stop the system and then clear the shared memory using cleanipc. Then list the status of the shared memory with the ipcs command; if you can see entries for <sid>adm, you can remove them with ipcrm and try starting the system again.
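The check described above can be sketched as a small parser over the text printed by ipcs -m. This is only an illustration: the sample output is made up in the layout that ipcs uses on Linux, "abcadm" is a placeholder for <sid>adm, and the function name segments_owned_by is hypothetical. The script only prints candidate ipcrm commands for review; it removes nothing.

```python
def segments_owned_by(ipcs_output, owner):
    """Return (shmid, bytes) pairs for segments owned by `owner` in `ipcs -m` text."""
    found = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows start with a hex key; header and separator lines do not.
        if len(fields) >= 5 and fields[0].startswith("0x") and fields[2] == owner:
            found.append((int(fields[1]), int(fields[4])))
    return found


# Made-up sample in the layout printed by `ipcs -m`; "abcadm" is a placeholder user.
SAMPLE = """\
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00002cf5 98304      abcadm     740        1073741824 0
0x00000000 131073     root       600        393216     2          dest
"""

leftovers = segments_owned_by(SAMPLE, "abcadm")
for shmid, size in leftovers:
    # Printed for review only; nothing is removed by this script.
    print(f"ipcrm -m {shmid}  # {size} bytes")
```

If the list is empty after cleanipc, leftover segments of that user are unlikely to be the cause, and the system-wide limits are the next thing to look at.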

Regards

Juan

Former Member
0 Kudos

Thanks for the reply Juan,

As a general step for a restart we do cleanipc; however, I did as you instructed and it's a no-go. I could not find anything through ipcs after a cleanipc.

Any other ideas?

Regards,

Samant Kumar

JPReyes
Active Contributor
0 Kudos

I performed a Kernel upgrade, and then before bringing all the instances I just brought up CI,

Go back to the previous kernel. If that works, then the kernel that you downloaded is not correct.

Regards

Juan

Former Member
0 Kudos

Hello Juan,

Yes, I have done that too, but the issue still persists. I have exhausted all troubleshooting.

Regards,

Samant Kumar

Former Member
0 Kudos

Hello Samant,

Is your system connected to any portal or backend system? Just as a check, see whether any process is in use by the backend or a connected system. If so, kill it and then shut down the whole system, including the CI and the application instances.

Also run cleanipc and ipcrm to clean up the memory.

Wait for some time and then start the system again.

Hope this will help to resolve this issue.

Thanks and regards,

ram

Former Member
0 Kudos

Thanks for your reply Ram,

Since the hardware was migrated, we have not finished the complete switch.

Anyway, it is not connected to any other system at present.

Any other suggestions?

Regards,

Samant Kumar

Former Member
0 Kudos

Hi,

Please explore the following if you haven't done so already.

1. Ensure that all file systems on your dialog instance server are mounted properly and have enough free space left.

2. Perform a complete reboot of your dialog instance.

3. Try restarting your dialog instance.

4. Execute sappfpar check on the DI to see whether any memory-related errors occur.

Good Luck..

Best Regards,

Vasanth G

Edited by: Vasanth Govindaraj on Sep 9, 2011 10:44 AM

JPReyes
Active Contributor
0 Kudos

I'm pretty much convinced you are not using the right kernel or you are not updating the kernel properly.

What's your procedure? What kernel are you using?

Regards

Juan

Former Member
0 Kudos

Hi Vasanth.

1. Ensure that all file systems on your dialog instance server are mounted properly and have enough free space left.

Checked.

2. Perform a complete reboot of your dialog instance.

Cannot do it now; will get back to you on this.

3. Try restarting your dialog instance.

4. Execute sappfpar check on the DI to see whether any memory-related errors occur.

sappfpar check was run; no errors were reported.

Thanks,

Samant kumar

Former Member
0 Kudos

Hi Juan,

I extracted kernel 201, changed the permissions, and recreated the softlinks that were present before I deleted the old kernel.

However, we are migrating 2 systems on the same host. I do not know what happened, but the DIs have now come up properly with kernel 201 itself. I do not know what changed. Thanks for helping though!

@All,

Thanks all for your help; the issue was resolved without any change. I think this thread provides almost all the info required to resolve a similar issue!

Regards

Samant Kumar

Answers (3)

Former Member
0 Kudos

Hello All,

The problem is due to the number of instances on the host. Only when we bring down one instance can the other be brought up... What should I check now?

Regards,

Samant Kumar

JPReyes
Active Contributor
0 Kudos

You need to increase the size of your shared memory segment

Increase the size of kernel.shmmax and kernel.shmall

Regards

Juan

Former Member
0 Kudos

Hi Juan,

The current settings are as follows:

kernel.shmmax=23136829430

kernel.msgmni=1024

kernel.sem=1250 256000 100 2048

kernel.shmall=8388608
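As a rough aid for reading these values: kernel.shmmax is a per-segment limit in bytes, while kernel.shmall is the system-wide limit counted in pages. The sketch below converts both to GiB and checks whether one more segment would still fit. It assumes a 4096-byte page size (typical for x86_64, but worth confirming with getconf PAGE_SIZE); the function names and the 31.5 GiB "already allocated" figure are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; confirm with `getconf PAGE_SIZE`
GIB = 2**30


def shmall_bytes(shmall_pages, page_size=PAGE_SIZE):
    """kernel.shmall is counted in pages, so convert it to bytes."""
    return shmall_pages * page_size


def fits(already_allocated, request, shmall_pages, page_size=PAGE_SIZE):
    """True if a new segment of `request` bytes stays within the system-wide limit."""
    return already_allocated + request <= shmall_bytes(shmall_pages, page_size)


shmmax = 23136829430  # bytes, largest single segment
shmall = 8388608      # pages, all segments on the host together

print(f"shmmax = {shmmax / GIB:.2f} GiB per segment")
print(f"shmall = {shmall_bytes(shmall) / GIB:.2f} GiB system-wide")

# Hypothetical: other instances on the host already hold 31.5 GiB of shared memory.
print(fits(int(31.5 * GIB), 1 * GIB, shmall))
```

Under the 4096-byte assumption, shmall works out to a 32 GiB cap for all instances on the host combined. That would be consistent with one instance starting only after another is stopped, but the actual usage would need to be confirmed with ipcs -m while the other instances are running.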

If I have to increase kernel.shmall, how much should I increase it to?

Thanks for your quick reply.

Regards,

Samant Kumar

Former Member
0 Kudos

Hello Juan,

Also, it is strange that the same parameters exist on the CI hosts as well, but the CIs are coming up without any issue. Why do you think that is happening?

Regards,

Samant kumar

Former Member
0 Kudos

Hi Juan,

Please find the hardware details below:

Hardware details

LSB Version: :core-3.0-amd64:core-3.0-ia32:core-3.0-noarch:graphics-3.0-amd64:graphics-3.0-ia32:graphics-3.0-noarch

Distributor ID: RedHatEnterpriseAS

Description: Red Hat Enterprise Linux AS release 4 (Nahant Update 7)

Release: 4

Codename: NahantUpdate7

memory : MemTotal: 65846512 kB

MemFree: 888828 kB

8 processor 64-Bit AMD Opteron Dual Core

Regards,

Samant Kumar

JPReyes
Active Contributor
0 Kudos

Read,

There's valid info there, as well as the related notes.

Regards

Juan

Former Member
0 Kudos

Hi,

Did you run the rfpar program check? Also make sure the size of shared pool 10 is not more than 2 GB.

Regards,

Vamshi.

Former Member
0 Kudos

Hi,

As pointed out by Juan, this is a shared memory problem. Can you please check the OS limits? I think the OS is Linux.

We once had an issue where shmmax was set to a very low value.

Former Member
0 Kudos

Hello samant,

What was your previous hardware, and what is the current one? Please also post details about your system and the kernel parameters set at the OS level.

Thanks and regards,

Ram

Former Member
0 Kudos

Old Hardware

LSB Version: :core-3.0-amd64:core-3.0-ia32:core-3.0-noarch:graphics-3.0-amd64:graphics-3.0-ia32:graphics-3.0-noarch

Distributor ID: RedHatEnterpriseAS

Description: Red Hat Enterprise Linux AS release 4 (Nahant Update 7)

Release: 4

Codename: NahantUpdate7

memory : MemTotal: 65846512 kB

MemFree: 888828 kB

8 processor 64-Bit AMD Opteron Dual Core

The new hardware is the same as the old Hardware

  1. SAP settings

kernel.shmmax=23136829430

kernel.msgmni=1024

kernel.sem=1250 256000 100 2048

kernel.shmall=8388608