SAP gateway not accepting connections when startsap has finished

benoit-schmid
Contributor

Hello,

We have upgraded our SAP ECC servers from 6.04 to 6.07.

We currently use the following kernel.

kernel release          745
kernel make variant     745_REL
DBMS client library     OCI_112

DBSL shared library version   745.04

compiled on             Linux GNU SLES-11 x86_64 cc4.3.4  for linuxx86_64
compiled for            64 BIT
compilation mode        Non-Unicode
compile time            Feb 26 2016 14:56:26
update level            0
patch number            100
source id               0.100

We regularly hit a problem that we never faced with our 7.20EXT kernel.

We run startsap and wait for the command to finish.

At the end of the command, we get: 'Instance on host xxx started'

Then we start a program that connects to this instance's gateway through an RFC connection.

But the program regularly fails with the following error:

ERROR partner 'xxx:sapgw00' not reached

TIME Sat Sep 24 20:54:16 2016

SYSTEM CALL connect

ERRNO 111

ERRNO TEXT Connection refused

This means that the gateway daemon, which listens on TCP port 3300, is not up.

Is this normal behaviour?

How can I ensure that the gateway has been started?

Is there a known bug for this kind of problem on kernel 7.45?

Thanks in advance for your help.

Accepted Solutions (0)

Answers (3)


Reagan
Advisor

>>

ERROR partner 'xxx:sapgw00' not reached

TIME Sat Sep 24 20:54:16 2016

SYSTEM CALL connect

ERRNO 111

ERRNO TEXT Connection refused

<<

>>

Sat Sep 24 20:54:18 2016

GwDpInit: attached to gw_adm at 0x7fd69860fd20

Bind service sapgw00 (socket) to port 3300

<<

You ran the program that tried to connect to the Gateway at Sat Sep 24 20:54:16 2016, but the Gateway service startup only completed at Sat Sep 24 20:54:18 2016. This is why you got the connection refused error.

>>How can I ensure that the gateway has been started?

Once the instance has been started, connect to the SAP system and perform some standard BASIS checks such as SMGW, STMS, connection to DC and vice versa. The GetProcessList function of the sapcontrol tool lists the processes at OS level with their status, but the time it shows is when the process was created. This doesn't mean the service is ready to accept connections. After the process has been started and the shared memory has been initialized, it binds to a port; in your case this completed at 20:54:18, two seconds after you got the error. From that time on, connections to the Gateway should work as normal.

The system is working as designed, but the problem (if you really think there is a problem) is the delay in the startup of individual components. Why is there a delay in the startup? To find out, you should check the utilisation of system resources during the instance startup and analyze the usage. I have seen such issues on systems with very minimal hardware resources, or when multiple SAP instances are started at the same time on one machine.
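
A minimal sketch of a scripted readiness check along these lines: instead of relying on the process start time, probe the gateway port until a TCP connect succeeds. The function names port_open and wait_for_port are made up for illustration, and the probe assumes that bash is installed and supports /dev/tcp redirections.

```shell
# Return 0 if a TCP connect to HOST:PORT succeeds, non-zero otherwise.
# Assumption: bash is available and was built with /dev/tcp support.
port_open() {
    bash -c ': <"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Probe once per second, for at most MAX_TRIES attempts.
wait_for_port() {
    host=$1
    port=$2
    max_tries=$3
    i=0
    while [ "$i" -lt "$max_tries" ]; do
        if port_open "$host" "$port"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Usage after startsap has returned (host and port as in the error above):
#   wait_for_port xxx 3300 60 || { echo "gateway port not open" >&2; exit 1; }
```

A successful connect only shows that the port is bound; it does not prove that the gateway will accept an RFC logon. On a firewalled host a connect may also block instead of failing fast, so wrapping the probe in a timeout could be a sensible addition.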

benoit-schmid
Contributor

Hello Benjamin,


Reagan Benjamin wrote:

>>How can I ensure that the gateway has been started?

Once the instance has been started, connect to the SAP system and perform some standard BASIS checks such as SMGW, STMS, connection to DC and vice versa. The GetProcessList function of the sapcontrol tool lists the processes at OS level with their status, but the time it shows is when the process was created.

Unfortunately, I cannot do what you suggest.

I need to automate/script this check, as it is done during the night.

Regards,

isaias_freitas
Advisor

The Gateway is a core component of an ABAP instance.

If the Gateway is unable to start, the Dispatcher issues an "emergency shutdown" and the whole instance is stopped.

benoit-schmid
Contributor

Good morning,


Isaias Freitas wrote:

The Gateway is a core component of an ABAP instance.

I agree with you.

What I do not understand is why SAP has decided to implement a command that returns "instance started" although this core component has not even bound its port.

Compare with the Oracle listener: when the start command says OK, it means the port has been bound.

Regards,

isaias_freitas
Advisor

Hello,

The contents of the "dev_rd" that were provided on September 28 show that the gateway started successfully and that it opened the port:


Bind service sapgw00 (socket) to port 3300

Regards,

Isaías

avadhesh_sap
Explorer

Hi,

I would recommend that you check whether the proper port details are maintained in the services file in /etc at OS level. Also check whether the relevant ports are being listened on by the application server.

Thanks,

Avi

benoit-schmid
Contributor

Good morning,


Avadhesh Sharma wrote:

I would recommend that you check whether the proper port details are maintained in the services file in /etc at OS level. Also check whether the relevant ports are being listened on by the application server.

I would recommend that you read the thread in detail; you will notice that this is out of scope.

If sapgw00 were missing from /etc/services, the gateway could not start at all.

This is not the case.

Regards,

Former Member

Hi Benoît,

Can you telnet to port 3300? Do you see the gw process running at OS level?
If both answers are yes, then check the gw/acl_mode parameter value (in the DEFAULT and instance profiles) and try setting it to 0.
What is the content of the dev_rd file in the work directory?

BR, Sergo.

Former Member

Hi Benoit,

The SAP Note below describes a side effect on the gateway when running SAP on kernel 745 PL100:

https://launchpad.support.sap.com/#/notes/2328970

Also, the Gateway starts automatically when you issue the startsap command. As suggested by Sergo, please check whether the gateway process is running, and also take a look at the gateway trace file.

Regards

Prithviraj

benoit-schmid
Contributor

Hello,


Sergo Beradze wrote:

Can you telnet to port 3300? Do you see the gw process running at OS level?
If both answers are yes, then check the gw/acl_mode parameter value (in the DEFAULT and instance profiles) and try setting it to 0.
What is the content of the dev_rd file in the work directory?

The message is clear: "Connection refused".

Therefore, telnet would certainly give the same result.

Unfortunately, I do not know whether the gwrd process was running, because I did not have the possibility to run a ps before starting the daemon. A few minutes later, the gateway replies. Therefore, it should be up. But this is not certain 🙂

gw/acl_mode is set to 0 on my instances.

Why are you insisting on this setting?

I will look at dev_rd and come back.

My questions are still not answered.

1. Is this normal behaviour?

From my understanding, when startsap returns that the instance is up, the gateway should be up. But I wanted to double-check this.

2. How can I ensure that the gateway has been started correctly?

3. Is there a known bug for this kind of problem on kernel 7.45?

Regards,

Former Member

Hi,

'Instance on host xxx started' means nothing. The instance can be started and then stopped a few seconds later.
You need to look at the logs in the work folder; you can keep two windows open, that is not a problem.

The SMGW and ST11 transactions are starting points for you in case you don't have access to the OS.

The Note provided by Prithviraj may cover your issue; check the logs and you will know.


BR, Sergo.


Message was edited by: Sergo Beradze

Former Member

Hi Benoit,

You can check the gateway status at OS level using gwmon program

gwmon pf= nr=xx

Regards

Prithviraj

benoit-schmid
Contributor

Hello,


Prithviraj Rajpurohit wrote:

You can check the gateway status at OS level using gwmon program

gwmon pf= nr=xx

Thanks for the info.

The following seems to work, from what I have tested:

1. echo GET_CONNTBL | gwmon -cmdfile -

2. Check $? (equals 0 in case of success).
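
Since the gateway may need a few more seconds after startsap returns, a single probe can still fail. Below is a sketch of a retry loop around the gwmon command above. The helper names wait_until and gateway_probe are hypothetical, and the assumption that gwmon exits with 0 only when the gateway answers rests on the test described in this post, not on SAP documentation.

```shell
# Run CMD (with its arguments) once per second until it exits with 0,
# giving up after MAX_TRIES attempts.
wait_until() {
    max_tries=$1
    shift
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries + 1))
        sleep 1
    done
    return 1
}

# The probe from this post, wrapped in a function so that the whole
# pipeline can be passed to wait_until as a single command.
gateway_probe() {
    echo GET_CONNTBL | gwmon -cmdfile - >/dev/null 2>&1
}

# Usage after startsap has returned:
#   wait_until 60 gateway_probe || { echo "gateway not ready" >&2; exit 1; }
```

Keeping the retry logic separate from the probe means the same loop can be reused with any other check that reports readiness through its exit status.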

Regards,

benoit-schmid
Contributor

Hello,

Sergo Beradze wrote:

What is the content of the dev_rd file in the work directory?

This is the dev_rd:

systemid   390 (AMD/Intel x86_64 with Linux)

relno      7450

patchlevel 0

patchno    100

intno      20151301

make       multithreaded, ASCII, 64 bit, optimized

pid        1998

Sat Sep 24 20:53:57 2016

gateway (version=745.2015.12.21 (with SSL support))

Bind service  (socket) to port

GwPrintMyHostAddr: my host addresses are :

*

* SWITCH TRC-LEVEL from 1 to 1

*

***LOG S00=> GwInitReader, gateway started ( 1998) [gwxxrd.c     1820]

systemid   390 (AMD/Intel x86_64 with Linux)

relno      7450

patchlevel 0

patchno    100

intno      20151301

make       multithreaded, ASCII, 64 bit, optimized

pid        1998

Sat Sep 24 20:53:57 2016

gateway (version=745.2015.12.21 (with SSL support))

gw/reg_no_conn_info = 1

* SWITCH TRC-RESOLUTION from 1 to 1

gw/sim_mode : set to 0

gw/logging : ACTION=Ss LOGFILE=gw_log-%y-%m-%d SWITCHTF=day MAXSIZEKB=100

NI buffering enabled

CCMS: initialize CCMS Monitoring for ABAP instance with J2EE addin.

CCMS: SemInMgt: Semaphore Management initialized by AlAttachShm_Doublestack.

CCMS: SemInit: Semaphore 38 initialized by AlAttachShm_Doublestack.

GwIRegInitRegInfo: reg_info file /usr/sap/XXX/DVEBMGS00/data/reginfo not found

Sat Sep 24 20:53:58 2016

GwPrintMyHostAddr: my host addresses are :

  1 : [x.x.x.x] xxx.xxx.ch (HOSTNAME)

  2 : [127.0.0.1] localhost (LOCALHOST)

Full qualified hostname = xxx.xxx.ch

CGROUPS: changing prio of pid 1998 to medium

CGROUPS: disabled

MtxInit: 30004 0 0

DpIPCInit2: read dp-profile-values from sys_adm_ext

DpShMCreate: alloate/attach shared memory (mode=ATTACH)

DpShMCreate: sizeof(wp_adm)        30520    (872)

DpShMCreate: sizeof(tm_adm)        14101248    (RDISPTERM=46848,MODEINFO=2696, IMODE_INFO=80)

DpShMCreate: sizeof(ca_adm)        384000    (64)

DpCommTableSize: max/headSize/ftSize/tableSize=1000/8/2536040/2741080

DpShMCreate: sizeof(comm_adm)        2741080    (2528)

DpSlockTableSize: max/headSize/ftSize/fiSize/tableSize=0/0/0/0/0

DpShMCreate: sizeof(slock_adm)        0    (232)

DpFileTableSize: max/headSize/ftSize/tableSize=0/0/0/0

DpShMCreate: sizeof(file_adm)        0    (80)

DpSockTableSize: max/headSize/ftSize/tableSize=1000/8/1024040/1024048

DpShMCreate: sizeof(sock_adm)        1024048    (1016)

DpShMCreate: sizeof(vmc_adm)        0    (2400)

DpShMCreate: sizeof(wall_adm)        (ft=33640/fi=84632/hd=56/rec=104)

DpShMCreate: sizeof(amc_rec_adm)        (ft=216040/fi=136232/hd=48/rec=352)

DpShMCreate: sizeof(websocket_adm)        (ft=63640/hd=64/rec=416)

DpShMCreate: sizeof(gw_adm)    56

DpShMCreate: sizeof(j2ee_adm)    2008

DpShMCreate: SHM_DP_ADM_KEY        (addr: 0x7fd69748e000, size: 18921656

DpShMCreate: allocated sys_adm at 0x7fd69748e200

DpShMCreate: allocated wp_adm_list at 0x7fd69749f920

DpShMCreate: allocated wp_adm at 0x7fd69749fc40

DpShMCreate: allocated tm_adm_list at 0x7fd6974a7578

DpShMCreate: allocated tm_adm at 0x7fd6974a77f8

DpShMCreate: allocated ca_adm at 0x7fd69821a4f8

DpShMCreate: allocated comm_adm at 0x7fd6982782f8

DpShMCreate: system runs without slock table

DpShMCreate: allocated sock_adm at  0x7fd698515850

DpShMCreate: allocated vmc_adm_list at 0x7fd69860fa80

DpShMCreate: system runs without VMC

DpShMCreate: allocated gw_adm at 0x7fd69860fd20

DpShMCreate: allocated j2ee_adm at 0x7fd69860ff58

Sat Sep 24 20:53:58 2016

DpShMCreate: allocated ca_info at 0x7fd698610930

DpShMCreate: allocated wall_adm (ft) at 0x7fd698610b68

DpShMCreate: allocated wall_adm (fi) at 0x7fd6986190d0

DpShMCreate: allocated wall_adm (head) at 0x7fd69862dd68

DpShMCreate: allocated amc_rec_adm (ft) at 0x7fd69862dfa0

DpShMCreate: allocated amc_rec_adm (fi) at 0x7fd698662d88

DpShMCreate: allocated amc_rec_adm (head) at 0x7fd6986843b0

DpShMCreate: allocated websocket_adm (ft) at 0x7fd6986845e0

DpShMCreate: allocated websocket_adm (head) at 0x7fd698694078

DpShMCreate: initialized 21 eyes

DpCommAttachTable: attached comm table (header=0x7fd6982782f8/ft=0x7fd698278300/fi=0x7fd6984e3568)

DpSockAttachTable: attached sock table (header=0x7fd698515850/ft=0x7fd698515858)

Sat Sep 24 20:54:18 2016

GwDpInit: attached to gw_adm at 0x7fd69860fd20

Bind service sapgw00 (socket) to port 3300

Regards,

raquel_gomez
Employee

Hi,

You can also use command:
  sapcontrol -nr <inst_numbr> -function GetProcessList

to check the processes that are running on a particular server.


This will show you Gateway status.

Checking its dev_rd trace file will also provide more information about its status: whether it has started correctly and is listening on port 33<nr>.
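
For a scripted version of the trace-file check, a small sketch: look for the bind message that the gateway writes to dev_rd, as seen in the trace posted earlier in this thread ("Bind service sapgw00 (socket) to port 3300"). The function name gateway_bound is made up, and the sketch assumes that dev_rd is recreated at each start; if it is not, a line left over from an earlier run would give a false positive.

```shell
# Return 0 if TRACE_FILE contains the message the gateway writes once it has
# bound SERVICE to a port, e.g. "Bind service sapgw00 (socket) to port 3300".
# The service name is required in the pattern because the trace also contains
# an earlier "Bind service  (socket) to port" line without name or port.
gateway_bound() {
    trace_file=$1
    service=$2
    grep -q "Bind service ${service} (socket) to port [0-9]" "$trace_file" 2>/dev/null
}

# Usage (the work directory path is an assumption based on the instance
# name DVEBMGS00 shown in this thread):
#   gateway_bound /usr/sap/XXX/DVEBMGS00/work/dev_rd sapgw00 && echo "gateway bound"
```

The wording of the trace message is taken from the kernel 745 trace in this thread and may differ on other kernel releases.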

Regards,

Raquel

Former Member

Hi,

Are you sure this is the last log you have? Sep 24 20:53:57 2016 seems too old.

What is the output of the following commands?
ps -ef | grep gwrd
gwmon pf=/usr/sap/XXX/SYS/profile/XXX_DVEBMGS00_hostname

sapcontrol -nr 00 -function GetProcessList

sapcontrol -nr 00 -function ParameterValue gw/acl_mode

BR, Sergo.

benoit-schmid
Contributor

Hello,


Sergo Beradze wrote:

Are you sure this is the last log you have? Sep 24 20:53:57 2016 seems too old.

What is the output of the following commands?
ps -ef | grep gwrd
gwmon pf=/usr/sap/XXX/SYS/profile/XXX_DVEBMGS00_hostname

sapcontrol -nr 00 -function GetProcessList

sapcontrol -nr 00 -function ParameterValue gw/acl_mode

Yes, I am sure this is the last log.

To convince yourself, please read my first post again.

You will notice that this corresponds to my startsap time.

To make sure that my problem is clear:

One hour after starting SAP, the gateway is up.

My problem occurs only 5 seconds after startsap has finished.

Is this clear?

Of course your commands work now...

% gwmon nr=00

Gateway monitor, connected to xxx / sapgw00

Connection table (Used: 4, Connected: 4)

...

% ps -ef | grep gwrd

xxxadm    1998  1971  0 Sep24 ?        00:00:11 gwrd -dp pf=/usr/sap/XXX/SYS/profile/XXX_DVEBMGS00_xxx

% sapcontrol -nr 00 -function GetProcessList

28.09.2016 12:15:53

GetProcessList

OK

name, description, dispstatus, textstatus, starttime, elapsedtime, pid

msg_server, MessageServer, GREEN, Running, 2016 09 24 20:53:53, 87:22:00, 1970

disp+work, Dispatcher, GREEN, Running, 2016 09 24 20:53:53, 87:22:00, 1971

igswd_mt, IGS Watchdog, GREEN, Running, 2016 09 24 20:53:53, 87:22:00, 1972

gwrd, Gateway, GREEN, Running, 2016 09 24 20:53:56, 87:21:57, 1998

icman, ICM, GREEN, Running, 2016 09 24 20:53:56, 87:21:57, 1999

% sapcontrol -nr 00 -function ParameterValue gw/acl_mode

28.09.2016 12:16:28

ParameterValue

OK

0

Regards,

Former Member

Now it is clear; it was not clear from your initial post.

Again, 'Instance on host xxx started' means nothing, just that some start commands were issued and some initial answer was given. You need some time for all processes to start (if everything is OK). You can try this on a test system somewhere:

Open two windows. In the first, navigate to the work folder and run "ls -ll" or something similar (you can use watch or any other command to refresh the listing automatically).
In the second, issue the startsap command and monitor how the contents of the folder change, looking at dev_disp, dev_w* and the other files; you will see when your dev_rd starts to be populated.
From the log, it looks like it takes a while to allocate some shared memory objects:
Sat Sep 24 20:53:58 2016 to Sat Sep 24 20:54:18 2016

GwDpInit: attached to gw_adm at 0x7fd69860fd20

Bind service sapgw00 (socket) to port 3300

P.S. I checked on my test system, and it is faster than yours:

Wed Sep 28 07:19:44 2016

DpShMCreate: sizeof(wp_adm)             545240  (2536)

xxx

Wed Sep 28 07:19:50 2016

GwDpInit: attached to gw_adm at 7f4ca5bcb360

Maybe it really is some problem in the kernel (even if I do not see any problem with waiting 10 seconds more). Try to update it on a test system and test again.

Message was edited by: Sergo Beradze

raquel_gomez
Employee

Hi,


You may check in the START profile (or INSTANCE profile, depending on your release) whether there is any command (such as a 'sleep' command) that may cause this delay in the Gateway start.

More information can also be found in the stderr<nr> trace file (<nr> depends on the order of the started programs).

Regards,

Raquel

benoit-schmid
Contributor

Hello,


Sergo Beradze wrote:

Again, 'Instance on host xxx started' means nothing, just that some start commands were issued and some initial answer was given. You need some time for all processes to start

Where is this information documented?

A process can crash after being started. But this is another issue.

Instance started means all processes started correctly.

Binding to the TCP port is part of process startup.

This is why I consider that instance started means that all related daemons have successfully bound all their listening ports.

But this is my point of view. I agree that I did not find SAP documentation that details this.

Regards,

benoit-schmid
Contributor

Hello,


Prithviraj Rajpurohit wrote:

Hi Benoit,

The SAP Note below describes a side effect on the gateway when running SAP on kernel 745 PL100:

https://launchpad.support.sap.com/#/notes/2328970

Thanks for the Note.

As I do not see any crash of the gateway, it does not seem to be relevant.

Regards,