
SM66 - Work Process Failures

Former Member
0 Kudos

Hi there,

I just noticed the following on transaction SM66:

The system is running fine. Looking in SM21, nothing seems to appear for the same process ID.

Is this something to worry about? How can I clear them down?

I went into SM50 and chose to restart the process after error, but it still stays highlighted in red.

Any ideas anyone?

Thanks

Andy

Accepted Solutions (1)


mervin_joseph
Explorer

Hi Andy,

Please check the note below, which lists the various reasons for a work process restart:

Note 101717 - Automatic restart of SAP R/3 work processes

If you are still seeing the red highlights (in the Failure column) in SM50, please follow the steps below to clear them:

Transaction SM50 >> List >> Reset >> Failures (after resetting, the red highlights will disappear)

This is a slightly late reply, but it may still help someone who experiences this issue in the future.

Best Regards,
Mervin Joseph

Former Member
0 Kudos

Many thanks Mervin - I shall take a look

Cheers

Andy

isaias_freitas
Advisor
0 Kudos

tcode SM50>> List >> Reset >> Failures (After resetting, the red highlights will disappear)

Although this will reset the counter and make the red light go away, it will not fix the root cause of the issue.

mervin_joseph
Explorer
0 Kudos

Yes, you are right, my friend: the steps above only clear/reset the red highlights in SM50. As I understand it, that was one of the questions Andy raised.

Okay, let's dig a little deeper now.

Hi Andy,

If you still get the red highlight in SM50 after resetting the failure status, please share some more information from the developer trace file (dev_wx).

I know it's not possible to attach the entire trace file since it's big, but here is what you can do:

- Make a note of the "failure reset time" (the time when you reset the failure list).

- Download the trace file to your local desktop (select the WP with failure status in SM50 >> Administration >> Trace >> Save as local file).

- Search for the word "error " in the entries logged after the failure reset time. Please take a screenshot of the error entries and paste it here, which will help us investigate the root cause of this issue.

- Also, please check whether you see any error logs in transactions SM21 and ST22 for the same time, and share them.
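
The search step above can be sketched in Python for a trace file that has been saved locally. This is only a rough sketch: it assumes that timestamps appear on their own lines in a layout like `M Thu Mar 20 02:15:31 2014`, which may not match every kernel release, and the function name `errors_after` is made up for illustration.

```python
import re
from datetime import datetime

# Assumed timestamp layout (hypothetical sample): "M Thu Mar 20 02:15:31 2014".
# Adjust the pattern if your trace files write timestamps differently.
TS_RE = re.compile(r"^\w\s+(\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s*$")


def errors_after(lines, reset_time, needle="error"):
    """Return (line_number, text) pairs for lines containing `needle`
    that follow a timestamp later than `reset_time`."""
    needle = needle.lower()
    current = None  # most recent timestamp seen so far
    hits = []
    for number, line in enumerate(lines, start=1):
        match = TS_RE.match(line)
        if match:
            # Collapse padding such as "Mar  5" before parsing.
            stamp = " ".join(match.group(1).split())
            current = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
            continue
        if current is not None and current > reset_time and needle in line.lower():
            hits.append((number, line.rstrip("\n")))
    return hits
```

For example, `errors_after(open("dev_w0", errors="replace"), datetime(2014, 3, 20, 2, 0))` would list the matching lines of a downloaded `dev_w0` file that were written after 02:00 on that day; the date here is only a placeholder for your own reset time.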

Attaching a sample screenshot for your reference from dev_wx file:

Awaiting your reply!

Best Regards,

Mervin Joseph

Former Member
0 Kudos

Hi Mervin,

Many thanks for your reply with this. I really do appreciate your help.

I have just reset the failure status. I will now wait until this time tomorrow, when I know there will have been more failures.

I will then upload some screen shots.

Again, many thanks for your help.

Cheers

Andy

isaias_freitas
Advisor
0 Kudos

Hello Andy,

You could also reset the trace files (SM50 -> Administration -> Trace -> Reset -> work processes).

The next time you see a failure, open the corresponding trace file and search for " pid   ".

Each failure means that the work process has restarted.

At each restart, the work process PID will change, and the work process will log the initial header at the trace again (that initial part that shows the system ID, kernel version, the PID, ...).

Once you locate the second occurrence of the PID (the first one would be the PID from before the new failure), copy the trace entries from around 100 lines up until around 100 lines down (from the line of the second PID) and attach it to this thread.

Note that the failure that occurred in one work process might not be the same as the failure that occurred in the others.
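
As a rough illustration of the extraction described above, the following Python sketch returns the lines around the second occurrence of the search string in a trace file that has been read into a list. The function name `window_around_nth` and its parameters are made up for this example, and the default marker `" pid "` is only an approximation of the search string mentioned above.

```python
def window_around_nth(lines, marker=" pid ", occurrence=2, radius=100):
    """Return the lines within `radius` lines of the `occurrence`-th line
    that contains `marker` (case-insensitive).

    Returns an empty list if the marker appears fewer than `occurrence` times.
    """
    marker = marker.lower()
    seen = 0
    for index, line in enumerate(lines):
        if marker in line.lower():
            seen += 1
            if seen == occurrence:
                start = max(0, index - radius)
                # Slicing past the end of the list is safe in Python.
                return lines[start:index + radius + 1]
    return []
```

With `lines = open("dev_w0", errors="replace").readlines()`, calling `window_around_nth(lines)` gives roughly 100 lines before and after the second header, which is a manageable amount to paste into a thread.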

Regards,

Isaías

Former Member
0 Kudos

Good man, cheers, will do, Isaías

Thanks

Andy

Former Member
0 Kudos

Hi Mervin, just checked and there are no failures as of yet.

Will check again in the morning

Andy

mervin_joseph
Explorer
0 Kudos

That sounds great!

I hope you won't get any more errors today.

Have a nice day!

Best Regards,

Mervin Joseph

Former Member
0 Kudos

None today either! I'll keep my eyes open and let you know as soon as they appear, Mervin.

Hopefully, things have sorted themselves out.

Cheers

Andy

Answers (5)


former_member188065
Participant
0 Kudos

What function / ABAP report does the reset in SM50 trigger?
I'd like to have a look at the source code.

SM50>> List >> Reset >> Failures 
(After resetting, the red highlights will disappear)
Former Member
0 Kudos

Hi Mervin,

Here we go, just checked SM66:

I then went into SM21, checked the log, and saw that the WPs have since started 55 times:

When I now drill down for that WP: 21231

So I then went into the trace file and searched for the WP 21231:

Now I think I know what is causing the error.

When you look at the first line it says 'ORACLE not available'

It just so happens that this particular time (and the time of the repeating errors) is when our offline backup is still running, so ORACLE will indeed be down.

Do you think this is the case? So I can just ignore them?

Many thanks

Andy

isaias_freitas
Advisor
0 Kudos

Hello Andy,

For sure that was the root cause in this case.

Thus, you could simply ignore these.

You could also stop SAP before starting the backup.

No one would be able to do anything at SAP if the database is not available anyway.

Regards,

Isaías

mervin_joseph
Explorer
0 Kudos

Hi Andy,

If you don't see any other error in system log at that point of time, then yes, this is the root cause for this issue.

This will not cause any inconsistency in your data, so you can ignore it.

Best Regards,

Mervin Joseph

Former Member
0 Kudos

Hi Guys,

Yep, it is definitely the offline backup causing these failures to appear in SM66. I just looked now, and they have started appearing again after the weekend's backup.

I know I can ignore these, but is there a job I can schedule to clear them down instead of having to do this manually?

I like things nice and tidy!

Many thanks for all your help it is much appreciated.

Andy

isaias_freitas
Advisor
0 Kudos

Hello,

I do not know of any way to clear those counters automatically.

Have you considered stopping SAP before the backup starts, and starting it again after the backup completes?

No one would be able to do anything at SAP, if the database is not available.
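
A minimal sketch of such a wrapper in Python, assuming the backup is started from a script that you control. The three commands are placeholders supplied by the caller; how SAP is stopped and started (for example with the `stopsap` and `startsap` scripts on Unix hosts) depends on the release and the local setup, so nothing is hard-coded here. The function name `backup_with_sap_stopped` is made up for this example.

```python
import subprocess


def backup_with_sap_stopped(stop_cmd, backup_cmd, start_cmd, run=subprocess.run):
    """Stop SAP, run the backup, then start SAP again.

    Each command is a list of arguments, as expected by subprocess.run.
    SAP is started again even if the backup fails, so that a failed backup
    does not leave the system down. Returns True if the backup succeeded.
    """
    run(stop_cmd, check=True)  # raises if SAP cannot be stopped; nothing else runs
    try:
        result = run(backup_cmd, check=False)
        return result.returncode == 0
    finally:
        run(start_cmd, check=True)  # always attempt to bring SAP back up
```

Starting SAP again after a failed backup is a design choice in this sketch; if a failed offline backup can leave the database down in your environment, you may prefer to stop and alert instead.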

Regards,

Isaías

Former Member
0 Kudos

If possible, restart the application when the system load is low.

Hope this fixes the issue.

Regards

Former Member
0 Kudos

Hi Andy,

If you are seeing no errors in the system log or WP logs, then you may find that the WPs are set to restart periodically. There is a profile parameter that controls this.

Regards,

Graham

Former Member
0 Kudos

Hi Graham,

Thanks for the reply. Would you happen to know the profile parameter so I can check this out?

Thanks Andy

Former Member
0 Kudos

Hi Andy,

Try this note: http://service.sap.com/sap/support/notes/1709928

Regards,

Graham

Former Member
0 Kudos

Good man, thanks Graham

Former Member
0 Kudos

Hey Graham,

I can see where you're coming from with your initial remark and probably by the screen shot with the number of failures across all WP being almost identical.

That said, the WP autorestart wouldn't (shouldn't) show up in the error count of a WP. You'd just see in the dev trace that the autorestart time had been reached and the WP would restart.

If you have a test system, check it out by setting it to a silly value like 60s.

Cheers,

Amerjit

isaias_freitas
Advisor
0 Kudos

The profile parameter for auto-restart (rdisp/wp_auto_restart) does not increase that counter.

Those are unexpected restarts (e.g., crashes?).

JPReyes
Active Contributor
0 Kudos

I agree with Isaías: you need to check the developer traces of your work processes and find out what is making the dialog processes crash.

Regards, JP

Former Member
0 Kudos

Hi Andy,

The count tells you the number of times the work process has restarted since the system restart.

In your case, I see the main reason as heap memory. It is set low.

A work process restarts itself once it reaches the heaplimit value or the heap configured for the work process.

Parameters:

abap/heaplimit

abap/heap_area_dia

abap/heap_area_nondia

From your ST02 screenshot, I can see that the value set for heap is very small.

Regards,

Prithviraj.

isaias_freitas
Advisor
0 Kudos

Hello,

abap/heaplimit should not be changed.

It has to be a low value (40MB - 60MB).


This is not a memory allocation limit.

Read the parameter documentation through the transaction RZ11 for more details.


Regards,

Isaías

Former Member
0 Kudos

Hi Andy,

Your SM66 screen shot is telling you that the <nn> WP have restarted <nn> times.

ie: The WP with PID 29906 has had 181 failures (restarts) but is currently in status WAITING (waiting for work).

What you need to check is the developer trace associated with the WP (SM50) and see from within the dev trace what errors are being logged.

Hope this helps you a wee bit.

KR,

Amerjit

Former Member
0 Kudos

Thanks for replying Amerjit, very useful info.

Cheers

Former Member
0 Kudos

Andy,

You're welcome. Could you do two things for us?

1. Post a screen shot of tx ST02.

2. Upload one of the developer trace files. eg: dev_w0

Could you also check tx ST22 to see if you're getting any short dumps.

Cheers,

Amerjit

Former Member
0 Kudos

Hi Amerjit,

Here we go:

There are no short dumps in ST22

Former Member
0 Kudos

trace file

Former Member
0 Kudos

Hey Andy,

Didn't see any attachments to this reply. Can you upload dev_w0 from the work directory so that whoever is interested can have a look and try and help you out.

Cheers,

Amerjit