I am receiving a runtime error when I try to read data from the application server (the data is loaded onto the application server from an FTP server). The issue is with the character ' ¦ '. It looks like the system is not able to convert this character from one code page to another.
I am getting this issue in both the Development and the Quality systems. Both are Unicode systems.
I have gone through the existing threads on this and found two solutions: one is adding ENCODING UTF-8 to the OPEN DATASET statement, and the other is adding IGNORING CONVERSION ERRORS. I tried both options and the issue still exists.
Runtime error text:
Characters are always displayed in only a certain codepage. Many
codepages only define a limited set of characters. If a text from a
codepage should be converted into another codepage, and if this text
contains characters that are not defined in one of the two codepages, a
conversion error occurs.
Moreover, a conversion error can occur if one of the needed codepages
'4110' or '4103' is not known to the system.
Here it is:
OPEN DATASET g_file_path FOR INPUT IN TEXT MODE ENCODING UTF-8.
For ignoring conversion errors:
OPEN DATASET g_file_path FOR INPUT IN TEXT MODE ENCODING DEFAULT IGNORING CONVERSION ERRORS.
Just to add:
1. The pipe symbol appears as ' # ' on the application server, which is the main problem. So when we read the file, the system is not able to recognize it. If it had appeared as a pipe symbol, we would not face this issue.
Edited by: Venkata Anjaneya Karthick Kolisetty on Dec 20, 2010 10:28 PM
Let's first name the code pages you've given: 4110 is UTF-8 and 4103 is UTF-16LE. A conversion between Unicode code pages should not be an issue, but converting from a Unicode code page to a non-Unicode code page is likely to run into conversion issues because of characters missing in the non-Unicode target code page.
So let's check the real problem: it sounds like you're reading a file that's not UTF-8 encoded, even though you open it as UTF-8. You mention that the character ¦ is causing problems. Assuming that you actually copied the character, you can already see that it's a bit unusual: it's not the normal pipe symbol | (which is a solid line, whereas your character is a broken bar). Taking the hex code of the character from your posting, I get A6 (decimal 166, binary 10100110). In UTF-8, only ASCII characters (i.e. 7-bit values) are encoded as a single byte.
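The bit pattern argument can be checked quickly outside of ABAP. Here is a minimal Python sketch; the helper name utf8_byte_role is made up for illustration, and the classification is simplified (it does not flag lead bytes such as C0 or C1, which never occur in valid UTF-8).

```python
def utf8_byte_role(byte: int) -> str:
    """Classify one byte by its leading bits, following the UTF-8 layout.

    Simplified: lead bytes that can never occur in valid UTF-8
    (for example 0xC0 and 0xC1) are not singled out here.
    """
    if byte < 0x80:
        return "ASCII (single-byte character)"   # 0xxxxxxx
    if byte < 0xC0:
        return "continuation byte"               # 10xxxxxx
    if byte < 0xE0:
        return "lead byte of a 2-byte sequence"  # 110xxxxx
    if byte < 0xF0:
        return "lead byte of a 3-byte sequence"  # 1110xxxx
    if byte < 0xF8:
        return "lead byte of a 4-byte sequence"  # 11110xxx
    return "not valid in UTF-8"


# 0xA6 is the byte taken from the posting, 0x7C is the normal pipe symbol |
for value in (0xA6, 0x7C):
    print(f"{value:02X}", format(value, "08b"), utf8_byte_role(value))
```

A byte that starts with the bits 10 can never stand on its own in UTF-8, which is why a lone A6 makes the conversion fail.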
Now if you check the UTF-8 table you'll see that a byte with hex value A6 is a continuation byte, i.e. it has to be part of a longer multibyte sequence. So if your file errors out on this specific character, it's probably simply because you don't have a UTF-8 file but a file in some other encoding (often it's latin1, where A6 is exactly the character ¦). So your first step should be to figure out which encoding the file you're reading actually uses. Once you know it, conversion from that code page to UTF-8 or one of the UTF-16 flavors is no challenge.
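To narrow down the encoding, one option is to take a few raw bytes from the file and try decoding them with the likely candidates. This is a rough Python sketch, not a reliable detector: the sample line and the helper try_decode are assumptions for illustration, and the candidate list is only a guess at what an FTP source might deliver.

```python
# Hypothetical sample line: two fields separated by the raw byte A6
raw = b"field1\xa6field2"


def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


for candidate in ("utf-8", "latin-1", "cp1252"):
    print(candidate, "->", try_decode(raw, candidate))

# For comparison: a file that really is UTF-8 stores the broken bar as two bytes
print("\u00a6".encode("utf-8").hex())
```

Keep in mind that latin-1 assigns a character to every byte value, so a successful latin-1 decode only shows that the result looks plausible, not that the guess is right. The failed UTF-8 decode is the conclusive part.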
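By the way, one easy check for invalid UTF-8 byte sequences is to load the file into a web browser. E.g. in your case you could try copying the following data into a file and opening it in your favorite web browser. Make sure the file is saved in a single-byte encoding so that the character is stored as the byte A6; an editor that saves as UTF-8 would write the valid sequence C2 A6 instead. With the raw A6 byte the load will fail.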
<?xml version="1.0" encoding="UTF-8"?> <data>¦</data>
After testing that, change the encoding from UTF-8 to latin1 and you'll see that you can load the page into the web browser...
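If you'd rather not depend on how an editor saves the file, the same experiment can be done in a few lines of Python with the raw byte A6 written explicitly. This is only a sketch: the helper name parses is made up, the standard library XML parser stands in for the browser, and ISO-8859-1 is used as the formal name of latin1.

```python
import xml.etree.ElementTree as ET


def parses(declared_encoding: str) -> bool:
    """Build the test document with a raw A6 byte and report whether it parses."""
    document = (
        b'<?xml version="1.0" encoding="'
        + declared_encoding.encode("ascii")
        + b'"?><data>\xa6</data>'
    )
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


print("declared as UTF-8:     ", parses("UTF-8"))
print("declared as ISO-8859-1:", parses("ISO-8859-1"))
```

The first call is rejected because a lone A6 is not a complete UTF-8 sequence, while the second one is accepted because A6 is a regular character in that code page.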
Try the following:
CALL FUNCTION 'NLS_GET_FRONTEND_CP'
  EXPORTING
    langu                 = sy-langu
    fetype                = 'MS'
  IMPORTING
    frontend_codepage     = lv_codepage
  EXCEPTIONS
    illegal_syst_codepage = 1
    no_frontend_cp_found  = 2
    internal_or_db_error  = 3
    OTHERS                = 4.
OPEN DATASET lv_filename FOR INPUT IN LEGACY TEXT MODE CODE PAGE lv_codepage.
This should resolve your problem.