Solved: What is fragment view in unicode concepts

Former Member · ‎05-31-2007

What is fragment view in unicode concepts

Former Member · ‎05-31-2007

Hi,

Go through the following document:

1. Codes

In the past, SAP developers used various codes to encode characters of different alphabets, for example ASCII, EBCDIC, or double-byte code pages.

ASCII (American Standard Code for Information Interchange) encodes each character using 1 byte = 8 bits. This makes it possible to represent a maximum of 2^8 = 256 characters to which the combinations are assigned. Common code pages are, for example, ISO 8859-1 for West European or ISO 8859-5 for Cyrillic fonts.

EBCDIC (Extended Binary Coded Decimal Interchange Code) also uses 1 byte to encode each character, which again makes it possible to represent 256 characters. EBCDIC 0697/0500 is an old IBM format that is used on AS/400 machines for West European fonts, for example.

Double-byte code pages require 1 or 2 bytes for each character. This allows you to form 2^16 = 65536 combinations, of which usually only 10,000 - 15,000 characters are used. Double-byte code pages are, for example, SJIS for Japanese and BIG5 for traditional Chinese.

Using these character sets, you can account for each language relevant to the SAP System. However, problems occur if you want to merge texts from different incompatible character sets in a central system. Equally, exchanging data between systems with incompatible character sets can lead to unpredictable results.

One solution to this problem is to use a code comprising all characters used on earth. This code is called Unicode (ISO/IEC 10646) and consists of at least 16 bits = 2 bytes, alternatively of 32 bits = 4 bytes per character. Although the conversion effort for the SAP kernel and applications is considerable, the migration to Unicode provides great benefits in the long run:

The Internet (www) and consequently also mySAP.com are entirely based on Unicode, which thus is a basic requirement for international competitiveness.
Unicode allows all SAP users to install a central system that covers all business processes worldwide.
Companies using different distributed systems frequently want to aggregate their worldwide corporate data. Without Unicode, they would be able to do this only to a limited degree.
With Unicode, you can use multiple languages simultaneously at a single frontend computer.
Unicode is required for cross-application data exchange without loss of data due to incompatible character sets. One way to present documents in the World Wide Web (www) is XML, for example.

ABAP programs must be modified wherever an explicit or implicit assumption is made with regard to the internal length of a character. As a result, a new level of abstraction is reached which makes it possible to run one and the same program both in conventional and in Unicode systems. In addition, if new characters are added to the Unicode character set, SAP can decide whether to represent these characters internally using 2 or 4 bytes. The examples presented in the following sections are based on a Unicode encoding using 2 bytes per character.

2. ABAP Development Under Unicode

A Unicode-enabled ABAP program (UP) is a program in which all Unicode checks are effective. Such a program returns the same results in a non-Unicode system (NUS) as in a Unicode system (US). In order to perform the relevant syntax checks, you must activate the Unicode flag in the screens of the program and class attributes.

In a US, you can only execute programs for which the Unicode flag is set. In future, the Unicode flag must be set for all SAP programs to enable them to run on a US. If the Unicode flag is set for a program, the syntax is checked and the program executed according to the rules described in this document, regardless of whether the system is a Unicode or a non-Unicode system. From now on, the Unicode flag must be set for all new programs and classes that are created.

If the Unicode flag is not set, a program can only be executed in an NUS. The syntactical and semantic changes described below do not apply to such programs. However, you can use all language extensions that have been introduced in the process of the conversion to Unicode.

As a result of the modifications and restrictions associated with the Unicode flag, programs are executed in both Unicode and non-Unicode systems with the same semantics to a large degree. In rare cases, however, differences may occur. Programs that are designed to run on both systems therefore need to be tested on both platforms.

Additionally, as part of the introduction of Unicode, the following changes to the syntax check have been tied to the Unicode flag:

1. In Unicode programs, unreachable statements now cause a syntax error. In non-Unicode programs, this previously only caused a syntax warning.
2. In Unicode programs, calling a function module whose parameter names are specified statically as a literal or constant raises an exception that can be handled if an incorrect parameter name is specified. This only applies to function modules that are not called via Remote Function Call. In non-Unicode programs, an incorrect name was previously ignored.

We recommend the following procedure to make your programs US-compliant:

The UNICODE task in transaction SAMT first performs an NUS and then a US syntax check for a selected program set. For an overview of the syntax errors by systems, programs and authors, consult the documentation in SAPNet. Alternatively, you can start the ABAP program RSUNISCAN_FINAL to determine the Unicode-relevant syntax errors for a single program.

Before you can set the Unicode flag in the NUS in the attributes of the program concerned, all syntax errors must be removed. Having enabled the Unicode flag in the NUS, you can run the syntax check for this program. To display a maximum of 50 syntax errors simultaneously, choose Utilities -> Settings -> Editor in the ABAP Editor and select the corresponding checkbox.

Once all syntactical requirements are met in the NUS, you must test the program both in the NUS and US. The purpose of this test is to recognize any runtime errors and make sure that the results are correct in both systems. To rule out runtime errors in advance, you should always type field symbols and parameters so that any potential problems can be detected during the syntax check.

(Screenshot: "ABAP: Change Program Attributes" dialog with the "Unicode Check Active" checkbox.)

3. Concepts and Conventions

3.1 Data Types

The data types that can be interpreted as character-type in a UP include:

C: Character (letters, numbers, special characters)
N: Numeric character (numbers)
D: Date
T: Time
STRING: Character string
Character-type structures: Structures which either directly or in substructures contain only fields of types C, N, D or T.

In an NUS, a character of this type has a length of 1 byte, and in a US a length corresponding to the length of one character on the relevant platform. The data type W is no longer supported. Variables of the types X and XSTRING are called byte-type.

The main characteristics of the different kinds of structures are:

Flat structures contain only fields of the elementary types C, N, D, T, F, I, P, and X, or structures containing these types.
Deep structures contain strings, internal tables and field or object references in addition to the elementary types.
Nested structures are structures that contain substructures as components.
Non-nested structures are structures that do not contain any substructures.

3.2 Data Layout of Structures

For several data types, such as I and F or object references, certain alignment requirements are in place that depend on the platform used. Fields of these types must begin in memory at an address divisible by 4 or 8. Character-type fields must begin at a memory address divisible by 2 or 4, depending on their Unicode representation. Within structures, bytes can be inserted before or after components with alignment requirements to achieve the necessary alignment. These bytes are referred to as alignment gaps (A). A substructure is aligned according to the field with the biggest alignment requirement; a contained substructure counts as a field here. Includes in structures are treated as substructures.

In the sample structure below, which contains three fields, no alignment gaps are created in an NUS or US.

BEGIN OF struc1,
  a(1) TYPE X,
  b(1) TYPE X,
  c(6) TYPE C,
END OF struc1.

In the next example, however, alignment gaps are created in a US but not in an NUS. The first alignment gap is created because of the alignment of structure struc3, the second because of the alignment of C field c, and the third because of the addressing of integer d.

BEGIN OF struc2,
  a(1) TYPE X,
  BEGIN OF struc3,
    b(1) TYPE X,
    c(6) TYPE C,
  END OF struc3,
  d TYPE I,
END OF struc2.

NUS: a b c d
US:  a A b A c A d
(struc3 is the substructure containing b and c)

3.3 Unicode Fragment View

The data layout of structures is relevant to the UP checks, for example when checking whether assignments and comparisons are permitted. This data layout is represented in the Unicode fragment view. The fragment view breaks down the structure into alignment gaps, byte-type and character-type areas, and all other types such as P, I, F, strings, references or internal tables.

Adjacent character-type components of a structure (except strings) are internally combined into a group if no alignment gaps exist between these components. All possible alignment requirements for characters are considered. Adjacent byte-type components are grouped together in the same way.

BEGIN OF struc,
  a(2) TYPE C,
  b(4) TYPE N,
  c TYPE D,
  d TYPE T,
  e TYPE F,
  f(2) TYPE X,
  g(4) TYPE X,
  h(8) TYPE C,
  i(8) TYPE C,
END OF struc.

In the following diagram, F1 - F6 show the individual fragments of structure struc:

a  b  c  d | A  | e  | f  g | A  | h  i
    F1     | F2 | F3 |  F4  | F5 |  F6

Here F1 groups the character-type fields a to d, F2 is an alignment gap, F3 is the F field e, F4 groups the byte-type fields f and g, F5 is another alignment gap, and F6 groups the character-type fields h and i.

3.4 Permitted Characters

In a US, all ABAP program sources are also stored as Unicode. As in ABAP Objects, you may only use the following characters as identifiers in programs for which the Unicode flag is set:

1. Letters a - z and A - Z without the German 'umlauts'
2. Numbers 0 - 9
3. The underscore _

For compatibility reasons, the characters %, $, ?, -, #, *, and / are also still permitted, but they should only be used for good reason in exceptional cases. Note that the slash can only be used to separate namespaces in the form /name/. There must be at least three characters between two slashes.

To ensure that programs can be transported from a US to an NUS without any loss of information during conversion, you should not use characters in comments and literals, even in a US, that cannot be represented in an NUS.

4. Restrictions in Unicode Programs

The adjustments you have to make and the restrictions that apply in the Unicode context have been limited to the essentials on the ABAP development side to keep the conversion effort for ABAP users to a minimum. In some cases, however, this has led to more complex rules, for example with regard to assignments and comparisons between incompatible structures.

4.1 Character and Numeric Type Operands

Up to now, you have been able to use flat structures as arguments of ABAP statements wherever single fields of type C were expected. In a UP this is no longer generally permitted. In a UP, you can use a structured field in a statement expecting a single field only if this structured field consists of character-type elementary types or purely character-type substructures. The structure is then treated like a single field of type C.

The main restrictions applying to a UP in contrast to a non-Unicode program (NUP) result from the fact that flat structures are only considered character-type on a limited basis, and fields of type X or STRING are never considered character-type. In addition, flat structures are only considered numeric-type if they are purely character-type. Numeric-type arguments include, for example, offset or index specifications as in READ TABLE ... INDEX i.

The following examples show a structure that is character-type and a structure that is not.

Character-type:

BEGIN OF struc1,
  a(2) TYPE C,
  n(6) TYPE N,
  d TYPE D,
  t TYPE T,
END OF struc1.

Not character-type:

BEGIN OF struc2,
  a(2) TYPE C,
  b(2) TYPE C,
  x(1) TYPE X,
  i TYPE I,
END OF struc2.
Another example is a control break in an internal table, triggered by the AT keyword. In an NUS, fields of type X to the right of the control key are treated as character-type, and are thus filled with an asterisk. In Unicode systems, by contrast, fields of this type are filled with their initial value.
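
The "purely character-type" test from section 4.1 can be sketched in a few lines. This is a simplified Python model, not ABAP; the class names Field and Struct and the function is_purely_character_type are hypothetical and only mirror the rule that a structure qualifies when all of its fields, including those of substructures, are of type C, N, D or T.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Elementary types the document lists as character-type in a Unicode program.
CHAR_TYPES = {"C", "N", "D", "T"}


@dataclass
class Field:
    name: str
    abap_type: str  # one-letter ABAP type such as "C", "N", "X" or "I"


@dataclass
class Struct:
    name: str
    components: List[Union[Field, "Struct"]] = field(default_factory=list)


def is_purely_character_type(struct: Struct) -> bool:
    """True if every field, including fields of substructures, is C, N, D or T."""
    for comp in struct.components:
        if isinstance(comp, Struct):
            if not is_purely_character_type(comp):
                return False
        elif comp.abap_type not in CHAR_TYPES:
            return False
    return True


# The two example structures from section 4.1 (lengths omitted, they do not matter here).
char_struc = Struct("struc1", [Field("a", "C"), Field("n", "N"),
                               Field("d", "D"), Field("t", "T")])
mixed_struc = Struct("struc2", [Field("a", "C"), Field("b", "C"),
                                Field("x", "X"), Field("i", "I")])

print(is_purely_character_type(char_struc))   # True
print(is_purely_character_type(mixed_struc))  # False
```

Only a structure for which this check succeeds may be used in a UP where a single C field is expected.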

4.2 Access Using Offset and Length Specifications

Offset and length specifications are generally critical since the length of each character is platform-dependent. As a result, it is initially unclear whether the byte unit or the character unit is referred to in mixed structures. This made some considerable restrictions necessary. However, access using offset or length specifications is still possible to the degree described in the following. The tasks subject to this rule include accessing single fields and structures, passing parameters to subroutines and working with field symbols.

Single field access

Offset-based or length-based access is supported for character-type single fields, strings and single fields of types X and XSTRING. For character-type fields and fields of type STRING, offset and length are interpreted in characters. Only for types X and XSTRING are the values for offset and length interpreted in bytes.

Structure access

Offset-based or length-based access to structured fields is a programming technique that should be avoided. This access type results in errors if both character-type and non-character-type components exist in the area identified by offset and length.

Offset-based or length-based access to structures is only permitted in a UP if the structures are flat and the offset/length specification includes only character-type fields from the beginning of the structure. The example below shows a structure with character-type and non-character-type fields. Its definition in the ABAP program and the resulting layout in main memory are as follows:

BEGIN OF STRUC,
  a(3) TYPE C,   "Length 3 characters
  b(4) TYPE N,   "Length 4 characters
  c TYPE D,      "Length 8 characters
  d TYPE T,      "Length 6 characters
  e TYPE F,      "Length 8 bytes
  f(28) TYPE C,  "Length 28 characters
  g(2) TYPE X,   "Length 2 bytes
END OF STRUC.

a  b  c  d | A  | e  | f  | g
    F1     | F2 | F3 | F4 | F5

Internally, the fragment view contains five fragments (F1 to F5).
Offset-based or length-based access in this case is only possible in the initial part F1. Statements like struc(21) or struc+7(14) are accepted by the ABAP interpreter and treated like a single field of type C. By contrast, struc+57(2) access is now only allowed in an NUP. If offset-based or length-based access to a structure is permitted, both the offset and length specifications are generally interpreted as characters in a UP.
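
To make the fragment view more concrete, here is a minimal Python sketch (not ABAP) that groups the components of a flat structure into fragments and then checks an offset/length access against the initial character-type fragment F1. The names fragment_view and offset_access_allowed are hypothetical. One loud assumption: alignment gaps are passed in explicitly as components of type "A" with length 0, exactly where the diagrams in this document show them, because the real gap positions depend on the platform and are not computed here.

```python
from typing import List, Tuple

CHAR_TYPES = {"C", "N", "D", "T"}  # character-type, length counted in characters
BYTE_TYPES = {"X"}                 # byte-type, length counted in bytes
GAP = "A"                          # explicit alignment-gap marker (assumption of this sketch)

Component = Tuple[str, str, int]   # (name, ABAP type, length)
Fragment = Tuple[str, int]         # (kind, length)


def fragment_view(components: List[Component]) -> List[Fragment]:
    """Group adjacent character-type and adjacent byte-type fields into fragments."""
    fragments: List[Fragment] = []
    for _name, abap_type, length in components:
        if abap_type in CHAR_TYPES:
            kind = "char"
        elif abap_type in BYTE_TYPES:
            kind = "byte"
        elif abap_type == GAP:
            kind = "gap"
        else:
            kind = abap_type  # I, F, P, ... each form a fragment of their own
        if kind in ("char", "byte") and fragments and fragments[-1][0] == kind:
            fragments[-1] = (kind, fragments[-1][1] + length)  # extend the group
        else:
            fragments.append((kind, length))
    return fragments


def offset_access_allowed(components: List[Component], off: int, length: int) -> bool:
    """Allow struc+off(length) only inside the initial character-type fragment."""
    fragments = fragment_view(components)
    if not fragments or fragments[0][0] != "char":
        return False
    return off >= 0 and length > 0 and off + length <= fragments[0][1]


# Structure from section 4.2, with the gap placed as in its diagram.
STRUC_4_2 = [("a", "C", 3), ("b", "N", 4), ("c", "D", 8), ("d", "T", 6),
             ("gap", GAP, 0), ("e", "F", 8), ("f", "C", 28), ("g", "X", 2)]

# Structure from section 3.3, with the gaps placed as in its diagram.
STRUC_3_3 = [("a", "C", 2), ("b", "N", 4), ("c", "D", 8), ("d", "T", 6),
             ("gap1", GAP, 0), ("e", "F", 8), ("f", "X", 2), ("g", "X", 4),
             ("gap2", GAP, 0), ("h", "C", 8), ("i", "C", 8)]

print(fragment_view(STRUC_4_2))
print(len(fragment_view(STRUC_3_3)))             # 6 fragments, F1 to F6
print(offset_access_allowed(STRUC_4_2, 0, 21))   # struc(21)    -> True
print(offset_access_allowed(STRUC_4_2, 7, 14))   # struc+7(14)  -> True
print(offset_access_allowed(STRUC_4_2, 57, 2))   # struc+57(2)  -> False
```

The grouping step is the essential idea: neighbouring C, N, D and T fields collapse into one character fragment, neighbouring X fields into one byte fragment, and everything else stays separate.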

Passing parameters to subroutines

Up to now, parameter passing with PERFORM has allowed you to use cross-field offset and length specifications. In future, this will no longer be allowed in a UP. In a UP, offset-based and length-based access beyond field boundaries returns a syntax or runtime error. For example, access types c+15 or c+5(10) would trigger such an error for a C field c of length 10.

If only an offset but no length is specified for a parameter, the entire length of the field instead of the remaining length was previously used for access. As a result, parameter specifications are cross-field if you use only an offset, and therefore trigger a syntax error in a UP. PERFORM test USING c+5 is consequently not permitted. In a UP, you can instead specify the remaining length starting from the offset off for parameters using the form field+off(*).

Ranges for offset-based and length-based access when using field symbols

A UP ensures that offset-based or length-based access with ASSIGN is only permitted within a predefined range. Normally, this range corresponds to the field boundaries in case of elementary fields or, in case of flat structures, to the purely character-type initial part. Using a special RANGE addition for ASSIGN, you can expand the range beyond these boundaries.

Field symbols are assigned a range allowed for offset/length specifications. If the source of an ASSIGN statement is specified using a field symbol, the target field symbol adopts the range of the source. If not explicitly specified otherwise, the range is determined as follows:

ASSIGN field TO <f>.
ASSIGN elfield+off(len) TO <f>.
In a UP, the field boundaries of the elementary field elfield are assigned to <f> as the range.

ASSIGN <elfield>+off(len) TO <f>.
ASSIGN struc+off(len) TO <f>.
ASSIGN <struc>+off(len) TO <f>.
In a UP, the purely character-type initial part of the flat structure determines the range boundaries.
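
The boundary rule for a single elementary field can be sketched as follows. This is a small Python model under stated assumptions, not ABAP: the function resolve_access is hypothetical, a missing length is modelled as None (meaning "entire field length", as in the old behaviour described above), "*" stands for the remaining length, and a return value of None stands for "not permitted", without distinguishing between a syntax error, a runtime error, or an unassigned field symbol.

```python
from typing import Optional, Tuple, Union


def resolve_access(field_len: int, off: int,
                   length: Union[int, str, None]) -> Optional[Tuple[int, int]]:
    """Return (offset, length) if the access stays inside the field, else None."""
    if length == "*":
        length = field_len - off   # field+off(*): remaining length from the offset
    elif length is None:
        length = field_len         # field+off: entire field length is used
    if off < 0 or length <= 0 or off + length > field_len:
        return None                # access would cross the field boundary
    return (off, length)


# A C field c of length 10, as in the examples above.
print(resolve_access(10, 15, None))  # c+15     -> None
print(resolve_access(10, 5, 10))     # c+5(10)  -> None
print(resolve_access(10, 5, None))   # c+5      -> None (cross-field)
print(resolve_access(10, 5, "*"))    # c+5(*)   -> (5, 5)
```

The third call shows why an offset without a length is problematic: the full field length is applied from the offset, so any offset other than 0 leaves the field.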
If the assignment to the field symbol is not possible because the offset or length specification exceeds the permitted range, the field symbol is set to UNASSIGNED in a UP. Other checks such as type or alignment checks return a runtime error in a UP. As a rule, offset and length specifications are counted in characters for data types C, N, D, and T as well as for flat structures, and in bytes in all other cases.

Offset without length specification when using field symbols

Up to now, ASSIGN field+off TO <f> has shown the special behavior that the field length instead of the remaining length of field was used if only an offset but no length was specified. Since an ASSIGN with a cross-field offset is therefore problematic under Unicode, you must observe the following rules:

1. Using ASSIGN field+off(*) ... you can explicitly specify the remaining length.
2. ASSIGN <f>+off TO <g> is only permitted if the runtime type of <f> is flat and elementary, that is, C, N, D, T (offset in characters) or X (offset in bytes).
3. ASSIGN field+off TO ... , as is the case in a loop, for example.

4.3 Assignments

This section deals with implicit and explicit type conversions using the equal sign (=) or the MOVE statement. Two fields can be converted if the content of one field can be assigned to the other field without triggering a runtime error.

For conversions between structured fields or a structured field and a single field, flat structures were previously treated like C fields. With the implementation of Unicode, this approach has become too error-prone since it is not clear whether programs can be executed with platform-independent semantics.

Two fields are compatible if they have the same type and length. If deep structures are assigned, the fragment views must therefore be identical.

One requirement in connection with the assignment and comparison of deep structures has been that type compatibility must exist between the operands, which requires both operands to have the same structure. This requirement continues to apply in Unicode systems.

Conversion between flat structures

To check whether conversion is permitted at all, the Unicode fragment view of the structures is set up first by combining character-type and byte-type groups and alignment gaps as well as any other components.
If the fragments of both structures are identical in type and length over the length of the shorter structure, conversion is permitted. Assignment is allowed if the following conditions are met:

1. The fragments of both structures up to the second-last fragment of the shorter structure are identical.
2. The last fragment of the shorter structure is a character-type or byte-type group.
3. The corresponding fragment of the longer structure is a character-type or byte-type group with a greater length.

If the target structure is longer than the source structure, the character-type components of the remaining length are filled with blank characters. All other components of the remaining length are filled with the type-adequate initial value, and alignment gaps are filled with zero bytes. Since longer structures were previously filled with blanks by default, using initial values for non-character-type components is an incompatible change. This incompatible change is, however, rather an error correction. For reasons of compatibility, character-type components are not filled with initial values.

BEGIN OF struc1,
  a(1) TYPE C,
  x(1) TYPE X,
END OF struc1.

BEGIN OF struc2,
  a(1) TYPE C,
  b(1) TYPE C,
END OF struc2.

The assignment struc1 = struc2 is not allowed under Unicode since struc1-x, in contrast to struc2-b, occupies only one byte.

BEGIN OF struc3,
  a(2) TYPE C,
  n(6) TYPE N,
  i TYPE I,
END OF struc3.

BEGIN OF struc4,
  a(8) TYPE C,
  i TYPE I,
  f TYPE F,
END OF struc4.

The assignment struc3 = struc4 is allowed since the fragment views of the character-type fields and the integer are identical.

BEGIN OF struc5,
  a(1) TYPE X,
  b(1) TYPE X,
  c(1) TYPE C,
END OF struc5.

BEGIN OF struc6,
  a(1) TYPE X,
  BEGIN OF struc0,
    b(1) TYPE X,
    c(1) TYPE C,
  END OF struc0,
END OF struc6.

struc5 = struc6 is again not permitted since the fragment views of both structures are not identical due to the alignment gaps before struc0 and struc0-c.

BEGIN OF struc7,
  p(8) TYPE P,
  c(1) TYPE C,
END OF struc7.

BEGIN OF struc8,
  p(8) TYPE P,
  c(5) TYPE C,
  o(8) TYPE P,
END OF struc8.

The assignment struc7 = struc8 works since the Unicode fragment views are identical with regard to the length of structure struc7.

For deep structures, the operand types must be compatible as usual. As an enhancement, convertibility was slightly generalized for object references and table components.

Conversion between structures and single fields

The following rules apply for converting a structure into a single field and vice versa:

1. If a structure is purely character-type, it is treated like a C field during conversion.
2. If the single field is of type C, but only part of the structure is character-type, conversion is only possible if the structure begins with a character-type group and if this group is at least as long as the single field. Conversion then takes place between the first character-type group of the structure and the single field. If the structure is the target field, the character-type sections of the remainder are filled with blanks, and all other components are filled with the type-adequate initial value.
3. Conversion is not permitted if the structure is not purely character-type and if the single field is not of type C.

As with the assignment between structures, filling non-character-type components with the initial value is an incompatible change.

Conversion between internal tables

Tables can be converted if their row types are convertible. The restrictions described above therefore also affect the conversion of tables.

Implicit conversions

The above rules also apply to all ABAP statements that use implicit conversions according to the MOVE semantics. For example, this is true for the following statements for internal tables:

APPEND wa TO itab.
APPEND LINES OF itab1 TO itab2.
INSERT wa INTO itab.
INSERT LINES OF itab1 INTO itab2.
MODIFY itab FROM wa.
MODIFY itab ... TRANSPORTING ... WHERE ... ki = vi ...
READ TABLE itab ... INTO wa.
READ TABLE itab ... WITH KEY ... ki = vi ...
LOOP AT itab INTO wa.
LOOP AT itab ... WHERE ... ki = vi ...

The restrictions for explicit conversion also apply to the implicit conversion of VALUE specifications.
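
The convertibility rule for flat structures in section 4.3 can be sketched as a comparison of two fragment lists. This is a simplified Python model, not ABAP, and one possible reading of the rule: two structures are treated as convertible if their fragments are identical over the length of the shorter one, or if conditions 1 to 3 hold. The function name convertible, the (kind, length) tuples and the nominal lengths used for I, F and P are assumptions of this sketch; alignment gaps are left out of the example data.

```python
from typing import List, Tuple

Fragment = Tuple[str, int]  # (kind, length), e.g. ("char", 8) or ("I", 4)


def convertible(view_a: List[Fragment], view_b: List[Fragment]) -> bool:
    """Check two Unicode fragment views against the rule in section 4.3."""
    if not view_a or not view_b:
        return False
    shorter, longer = (view_a, view_b) if len(view_a) <= len(view_b) else (view_b, view_a)
    n = len(shorter)

    # Condition 1: all fragments before the last one of the shorter view match.
    if shorter[:n - 1] != longer[:n - 1]:
        return False

    last_short, last_long = shorter[-1], longer[n - 1]
    if last_short == last_long:
        return True  # identical over the length of the shorter structure

    # Conditions 2 and 3: both are character or byte groups of the same kind.
    kind_short, len_short = last_short
    kind_long, len_long = last_long
    if kind_short != kind_long or kind_short not in ("char", "byte"):
        return False
    if len(longer) == n:
        return True  # same fragment count, only the trailing group length differs
    return len_long > len_short


# Fragment views of the example structures (gaps omitted, lengths nominal).
struc1 = [("char", 1), ("byte", 1)]
struc2 = [("char", 2)]
struc3 = [("char", 8), ("I", 4)]
struc4 = [("char", 8), ("I", 4), ("F", 8)]
struc7 = [("P", 8), ("char", 1)]
struc8 = [("P", 8), ("char", 5), ("P", 8)]

print(convertible(struc1, struc2))  # False
print(convertible(struc3, struc4))  # True
print(convertible(struc7, struc8))  # True
```

The three results match the outcomes stated for struc1 = struc2, struc3 = struc4 and struc7 = struc8 above.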

4.4 Comparisons

In general, operands that can be assigned to one another with the MOVE statement can also be compared. An exception is object references, which can be compared but not always assigned.

Comparison of flat structures

Structures can also be compared if they are not compatible. As in the MOVE statement, the fragment views must be the same for the length of the shorter structure. If the structures have different lengths, the shorter structure is filled until it has the length of the other structure. As in the assignment, all character-type components are filled with spaces and all other components with initial values of the right type. The structures are compared fragment by fragment as defined by the fragment view.

Comparison of single fields and structures

The following rules are valid when single fields are compared with structures:

1. If a structure is purely character-type, it is treated like a C field in the comparison.
2. If the single field is character-type, but the structure is only partly character-type, the comparison is only possible if the first character-type fragment in the structure is longer than the single field. The single field is extended to the structure length at runtime and filled with initial values for the comparison. The comparison is the same as for structured fields, where the fields are filled as in the MOVE statement.

c0(10) TYPE C.

BEGIN OF struc,
  c1(15) TYPE C,
  i TYPE I,
  c2(5) TYPE C,
  n(7) TYPE N,
END OF struc.

In this example, c0 is extended to the length of struc in storage. All positions beyond the first 10 characters are filled with initial values of the correct type for components that are not character-type and with spaces for the other components.

Comparison of deep structures

As before, comparing deep structures mainly requires type compatibility of the operands.
The compatibility test for comparability was generalized so that structure components with references to classes or interfaces can be compared with one another, whatever the class hierarchy and implementation relation, as for single fields. For table components, only comparability of the table types is required.
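
The extension of a single field to the length of a structure, as in the c0 and struc example of section 4.4, can be sketched like this. It is a simplified Python model, not ABAP: the function extend_field_for_comparison is hypothetical, the initial value of every non-character-type component is modelled as 0, and the check follows the wording above literally (the first character-type fragment must be longer than the single field).

```python
from typing import Dict, List, Tuple, Union

CHAR_TYPES = {"C", "N", "D", "T"}
Component = Tuple[str, str, int]  # (name, ABAP type, length)


def extend_field_for_comparison(value: str,
                                components: List[Component]) -> Dict[str, Union[str, int]]:
    """Spread a single C field over a structure layout and pad the remainder."""
    # Length of the leading character-type fragment of the structure.
    first_len = 0
    for _name, abap_type, length in components:
        if abap_type not in CHAR_TYPES:
            break
        first_len += length
    if first_len <= len(value):
        raise ValueError("first character-type fragment must be longer than the field")

    extended: Dict[str, Union[str, int]] = {}
    rest = value
    for name, abap_type, length in components:
        if abap_type in CHAR_TYPES:
            extended[name] = rest[:length].ljust(length)  # missing positions become blanks
            rest = rest[length:]
        else:
            extended[name] = 0  # simplified initial value (assumption of this sketch)
    return extended


struc = [("c1", "C", 15), ("i", "I", 4), ("c2", "C", 5), ("n", "N", 7)]
c0 = "ABCDEFGHIJ"  # a C field of length 10

print(extend_field_for_comparison(c0, struc))
```

After this step both operands have the same layout, so they can be compared fragment by fragment.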

Comparison of internal tables

Tables can be compared if their row types can be compared. The restrictions described above therefore also affect table comparisons.

4.5 Processing Strings

String processing statements, whose arguments were all interpreted as fields of type C until now, are now divided into statements with character arguments and those with byte arguments.

String processing statements

CLEAR ... WITH
CONCATENATE
CONDENSE
CONVERT TEXT ... INTO SORTABLE CODE
OVERLAY
REPLACE
SEARCH
SHIFT
SPLIT
TRANSLATE ... TO UPPER/LOWER CASE
TRANSLATE ... USING

The arguments of these statements must be single fields of type C, N, D, T or STRING or purely character-type structures. There is a syntax or runtime error if arguments of a different type are passed. A subset of these statements is available with the addition IN BYTE MODE for processing byte strings, that is, operands of type X or XSTRING. A statement such as CONCATENATE a x b INTO c is thus no longer possible when a, b, and c are all character-type, but x is of type X.

TRANSLATE ... CODEPAGE ...
TRANSLATE ... NUMBER FORMAT ...

The above statements are not allowed in Unicode programs. Instead, you can use the new conversion classes, which are described in more detail on page 37.

Comparison operators for string processing

CO CN CA NA CS NS CP NP

As with the string processing statements, these operators need single fields of type C, N, D, T or STRING as arguments, and again purely character-type structures are allowed. Special comparison operators defined with the prefix BYTE- are provided for byte strings.

Functions for string processing

Function STRLEN only works with character-type fields and returns the length in characters. The new function XSTRLEN finds the length of byte strings.

Until now, function CHARLEN returned the value 1 for a text field beginning with a single-byte character under an NUS. The value 2 is returned for text fields beginning with a double-byte character. Under a US, CHARLEN returns the value 1 if text begins with a single Unicode character. If text begins with a Unicode double character from the surrogate area, the value 2 is returned.

Function NUMOFCHAR returns the number of characters in a string or a character-type field. In single-byte code pages, the function behaves like STRLEN. In multi-byte code pages, characters filling more than 1 byte are nevertheless considered to have length 1.

Output in fields and lists

Until now, WRITE ... TO accepted any flat data type as target and handled it like a C field. For the WRITE statement, the following rules apply in Unicode programs:

WRITE ... TO requires the target field to be character-type. For the table variant WRITE ... TO itab INDEX idx, the line type of the table must be character-type. The offset and length are counted in characters.

Until now, any flat structure could be output with WRITE. In a UP, if the source field of a WRITE is a flat structure, it must be purely character-type. This affects the following statements:

WRITE f.
WRITE f TO g[+off][(len)].
WRITE (name) TO g.
WRITE f TO itab[+off][(len)] INDEX idx.
WRITE (name) TO itab[+off][(len)] INDEX idx.

4.6 Type Checks and Type Compatibility

For historical reasons, the types of field symbols and parameters in subroutines or function modules can be defined with the STRUCTURE addition.

If the types of field symbols are defined with FIELD-SYMBOLS ... , in an NUP both statements are checked to see if wa is at least as long as s and wa satisfies the alignment requirements of s at runtime.

If parameter types in function modules or subroutines are defined with FORM form1 USING/CHANGING arg STRUCTURE s ... or FORM form2 TABLES itab_a STRUCTURE s ... and actual parameters are passed with PERFORM form1 USING/CHANGING wa or PERFORM form2 USING/CHANGING itab_b, the NUP also only checks if wa or the line type of itab_b is at least as long as s and if wa or the line type of itab_b satisfies the alignment requirements of s. The same is true for function module parameters whose types are defined with STRUCTURE.

The following extra rules are checked in a UP after defining the type with STRUCTURE when assigning data objects, that is, for the DEFAULT addition in the FIELD-SYMBOLS statement, for ASSIGN, and when passing actual parameters:

1. If wa or the line type of itab_b is a flat or deep structure, the Unicode fragment views of wa (or of the line type of itab_b) and s must be the same over the length of s.
2. If wa is a single field, only the character types C, N, D or T are allowed and the structure s must be purely character-type.

Checking both these rules requires additional runtime. It is therefore recommended that, if possible, you type the parameters using TYPE, since the test for actual compatibility is much faster.

If the type of an argument in a function module was defined with ... LIKE struc, where struc is a flat structure, the NUP only checks if the argument is a flat structure with the same length when the parameters are passed. In the UP, it also checks that the fragment views of the actual and formal parameters are the same. For performance reasons, it is again recommended that you use TYPE to assign types.

Furthermore, two structures of which one or both contain includes are only compatible if the alignment gaps caused by the include are the same on all platforms.
In the following example, struc1 and struc2 are not compatible because a further alignment gap occurs in the US before the INCLUDE:

BEGIN OF struc1,
  a(1) TYPE X,
  b(1) TYPE X,
  c(1) TYPE C,
END OF struc1.

BEGIN OF struc2,
  a(1) TYPE X.
  INCLUDE struc3.
END OF struc2.

BEGIN OF struc3,
  b(1) TYPE X,
  c(1) TYPE C,
END OF struc3.

Since the type compatibility can differ in a UP and an NUP, the type compatibility rules of the calling program are valid in an NUS for checking the parameters. This means that if an NUP calls a UP, the type compatibility is defined as in the NUP. Conversely, the Unicode check is activated if a UP calls an NUP.

4.7 Changes to Database Operations

Until now, in an NUP, the data was copied to the field wa or to the table line itab according to the structure of the table work area dbtab, without taking the structure of the target into consideration. Only the length and alignment were checked. This affects the following statements:

    SELECT * FROM dbtab ... INTO wa ...
    SELECT * FROM dbtab ... INTO TABLE itab ...
    SELECT * FROM dbtab ... APPENDING TABLE itab ...
    FETCH NEXT CURSOR c ... INTO wa.
    FETCH NEXT CURSOR c ... INTO TABLE itab.
    FETCH NEXT CURSOR c ... APPENDING TABLE itab.
    INSERT INTO dbtab ... FROM wa.
    INSERT dbtab ... FROM wa.
    INSERT dbtab ... FROM TABLE itab.
    UPDATE dbtab ... FROM wa.
    UPDATE dbtab ... FROM TABLE itab.
    MODIFY dbtab ... FROM wa.
    MODIFY dbtab ... FROM TABLE itab.
    DELETE dbtab FROM wa.
    DELETE dbtab FROM TABLE itab.

The following rules are now valid in a UP for all the commands mentioned above:

If the work area or the line of the internal table is a structure, there is an additional check that the fragment views of the work area and the database table are the same up to the length of the database table.

If the work area is a single field, the field must be character-type and the database table must be purely character-type.

Only the types C, N, D, and T, and flat structures of these types, are now valid for the version field in any statement that processes database tables (READ, MODIFY, DELETE, LOOP) and uses the VERSION addition. Otherwise, a warning is triggered in an NUS, and a syntax error in a US.

4.8 Determining the Length and Distance

You may no longer use the DESCRIBE DISTANCE statement without a mode addition to determine the lengths and distances of fields. It must be replaced with one of the new statements DESCRIBE DISTANCE ... IN BYTE MODE or DESCRIBE DISTANCE ... IN CHARACTER MODE.

The DESCRIBE FIELD ... LENGTH statement is also obsolete and must be replaced with one of the new statements DESCRIBE FIELD ... LENGTH ... IN BYTE MODE or DESCRIBE FIELD ... LENGTH ... IN CHARACTER MODE.

Until now, the DESCRIBE FIELD ... TYPE field statement returned type C for flat structures. In a UP, type u is now returned for flat structures. This can be queried in the ABAP source code. There are no changes for the DESCRIBE FIELD ... TYPE ... COMPONENTS ... statement under US. Similarly, the DESCRIBE ... OUTPUT LENGTH ... statement still returns the output length in characters.
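The DESCRIBE variants from section 4.8 can be sketched as follows. This is only a minimal illustration, not taken from the SAP document: the report name z_describe_demo, the structure ls_demo and all variable names are made up for the example.

```abap
REPORT z_describe_demo. " hypothetical report name

* Purely character-type demo structure (assumed for illustration)
DATA: BEGIN OF ls_demo,
        c1(3) TYPE c,
        n1(5) TYPE n,
      END OF ls_demo,
      lv_len_chars TYPE i,
      lv_len_bytes TYPE i,
      lv_dist      TYPE i,
      lv_type(1)   TYPE c.

* Length of c1 in characters and in bytes
DESCRIBE FIELD ls_demo-c1 LENGTH lv_len_chars IN CHARACTER MODE.
DESCRIBE FIELD ls_demo-c1 LENGTH lv_len_bytes IN BYTE MODE.

* Distance between the start of c1 and the start of n1, in characters
DESCRIBE DISTANCE BETWEEN ls_demo-c1 AND ls_demo-n1
         INTO lv_dist IN CHARACTER MODE.

* Type of the flat structure itself
DESCRIBE FIELD ls_demo TYPE lv_type.

WRITE: / lv_len_chars, lv_len_bytes, lv_dist, lv_type.
```

The character length of c1 is 3 on every system, while the byte length depends on how many bytes the system uses per character. That difference is the reason the mode has to be stated explicitly.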

4.9 Other Changes

The following text describes the file interface, key definitions for tables, and the bit and bit mask operations. The introduction of Unicode results in the following changes:

The OPEN DATASET command was completely revised in the file interface. At least one of the additions IN TEXT MODE ENCODING, IN BINARY MODE, IN LEGACY TEXT MODE, or IN LEGACY BINARY MODE must be specified in a UP.

In a UP, you can only read and write files with READ DATASET and TRANSFER if the file to be edited was first opened explicitly. A runtime error is triggered if there is no preceding OPEN statement for these statements.

If the file was opened in TEXT MODE, only character-type fields, strings and purely character-type structures are allowed for f in READ DATASET dsn INTO f, and the type is only checked at runtime. The LENGTH addition returns the length of the data record in characters in TEXT MODE. In all other cases it is given in bytes.

A syntax error is triggered in a UP for the obsolete statements LOOP AT dbtab, READ TABLE dbtab, and READ TABLE itab if the key is not purely character-type.

A syntax or runtime error is triggered for the READ TABLE itab statement if the standard key of the internal table contains types X or XSTRING. With this READ variant, the key that is actually used is determined by ignoring all the components filled with spaces. In a UP, a comparison of each key component with SPACE must therefore be allowed.

A syntax or runtime error is also triggered when you access the database with a generic key if the key is not purely character-type. This affects the following commands:

    READ TABLE dbtab ... SEARCH GKEQ ...
    READ TABLE dbtab ... SEARCH GKGE ...
    LOOP AT dbtab ...
    REFRESH itab FROM TABLE dbtab.

The actual table key is determined by truncating the trailing spaces of the database key in these statements. In a UP you must make sure that all the components of the key can be compared with SPACE.
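The file-interface rules at the start of this section (explicit OPEN DATASET with a mode addition before any TRANSFER or READ DATASET) can be sketched as below. This is a minimal example under assumptions: the report name, the file name demo.txt and the variable names are hypothetical, and UTF-8 is just one possible ENCODING choice.

```abap
REPORT z_dataset_demo. " hypothetical report name

DATA: lv_file(40) TYPE c VALUE 'demo.txt', " hypothetical file name
      lv_line     TYPE string.

* Writing: the file must be opened explicitly, with a mode addition
OPEN DATASET lv_file FOR OUTPUT IN TEXT MODE ENCODING UTF-8.
IF sy-subrc = 0.
  TRANSFER 'Hello Unicode' TO lv_file.
  CLOSE DATASET lv_file.
ENDIF.

* Reading: in TEXT MODE the target must be character-type or a string
OPEN DATASET lv_file FOR INPUT IN TEXT MODE ENCODING UTF-8.
IF sy-subrc = 0.
  DO.
    READ DATASET lv_file INTO lv_line.
    IF sy-subrc <> 0.
      EXIT. " end of file reached
    ENDIF.
    WRITE / lv_line.
  ENDDO.
  CLOSE DATASET lv_file.
ENDIF.
```

Checking sy-subrc after each OPEN DATASET keeps the sketch from reading or writing a file that could not be opened.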
Until now, the bit statements SET BIT i OF f and GET BIT i OF f checked whether field f was character-type, where normally X fields, X strings and flat structures were also considered to be character-type. This is no longer meaningful in a UP because, on the one hand, types X and XSTRING are no longer considered character-type and, on the other hand, bit-by-bit access to character-type fields or structures is no longer platform-independent. In a UP, field f must therefore be of type X or XSTRING for these operations.
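A minimal sketch of the bit statements on a byte-type field, as required in a UP. The report name and variable names are made up for the example.

```abap
REPORT z_bit_demo. " hypothetical report name

DATA: lv_flags(1) TYPE x, " byte-type field, initial value 00
      lv_bit      TYPE i.

* Bits are counted from the left, starting at 1
SET BIT 3 OF lv_flags.             " lv_flags is now 20 (binary 0010 0000)
GET BIT 3 OF lv_flags INTO lv_bit. " lv_bit is now 1

WRITE: / lv_flags, lv_bit.
```

Declaring lv_flags with type C instead of X would no longer be accepted for SET BIT and GET BIT in a UP.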

Until now, all numeric types, and thus also all character types, were allowed for the left operand f in the bit mask operations f O x, f Z x and f M x. In a UP, operand f must have type X or XSTRING.

There are certain restrictions in a UP for the following statements when adding field sequences:

    ADD n1 THEN n2 UNTIL nz GIVING m ...
    ADD n1 THEN n2 UNTIL nz TO m.

1. Operands n1, n2, and nz must have compatible types.
2. The distance between nz and n1 must be an integer multiple of the distance between n2 and n1.
3. There is a syntax or runtime error if fields n1, n2 and nz are not in a structure. Either the syntax check must be able to recognize this fact or the valid range must be marked explicitly with a RANGE addition.
4. At runtime, the system ensures that the RANGE area is not exceeded.

    ADD n1 FROM i1 GIVING m.

1. The field n1 must lie within a structure. If the syntax check cannot recognize this fact, the structure must be specified explicitly with a RANGE addition.
2. This variant also checks at runtime whether n1 and the addressed values lie within the structure.

Loops with the VARY or VARYING addition also cause Unicode problems because, on the one hand, you cannot be sure to access the contents of memory with the correct type and, on the other hand, memory could be overwritten inadvertently.

    DO ... VARYING f FROM f1 NEXT f2.

Fields f, f1 and f2 must have compatible types in this statement. To prevent storage contents from being overwritten, a RANGE for valid accesses is implicitly or explicitly introduced for the following statements:

    DO ... TIMES VARYING f FROM f1 NEXT f2 [RANGE f3].
    WHILE ... VARY f FROM f1 NEXT f2 [RANGE f3].

A syntax or runtime error is also triggered if f1 or f2 are not included in f3. If the RANGE addition is missing, it is implicitly defined as follows with FROM f1 NEXT f2:

1. If the syntax check recognizes that both f1 and f2 are components of the same structure, the valid RANGE area is the smallest structure containing f1 and f2.
2. There is a syntax error if the syntax check recognizes that f1 and f2 do not belong to the same structure.
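The field-sequence statements above can be sketched as follows. The example assumes a structure of four equally typed, equidistant components; the report name, ls_amounts and the variable names are hypothetical. The RANGE addition is written out explicitly on the DO loop, although the syntax check could also derive it here because all fields belong to the same structure.

```abap
REPORT z_range_demo. " hypothetical report name

DATA: BEGIN OF ls_amounts,
        a1 TYPE i VALUE 10,
        a2 TYPE i VALUE 20,
        a3 TYPE i VALUE 30,
        a4 TYPE i VALUE 40,
      END OF ls_amounts,
      lv_amount TYPE i,
      lv_total  TYPE i,
      lv_sum    TYPE i.

* Walk over a1..a4; RANGE limits the accessible memory to ls_amounts
DO 4 TIMES VARYING lv_amount FROM ls_amounts-a1 NEXT ls_amounts-a2
           RANGE ls_amounts.
  lv_total = lv_total + lv_amount.
ENDDO.

* Same sum with ADD ... THEN ... UNTIL; all operands lie in one structure
ADD ls_amounts-a1 THEN ls_amounts-a2 UNTIL ls_amounts-a4 GIVING lv_sum.

WRITE: / lv_total, lv_sum. " both are 100
```

Requesting more passes than the structure has components (for example DO 5 TIMES) would leave the RANGE area, which is the situation the runtime check is meant to catch.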

report kzi_temp_02.

*-----
start-of-selection.
  perform sub_Main.

*-----
types:
  begin of ty_S_Line,
    Type   type c length 1,
    Text   type String,
    Text_X type String,
    Size   type i,
    Dist   type i,
    Indent type i,
  end of ty_S_Line,
  ty_T_Lines type standard table of ty_S_Line with default key.

*-----
class lcl_Fragment definition.
  public section.
    class-methods:
      get_Infos
        importing input type data
        returning value(Result) type ty_T_Lines.
  private section.
    class-methods:
      calc_Info
        importing
          input    type data
          distance type i default 0
        changing
          Result   type ty_T_Lines.
    class-data:
      fg_Indent type i.
endclass.

class lcl_Fragment implementation.
*================
  method get_Infos.
*================
    fg_Indent = 0.
    calc_Info( exporting Input = Input changing Result = Result ).
  endmethod.
*================
  method calc_Info.
*================
    data:
      wa_Line      like line of Result,
      lon_Distance type i,
      lon_Size     type i.
    field-Symbols:
    .
    enddo.
    fg_Indent = fg_Indent - 1.
    when others. "nop
    endcase.
  endmethod.
endclass.

*-----
form sub_Main.
  data:
    begin of struc2,
      c1 type c,
      x1 type x,
      begin of struc3,
        c2(4) type c,
        x2 type x,
        d1 type d,
      end of struc3,
      t1 type t,
      i1 type i,
    end of struc2.
  struc2-c1 = 'C'.
  struc2-x1 = 'AA'.
  struc2-struc3-c2 = 'ESEL'.
  struc2-struc3-x2 = 'BB'.
  struc2-struc3-d1 = 12345678.
  struc2-t1 = '121212'.
  struc2-i1 = 13.
  perform sub_Dump_Data using struc2.
endform.

form sub_Dump_Data using p_Data.
  data:
    it_Infos type ty_T_Lines.
  field-symbols:

Message was edited by:

ravish goyal

sreeramkumar_madisetty · ‎05-31-2007

Hi,

ABAP Development under Unicode

Prior to Unicode the length of a character was exactly one byte, allowing implicit typecasts or memory-layout oriented programming. With Unicode this situation has changed: One character is no longer one byte, so that additional specifications have to be added to define the unit of measure for implicit or explicit references to (the length of) characters.

Character-like data in ABAP is always represented in UTF-16 (also used in Java and in other development tools such as Microsoft's Visual Basic), but this format is independent of the encoding of the underlying database.

A Unicode-enabled ABAP program (UP) is a program in which all Unicode checks are effective. Such a program returns the same results in a non-Unicode system (NUS) as in a Unicode system (US). In order to perform the relevant syntax checks, you must activate the Unicode flag in the screens of the program and class attributes.

In a US, you can only execute programs for which the Unicode flag is set. In future, the Unicode flag must be set for all SAP programs to enable them to run on a US. If the Unicode flag is set for a program, the syntax is checked and the program is executed according to the rules described in this document, regardless of whether the system is a US or an NUS. From now on, the Unicode flag must be set for all new programs and classes that are created.

If the Unicode flag is not set, a program can only be executed in an NUS. The syntactical and semantic changes described below do not apply to such programs. However, you can use all language extensions that have been introduced in the process of the conversion to Unicode.

As a result of the modifications and restrictions associated with the Unicode flag, programs are executed in both Unicode and non-Unicode systems with the same semantics to a large degree. In rare cases, however, differences may occur. Programs that are designed to run on both systems therefore need to be tested on both platforms.

http://help.sap.com/saphelp_nw04/helpdata/en/62/3f2cadb35311d5993800508b6b8b11/content.htm

http://help.sap.com/saphelp_nw2004s/helpdata/en/79/c55458b3dc11d5993800508b6b8b11/content.htm

Regards,

Sree

Former Member · ‎05-31-2007