CharacterEncoderImplementations__ISO10646_to_UTF16BE.st
author Jan Vrany <jan.vrany@fit.cvut.cz>
Tue, 22 Sep 2015 16:28:42 +0100
branch jv
changeset 18759 c1217211909c
parent 18011 deb0c3355881
child 19227 5e949760a4e8
permissions -rw-r--r--
Changed identification strings to contain jv-branch ...to make explicit that this distribution is not the official one used by eXept and therefore that eXept is not to be blamed in case of any problem.
"
 COPYRIGHT (c) 2005 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
"{ Package: 'stx:libbasic' }"

"{ NameSpace: CharacterEncoderImplementations }"

TwoByteEncoder subclass:#ISO10646_to_UTF16BE
	instanceVariableNames:''
	classVariableNames:''
	poolDictionaries:''
	category:'Collections-Text-Encodings'
!

!ISO10646_to_UTF16BE class methodsFor:'documentation'!

copyright
"
 COPYRIGHT (c) 2005 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
!

documentation
"
    encodes/decodes UTF16 BigEndian (big-end-first)
"
!

examples
"
  Encoding (unicode to utf16BE):
     ISO10646_to_UTF16BE encodeString:'hello'.


  Decoding (utf16BE to unicode):
     |t|

     t := ISO10646_to_UTF16BE encodeString:''.
     ISO10646_to_UTF16BE decodeString:t.

  Decoding (utf16LE-Bytes to unicode):
     |bytes|

     bytes := #[ 16r40 0 16r41 0 16r42 0 16r43 0 16r44 0 ].
     ISO10646_to_UTF16LE decodeString:bytes.
"
! !

!ISO10646_to_UTF16BE methodsFor:'encoding & decoding'!

decode:aCode
    ^ aCode
!

decodeString:aStringOrByteCollection
    "given a byteArray (2-bytes per character) or unsignedShortArray in UTF16 encoding,
     return a new string containing the same characters, in 8, 16bit (or more) encoding.
     Returns either a normal String, a TwoByte- or a FourByte-String instance.
     Only useful when reading from external sources.
     Well-formed UTF-16 input yields code points up to 16r10FFFF (21 bits)."

    |s newString bitsPerElementIn nextIn
     codeIn codeIn1 codeIn2 estimatedSize out|

    aStringOrByteCollection isByteArray ifTrue:[
        bitsPerElementIn := 8.
    ] ifFalse:[
        aStringOrByteCollection isString ifTrue:[
            bitsPerElementIn := aStringOrByteCollection bitsPerCharacter.
        ] ifFalse:[
            "can be a ShortArray"
            bitsPerElementIn := 16.
        ].
    ].

    s := aStringOrByteCollection readStream.
    bitsPerElementIn == 8 ifTrue:[
        s size odd ifTrue:[
            InvalidEncodingError raiseWith:aStringOrByteCollection errorString:' - size is not a multiple of 2 bytes'.
        ].
        nextIn := [self nextTwoByteValueFrom:s].
    ] ifFalse:[
        nextIn := [s next].
    ].

    estimatedSize := s size * bitsPerElementIn // 16.
    out := CharacterWriteStream on:(String new:estimatedSize).
    [s atEnd] whileFalse:[
        codeIn := nextIn value.
        codeIn <= 16rFF ifTrue:[
        ] ifFalse:[
            (codeIn between:16rD800 and:16rDBFF) ifTrue:[
                codeIn1 := codeIn.
                codeIn2 := nextIn value.
                codeIn := ((codeIn1 - 16rD800) bitShift:10)
                          +
                          (codeIn2 - 16rDC00)
                          + 16r00010000.
            ].
        ].
        out nextPut:(Character value:codeIn).
    ].
    newString := out contents.

"/    nBitsRequired := 8.
"/    sz := 0.
"/    [s atEnd] whileFalse:[
"/        codeIn := nextIn value.
"/        sz := sz + 1.
"/
"/        codeIn <= 16rFF ifTrue:[
"/        ] ifFalse:[
"/            nBitsRequired := nBitsRequired max:16.
"/            (codeIn between:16rD800 and:16rDBFF) ifTrue:[
"/                nBitsRequired := 32.
"/                codeIn2 := nextIn value.
"/            ].
"/        ]
"/    ].
"/
"/    nBitsRequired == 8 ifTrue:[
"/        newString := String uninitializedNew:sz
"/    ] ifFalse:[
"/        nBitsRequired <= 16 ifTrue:[
"/            newString := Unicode16String new:sz
"/        ] ifFalse:[
"/            newString := Unicode32String new:sz
"/        ]
"/    ].
"/
"/    s := aStringOrByteCollection readStream.
"/    idx := 1.
"/    [s atEnd] whileFalse:[
"/        codeIn := nextIn value.
"/        codeIn <= 16rFF ifTrue:[
"/        ] ifFalse:[
"/            nBitsRequired := nBitsRequired max:16.
"/            (codeIn between:16rD800 and:16rDBFF) ifTrue:[
"/                nBitsRequired := 32.
"/                codeIn1 := codeIn.
"/                codeIn2 := nextIn value.
"/                codeIn := ((codeIn1 - 16rD800) bitShift:10)
"/                          +
"/                          (codeIn2 - 16rDC00)
"/                          + 16r00010000.
"/            ].
"/        ].
"/        newString at:idx put:(Character value:codeIn).
"/        idx := idx + 1.
"/    ].
    ^ newString

    "
     self new decodeString:#[ 16r00 16r42 ]
     self new decodeString:#[ 16r01 16r42 ]
     self new decodeString:#[ 16r00 16r48
                              16r00 16r69
                              16rD8 16r00
                              16rDC 16r00
                              16r00 16r21
                              16r00 16r21
                            ]

     self new decodeString:#( 16r0048
                              16r0069
                              16rD800
                              16rDC00
                              16r0021
                              16r0021
                            )
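
     worked surrogate-pair example (a sketch added for illustration, not part of the
     original checkin; the code point 16r1F600 is an arbitrary choice):
         16r1F600 - 16r10000      = 16rF600
         16rF600 bitShift:-10     = 16r3D     high surrogate: 16rD800 + 16r3D  = 16rD83D
         16rF600 bitAnd:16r3FF    = 16r200    low surrogate:  16rDC00 + 16r200 = 16rDE00
     so, following the arithmetic in the method body, this should answer 16r1F600 (128512):

     (self new decodeString:#[ 16rD8 16r3D 16rDE 16r00 ]) first codePoint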
    "

    "Modified: / 12-07-2012 / 19:56:12 / cg"
!

encode:aCode
    ^ aCode
!

encodeString:aUnicodeString
    "return the UTF-16 representation of aUnicodeString.
     The resulting byte array is only useful for storing in an external file,
     not for use inside ST/X."

    |s|

    s := WriteStream on:(ByteArray uninitializedNew:aUnicodeString size).
    aUnicodeString do:[:eachCharacter |
        |codePoint t hi low|

        codePoint := eachCharacter codePoint.
        (codePoint <= 16rFFFF) ifTrue:[
            ((codePoint <= 16rD7FF) or:[ codePoint between:16rE000 and:16rFFFF]) ifTrue:[
                self nextPutTwoByteValue:codePoint to:s.
            ] ifFalse:[
                "/ unrepresentable: D800..DFFF
                self error:'unrepresentable value (D800..DFFF) in utf16Encode'.
            ].
        ] ifFalse:[
            t := codePoint - 16r00010000.
            hi := t bitShift:-10.
            low := t bitAnd:16r3FF.
            hi > 16r3FF ifTrue:[
                "/ unrepresentable: above 10FFFF
                self error:'unrepresentable value (> 10FFFF) in utf16Encode'.
            ].
            self nextPutTwoByteValue:(hi + 16rD800) to:s.
            self nextPutTwoByteValue:(low + 16rDC00) to:s.
        ].
    ].

    ^ s contents

    "
     (self encodeString:'hello')                                         #[0 104 0 101 0 108 0 108 0 111]
     (self encodeString:(Character value:16r40) asString)                #[0 64]
     (self encodeString:(Character value:16rFF) asString)                #[0 255]
     (self encodeString:(Character value:16r100) asString)               #[1 0]
     (self encodeString:(Character value:16r1000) asString)              #[16 0]
     (self encodeString:(Character value:16r2000) asString)              #[32 0]
     (self encodeString:(Character value:16r4000) asString)              #[64 0]
     (self encodeString:(Character value:16r8000) asString)              #[128 0]
     (self encodeString:(Character value:16rD7FF) asString)              #[215 255]
     (self encodeString:(Character value:16rE000) asString)              #[224 0]
     (self encodeString:(Character value:16rFFFF) asString)              #[255 255]
     (self encodeString:(Character value:16r10000) asString)             #[216 0 220 0]
     (self encodeString:(Character value:16r10FFF) asString)             #[216 3 223 255]
     (self encodeString:(Character value:16r1FFFF) asString)             #[216 63 223 255]
     (self encodeString:(Character value:16r10FFFF) asString)            #[219 255 223 255]
    error cases:
     (self encodeString:(Character value:16rD800) asString)
     (self encodeString:(Character value:16rD801) asString)
     (self encodeString:(Character value:16rDFFF) asString)
     (self encodeString:(Character value:16r110000) asString)
    "
! !

!ISO10646_to_UTF16BE methodsFor:'private'!

nextPutTwoByteValue:anInteger to:aStream
    aStream nextPutShort:anInteger MSB:true
!

nextTwoByteValueFrom:aStream
    ^ aStream nextUnsignedShortMSB:true
! !

!ISO10646_to_UTF16BE methodsFor:'queries'!

nameOfEncoding
    ^ #utf16be
! !

!ISO10646_to_UTF16BE class methodsFor:'documentation'!

version
    ^ '$Header: /cvs/stx/stx/libbasic/CharacterEncoderImplementations__ISO10646_to_UTF16BE.st,v 1.6 2012-07-12 18:07:54 cg Exp $'
!

version_CVS
    ^ '$Header: /cvs/stx/stx/libbasic/CharacterEncoderImplementations__ISO10646_to_UTF16BE.st,v 1.6 2012-07-12 18:07:54 cg Exp $'
! !