CharacterEncoderImplementations__ISO10646_to_UTF16LE.st
author Claus Gittinger <cg@exept.de>
Tue, 09 Jul 2019 20:55:17 +0200
changeset 24417 03b083548da2
parent 24215 0653f0a9a05c
permissions -rw-r--r--
#REFACTORING by exept class: Smalltalk class changed: #recursiveInstallAutoloadedClassesFrom:rememberIn:maxLevels:noAutoload:packageTop:showSplashInLevels: Transcript showCR:(... bindWith:...) -> Transcript showCR:... with:...
"{ Encoding: utf8 }"

"
 COPYRIGHT (c) 2005 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
"{ Package: 'stx:libbasic' }"

"{ NameSpace: CharacterEncoderImplementations }"

ISO10646_to_UTF16BE subclass:#ISO10646_to_UTF16LE
	instanceVariableNames:''
	classVariableNames:''
	poolDictionaries:''
	category:'Collections-Text-Encodings'
!

!ISO10646_to_UTF16LE class methodsFor:'documentation'!

copyright
"
 COPYRIGHT (c) 2005 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
!

documentation
"
    Encodes/decodes UTF-16 little-endian (least significant byte first).

    Notice the naming (many are confused):
        Unicode is the set of number-to-character assignments (code points)
    whereas:
        UTF8, UTF16 etc. are concrete ways of transmitting Unicode code points (numbers).

    ST/X NEVER uses UTF8 or UTF16 internally - all characters are full 24bit characters.
    Only when exchanging data are these converted into UTF8 (or other) byte sequences.
"
!

examples
"
  Encoding (unicode to utf16LE):
     ISO10646_to_UTF16LE encodeString:'hello'.


  Decoding (utf16LE to unicode):
     |t|

     t := ISO10646_to_UTF16LE encodeString:'ÄÖÜäöüß'.
     ISO10646_to_UTF16LE decodeString:t.

  Decoding (utf16LE-Bytes to unicode):
     |bytes|

     bytes := #[ 16r40 0 16r41 0 16r42 0 16r43 0 16r44 0 ].
     ISO10646_to_UTF16LE decodeString:bytes.
"
! !

!ISO10646_to_UTF16LE methodsFor:'encoding & decoding'!

decode:codePoint
    ^ codePoint swapBytes
!

encode:codePoint
    ^ codePoint swapBytes
!

encodeString:aUnicodeString
    "return the UTF-16 (little-endian) representation of aUnicodeString.
     The resulting bytes are only useful for storing in some external file,
     not for use inside ST/X."

    |stream size "{ Class:SmallInteger }"|

    size := aUnicodeString size.
    stream := WriteStream on:(ByteArray uninitializedNew:size * 2).

    1 to:size do:[:idx |
        stream nextPutUtf16Bytes:(aUnicodeString at:idx) MSB:false.
    ].

    ^ stream contents

    "Created: / 16-01-2018 / 19:45:15 / stefan"
    "Modified: / 28-05-2019 / 13:50:22 / Stefan Vogel"
!

encodeString:aUnicodeString on:aStream
    "given a string in unicode, encode it onto aStream."

     aStream nextPutAllUtf16Bytes:aUnicodeString MSB:false.

    "Created: / 16-02-2017 / 16:42:36 / stefan"
! !

!ISO10646_to_UTF16LE methodsFor:'private'!

nextTwoByteValueFrom:aStream
    ^ aStream nextUnsignedInt16MSB:false
! !

!ISO10646_to_UTF16LE methodsFor:'queries'!

nameOfEncoding
    ^ #utf16le
! !

!ISO10646_to_UTF16LE methodsFor:'stream support'!

encodeCharacter:aUnicodeCharacter on:aStream
    "given a character in unicode, encode it onto aStream."

     aStream nextPutUtf16Bytes:aUnicodeCharacter MSB:false.

    "Created: / 16-02-2017 / 16:42:19 / stefan"
    "Modified (comment): / 16-01-2018 / 19:22:50 / stefan"
! !

!ISO10646_to_UTF16LE class methodsFor:'documentation'!

version
    ^ '$Header$'
! !