CharacterEncoderImplementations__ISO10646_to_UTF8_MAC.st
author Claus Gittinger <cg@exept.de>
Fri, 27 Feb 2015 19:26:01 +0100
changeset 17567 2d57395ef7e0
parent 17566 a990c12c71c0
child 17568 e90410336cc2
permissions -rw-r--r--
class: CharacterEncoderImplementations::ISO10646_to_UTF8_MAC changed: #initializeDecomposeMap
"{ Encoding: utf8 }"

"
 COPYRIGHT (c) 2015 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
"{ Package: 'stx:libbasic' }"

"{ NameSpace: CharacterEncoderImplementations }"

ISO10646_to_UTF8 subclass:#ISO10646_to_UTF8_MAC
	instanceVariableNames:''
	classVariableNames:'AccentMap DecomposeMap ComposeMap'
	poolDictionaries:''
	category:'Collections-Text-Encodings'
!

!ISO10646_to_UTF8_MAC class methodsFor:'documentation'!

copyright
"
 COPYRIGHT (c) 2015 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
!

documentation
"
    UTF-8 text can represent some characters with diacritics (such as umlauts) in more than one way:
        - either as a single precomposed Unicode code point (e.g. ä -> U+00E4 -> bytes C3 A4),
        - or in the so-called 'Normalization Form Canonical Decomposition' (NFD), i.e. as a regular 'a'
          followed by a combining diacritical mark (for example: a diaeresis, U+0308).
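
    As a worked illustration (the byte values follow from the UTF-8 encoding rules):
        ä composed:     U+00E4           -> C3 A4
        ä decomposed:   U+0061 U+0308    -> 61 CC 88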

    Mac OS X needs the second (decomposed) form for its file names.
    However, OS X does not decompose the ranges U+2000-U+2FFF, U+F900-U+FAFF and U+2F800-U+2FAFF.

    This is a quick and dirty hack to support at least the first page (Latin-1) characters.
    It will be extended to the 2nd and 3rd Unicode pages when I find time.

    [author:]
        Claus Gittinger

    [instance variables:]

    [class variables:]
        ComposeMap      maps a combining-mark code point to its string of (base, composed) character pairs
        DecomposeMap    maps a composed code point to an Array of (base code point, combining-mark code point)

    [see also:]
        http://developer.apple.com/library/mac/#qa/qa2001/qa1173.html

"
! !

!ISO10646_to_UTF8_MAC class methodsFor:'initialization'!

initializeDecomposeMap
    "set up DecomposeMap, which decomposes a character with a diacritic into its two components,
     and ComposeMap, which is used for the reverse direction"

    DecomposeMap := Dictionary new.
    ComposeMap := Dictionary new.

    #(
        "/ attention: the following strings contain non-ASCII characters
        "/ if you don't see them, switch to a font that provides them

        (16r0300 "grave"        'AÀaàEÈeèIÌiìoòOÒUÙuùNǸnǹÜǛüǜWẀwẁYỲyỳ')
        (16r0301 "acute"        'AÁaáEÉeéIÍiíOÓoóUÚuúyýYÝCĆcćNŃnńRŔrŕSŚsśZŹzźGǴgǵÆǼæǽØǾøǿÜǗüǘMḾmḿKḰkḱPṔpṕWẂwẃ')
        (16r0302 "circumflex"   'AÂaâEÊeêIÎiîOÔoôUÛuûCĈcĉGĜgĝHĤhĥJĴjĵSŜsŝWŴwŵYŶyŷZẐzẑ')
        (16r0303 "tilde"        'AÃaãNÑnñOÕoõUŨuũYỸyỹEẼeẽVṼvṽ')
        (16r0304 "macron"       'AĀaāEĒeēIĪiīOŌoōUŪuūÜǕüǖGḠgḡ')
        (16r0306 "breve"        'AĂaăEĔeĕGĞgğIĬiĭOŎoŏUŬuŭ')
        (16r0307 "dot above"    'AȦaȧOȮoȯCĊcċEĖeėGĠgġZŻzżBḂbḃDḊdḋFḞfḟHḢhḣMṀmṁNṄnṅPṖpṗRṘrṙSṠsṡTṪtṫWẆwẇXẊxẋYẎyẏ')
        (16r0308 "umlaut/trema" 'AÄaäEËeëOÖoöUÜuüIÏiïyÿYŸHḦhḧXẌxẍtẗ')
        (16r030A "ring"         'AÅaåUŮuůwẘyẙ')
        (16r030B "dbl acute"    'OŐoőUŰuű')
        (16r030C "caron"        'CČcčDĎEĚeěNŇnňRŘrřSŠsšZŽzžAǍaǎIǏiǐOǑoǒUǓuǔGǦgǧKǨkǩÜǙüǚ')
        (16r030F "dbl grave"    'AȀaȁEȄeȅIȈiȉOȌoȍRȐrȑUȔuȕ')
        (16r0311 "inv. breve"   'AȂaȃEȆeȇIȊiȋOȎoȏRȒrȓUȖuȗ')
        "/ K, L, N, R with cedilla decompose with 16r0327; S, T with comma below decompose with 16r0326
        (16r0326 "comma below"  'SȘsșTȚtț')
        (16r0327 "cedilla"      'CÇcçKĶkķLĻlļNŅnņRŖrŗTŢtţEȨeȩDḐdḑHḨhḩ')
        (16r0328 "ogonek"       'AĄaąEĘeęIĮiįOǪoǫUŲuų')
    ) do:[:eachPair |
        |composeCode mapping|

        composeCode := eachPair first.
        mapping := eachPair second.
        mapping pairWiseDo:[:baseChar :composedChar |
            "/ set up the entry so that, for example, we get
            "/    DecomposeMap at:"$à codePoint" 16rE0 put:#( "$a codePoint" 16r61 "grave codePoint" 16r0300).
            DecomposeMap
                at:composedChar codePoint
                put:(Array with:baseChar codePoint with:composeCode)
        ].

        ComposeMap at:composeCode put:mapping.
    ].
! !

!ISO10646_to_UTF8_MAC methodsFor:'encoding & decoding'!

compositionOf: baseChar with: diacriticalChar to: outStream
    "compose two characters into one and write the result to outStream;
     e.g. a + umlaut-diacritic-mark -> ä.
     If no composition is known, both characters are written unchanged."

    |cp map i|

    cp := diacriticalChar codePoint.
    map := ComposeMap at:cp ifAbsent:nil.

    map notNil ifTrue:[
        "/ compose
        i := map indexOf: baseChar.
        i ~~ 0 ifTrue:[
            outStream nextPut: (map at:i+1).
            ^ self.
        ].
    ].

    "/ leave as is (baseChar is nil if the mark has no preceding character)
    baseChar notNil ifTrue:[
        outStream nextPut: baseChar.
    ].
    outStream nextPut: diacriticalChar.
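
    "
     A usage sketch, in the style of the other examples in this file.
     It assumes that the maps have been set up first via initializeDecomposeMap,
     because this method does not initialize them itself.

     |s|
     ISO10646_to_UTF8_MAC initializeDecomposeMap.
     s := CharacterWriteStream on:''.
     ISO10646_to_UTF8_MAC new
         compositionOf:$a with:(Character value:16r0308) to:s.
     s contents
        -> 'ä'
    "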
!

decodeString:aStringOrByteCollection
    "return a Unicode string from the passed-in UTF-8-MAC encoded string.
     This is UTF-8 with composed characters decomposed
     (i.e. as separate codes, not as single combined characters).

     For now, here is a limited version, which should work
     at least for most European countries...
    "

    |s buff previous|

    s := super decodeString:aStringOrByteCollection.
    (s contains:[:char | char codePoint between:16r0300 and:16r0328]) ifFalse:[^ s].

    ComposeMap isNil ifTrue:[
        self class initializeDecomposeMap
    ].

    buff := CharacterWriteStream on:''.
    previous := nil.
    s do:[:each |
        "/ 16r0300..16r0328 covers all combining marks listed in initializeDecomposeMap
        (each codePoint between:16r0300 and:16r0328) ifTrue:[
            self compositionOf:previous with:each to:buff.
            previous := nil.
        ] ifFalse:[
            previous notNil ifTrue:[
                buff nextPut:previous.
            ].
            previous := each.
        ].
    ].
    previous notNil ifTrue:[
        buff nextPut:previous.
    ].
    ^ buff contents.

    "
     (ISO10646_to_UTF8 new encodeString:'aäoöuü') asByteArray
        -> #[97 195 164 111 195 182 117 195 188]

     (ISO10646_to_UTF8 new decodeString:
            (ISO10646_to_UTF8 new encodeString:'aäoöuü') asByteArray)

     (ISO10646_to_UTF8_MAC new encodeString:'aäoöuü') asByteArray
        -> #[97 97 204 136 111 111 204 136 117 117 204 136]

     (ISO10646_to_UTF8_MAC new decodeString:
            (ISO10646_to_UTF8_MAC new encodeString:'aäoöuü') asByteArray)
    "
!

decompositionOf: codePointIn into:outBlockWithTwoArgs
    "if required, decompose a character with a diacritic into a base character and a combining mark;
     e.g. ä -> a + umlaut-diacritic-mark.
     Pass both code points as args to the given block.
     For characters without a known decomposition, the block is not evaluated.
     Return true if a decomposition was done, false otherwise."

    |entry|

    codePointIn < 16rC0 ifTrue:[ ^ false ].

    entry := DecomposeMap at:codePointIn ifAbsent:nil.
    entry isNil ifTrue:[ ^ false ].

    outBlockWithTwoArgs value:(entry at:1) value:(entry at:2).
    ^ true
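
    "
     A usage sketch, in the style of the other examples in this file.
     It assumes that the maps have been set up first via initializeDecomposeMap,
     because this method does not initialize them itself.

     ISO10646_to_UTF8_MAC initializeDecomposeMap.
     ISO10646_to_UTF8_MAC new
         decompositionOf:16rE4
         into:[:base :mark |
             Transcript showCR:(base printString , ' ' , mark printString)
         ].

     16rE4 is the code point of ä; the block receives 97 (the code point of a)
     and 776 (16r0308, the combining diaeresis), and the method returns true.
    "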
!

encodeString:aUnicodeString
    "return the UTF-8-MAC representation of aUnicodeString.
     This is UTF-8 with composed characters decomposed (i.e. as separate codes, not as
     single combined characters).

     For now, here is a limited version, which should work
     at least for most European countries...
    "
dd28d3bda290 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   215
dd28d3bda290 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   216
    |gen s decomp codePoint composeCodePoint|
dd28d3bda290 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   217
dd28d3bda290 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   218
    DecomposeMap isNil ifTrue:[
dd28d3bda290 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   219
        self class initializeDecomposeMap
dd28d3bda290 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   220
    ].

    "/ gen: appends the UTF-8 byte sequence for one code point to the stream s
    gen :=
        [:codePointArg |
            |codePoint "{Class: SmallInteger }" b1 b2 b3 b4 b5 v "{Class: SmallInteger }"|

            codePoint := codePointArg.
            codePoint <= 16r7F ifTrue:[
                "/ 1 byte: 0xxxxxxx
                s nextPut:(Character value:codePoint).
            ] ifFalse:[
                b1 := Character value:((codePoint bitAnd:16r3F) bitOr:2r10000000).
                v := codePoint bitShift:-6.
                v <= 16r1F ifTrue:[
                    "/ 2 bytes: 110xxxxx 10xxxxxx
                    s nextPut:(Character value:(v bitOr:2r11000000)).
                    s nextPut:b1.
                ] ifFalse:[
                    b2 := Character value:((v bitAnd:16r3F) bitOr:2r10000000).
                    v := v bitShift:-6.
                    v <= 16r0F ifTrue:[
                        "/ 3 bytes: 1110xxxx + 2 continuation bytes
                        s nextPut:(Character value:(v bitOr:2r11100000)).
                        s nextPut:b2; nextPut:b1.
                    ] ifFalse:[
                        b3 := Character value:((v bitAnd:16r3F) bitOr:2r10000000).
                        v := v bitShift:-6.
                        v <= 16r07 ifTrue:[
                            "/ 4 bytes: 11110xxx + 3 continuation bytes
                            s nextPut:(Character value:(v bitOr:2r11110000)).
                            s nextPut:b3; nextPut:b2; nextPut:b1.
                        ] ifFalse:[
                            b4 := Character value:((v bitAnd:16r3F) bitOr:2r10000000).
                            v := v bitShift:-6.
                            v <= 16r03 ifTrue:[
                                "/ 5 bytes: 111110xx + 4 continuation bytes
                                s nextPut:(Character value:(v bitOr:2r11111000)).
                                s nextPut:b4; nextPut:b3; nextPut:b2; nextPut:b1.
                            ] ifFalse:[
                                b5 := Character value:((v bitAnd:16r3F) bitOr:2r10000000).
                                v := v bitShift:-6.
                                v <= 16r01 ifTrue:[
                                    "/ 6 bytes: 1111110x + 5 continuation bytes
                                    s nextPut:(Character value:(v bitOr:2r11111100)).
                                    s nextPut:b5; nextPut:b4; nextPut:b3; nextPut:b2; nextPut:b1.
                                ] ifFalse:[
                                    "/ cannot happen - we only support up to 30 bit characters
                                    self error:'ascii value > 31bit in utf8Encode'.
                                ]
                            ].
                        ].
                    ].
                ].
            ].
        ].

    "/ decomp: passed as the into: argument of decompositionOf:into: below;
    "/ stores the base and the combining code point it is called with
    decomp :=
        [:baseCodePointArg :composeCodePointArg |
            codePoint := baseCodePointArg. composeCodePoint := composeCodePointArg
        ].

    s := WriteStream on:(String uninitializedNew:aUnicodeString size).
    aUnicodeString do:[:eachCharacter |
        |needExtra|

        codePoint := eachCharacter codePoint.
        needExtra := self decompositionOf:codePoint into:decomp.
        gen value:codePoint.
        needExtra ifTrue:[
            gen value:composeCodePoint
        ].
    ].

    ^ s contents

    "
     (self encodeString:'hello') asByteArray                             #[104 101 108 108 111]
     (self encodeString:(Character value:16r40) asString) asByteArray    #[64]
     (self encodeString:(Character value:16r7F) asString) asByteArray    #[127]
     (self encodeString:(Character value:16r80) asString) asByteArray    #[194 128]
     (self encodeString:(Character value:16rFF) asString) asByteArray    #[195 191]

     (ISO10646_to_UTF8     new encodeString:'aäoöuü') asByteArray
        -> #[97 195 164 111 195 182 117 195 188]
     (ISO10646_to_UTF8_MAC new encodeString:'aäoöuü') asByteArray
        -> #[97 97 204 136 111 111 204 136 117 117 204 136]

     ISO10646_to_UTF8_MAC new decodeString:
         (ISO10646_to_UTF8_MAC new encodeString:'Packages aus VSE für Smalltalk_X') asByteArray
    "
! !

!ISO10646_to_UTF8_MAC methodsFor:'queries'!

nameOfEncoding
    ^ #'utf8-mac'
! !

!ISO10646_to_UTF8_MAC class methodsFor:'documentation'!

version
    ^ '$Header: /cvs/stx/stx/libbasic/CharacterEncoderImplementations__ISO10646_to_UTF8_MAC.st,v 1.7 2015-02-27 18:26:01 cg Exp $'
!

version_CVS
    ^ '$Header: /cvs/stx/stx/libbasic/CharacterEncoderImplementations__ISO10646_to_UTF8_MAC.st,v 1.7 2015-02-27 18:26:01 cg Exp $'
! !