# HG changeset patch
# User Stefan Vogel
# Date 1516369279 -3600
# Node ID c9dc532200c99944e7d4289c8753e9acbb8b4b4b
# Parent 6058c02151eb03fdc278f6200a543d840ca97c07
#REFACTORING by stefan
Refactor inheritance.
Provide #characterSize: for all private classes
class: CharacterEncoder
    added: #encodeCharacter: #readNext:charactersFrom:
    removed: #decode: #encode: #readNextInputCharacterFrom:
    comment/format in: #encodeString:
    changed: #decodeString: #encodeCharacter:on: #readNextCharacterFrom:
    category of: #encodeCharacter:on: #encodeString:on:
class: CharacterEncoder class
    removed: #decode: #encode:
    comment/format in: #decodeString: #decodeString:from: #encodeString: #encodeString:into: #encoderForUTF8 #encoderToEncodeFrom:into: #guessEncodingOfStream: #initializeEncodingDetectors
    changed: #encode:from:into: #encodeString:from:into: #guessEncodingOfFile:
class: CharacterEncoder::CompoundEncoder
    added: #characterSize: #readNext:charactersFrom: #readNextCharacterFrom:
    removed: #decode: #encode:
    comment/format in: #decodeString: #encodeString:
class: CharacterEncoder::DefaultEncoder
    class definition
class: CharacterEncoder::InverseEncoder
    added: #readNext:charactersFrom: #readNextCharacterFrom:
    removed: #decode: #encode: #readNextInputCharacterFrom:
    comment/format in: #decodeString: #encodeString:
class: CharacterEncoder::InverseEncoder class
    comment/format in: #documentation
class: CharacterEncoder::NullEncoder
    added: #readNext:charactersFrom: #readNextCharacterFrom:
    removed: #decode: #encode:
    comment/format in: #characterSize:
    changed: #decodeString:
class: CharacterEncoder::OtherEncoding class
    added: #isAbstract
    removed: #generateEncoderCode
    comment/format in: #flushCode
class: CharacterEncoder::TwoStepEncoder
    added: #readNext:charactersFrom: #readNextCharacterFrom:
    removed: #decode: #encode:
    comment/format in: #decodeString:

diff -r 6058c02151eb -r c9dc532200c9 CharacterEncoder.st
--- a/CharacterEncoder.st Fri Jan 19 12:01:41 2018 +0100
+++ b/CharacterEncoder.st Fri Jan 19 14:41:19 2018 +0100
@@ -32,7 +32,7 @@
     privateIn:CharacterEncoder
 !
 
-CharacterEncoder subclass:#DefaultEncoder
+CharacterEncoder subclass:#NullEncoder
     instanceVariableNames:''
     classVariableNames:''
     poolDictionaries:''
@@ -46,7 +46,7 @@
     privateIn:CharacterEncoder
 !
 
-CharacterEncoder subclass:#NullEncoder
+CharacterEncoder::NullEncoder subclass:#DefaultEncoder
     instanceVariableNames:''
     classVariableNames:''
     poolDictionaries:''
@@ -427,25 +427,25 @@
     ^ self encoderFor:#utf8
 
     "
-     self encoderForUTF8'
+     self encoderForUTF8
     "
 
-    "Modified (comment): / 27-02-2017 / 16:06:20 / stefan"
+    "Modified (comment): / 17-01-2018 / 13:07:31 / stefan"
 !
 
 encoderToEncodeFrom:oldEncodingArg into:newEncodingArg
     |oldEncoding newEncoding encoders encoderClasses encoder decoder clsName cls|
 
     oldEncoding := oldEncodingArg ? #unicode.
-    oldEncoding == #'iso10646-1' ifTrue:[ oldEncoding := #unicode].
+    oldEncoding == #'iso10646-1' ifTrue:[ oldEncoding := #unicode].
     newEncoding := newEncodingArg ? #unicode.
-    newEncoding == #'iso10646-1' ifTrue:[ newEncoding := #unicode].
+    newEncoding == #'iso10646-1' ifTrue:[ newEncoding := #unicode].
 
     oldEncoding = newEncoding ifTrue:[^ NullEncoderInstance].
     (oldEncoding match:newEncoding) ifTrue:[^ NullEncoderInstance].
 
     (oldEncoding = #unicode) ifTrue:[
-        "/ something -> unicode
+        "/ unicode -> something
         ^ self encoderFor:newEncoding.
     ].
 
@@ -472,16 +472,12 @@
     ].
 
     encoder isNil ifTrue:[
+        "/ something -> unicode
+        decoder := self encoderFor:oldEncoding.
         (newEncoding == #unicode) ifTrue:[
-            "/ something -> unicode
-            decoder := self encoderFor:oldEncoding.
             encoder := InverseEncoder new decoder:decoder.
         ] ifFalse:[
             "/ do it as: oldEncoding -> unicode -> newEncoding
-
-            "/ something -> unicode
-            decoder := self encoderFor:oldEncoding.
-
             "/ unicode -> something
             encoder := self encoderFor:newEncoding.
             encoder := CompoundEncoder new encoder:encoder decoder:decoder.
@@ -499,11 +495,14 @@
      CharacterEncoder encoderToEncodeFrom:#'koi8-r' into:#'mac-cyrillic'
      CharacterEncoder encoderToEncodeFrom:#'ms-arabic' into:#'mac-arabic'
      CharacterEncoder encoderToEncodeFrom:#'iso8859-5' into:#'koi8-r'
+     CharacterEncoder encoderToEncodeFrom:#'iso8859-5' into:#'unicode'
      CharacterEncoder encoderToEncodeFrom:#'koi8-r' into:#'koi8-u'
+     CharacterEncoder encoderToEncodeFrom:#'utf-8' into:#unicode
     "
 
     "Modified: / 12-07-2012 / 19:45:15 / cg"
-    "Modified: / 27-02-2017 / 16:49:14 / stefan"
+    "Modified: / 16-01-2018 / 17:11:17 / stefan"
+    "Modified (comment): / 17-01-2018 / 12:58:32 / stefan"
 ! !
 
 !CharacterEncoder class methodsFor:'Compatibility-ST80'!
@@ -780,58 +779,61 @@
 
 !CharacterEncoder class methodsFor:'encoding & decoding'!
 
-decode:aCodePoint
-    ^ self new decode:aCodePoint
-!
+decodeString:anEncodedStringOrByteCollection
+    ^ self new decodeString:anEncodedStringOrByteCollection
 
-decodeString:aString
-    ^ self new decodeString:aString
+    "
+     CharacterEncoderImplementations::ISO8859_1 decodeString:'hello'
+     CharacterEncoderImplementations::ISO8859_1 decodeString:'hello' asByteArray
+    "
+
+    "Modified (comment): / 17-01-2018 / 13:44:41 / stefan"
 !
 
 decodeString:aString from:oldEncoding
-    ^ self encodeString:aString from:oldEncoding into:#'unicode'
-!
-
-encode:aCodePoint
-    ^ self new encode:aCodePoint
+    ^ self encodeString:aString from:oldEncoding into:#unicode
 
     "
-     ISO8859_1 encode:16r00FF
-     ISO8859_1 decodeString:'hello'
-     ISO8859_1 encodeString:(ISO8859_1 decodeString:'hello')
+     self encodeString:'hello' into:#ebcdic
 
-     ISO8859_5 decodeString:(String
-                    with:(Character value:16rE4)
-                    with:(Character value:16rE0))
+     self decodeString:(self encodeString:'hello' into:#ebcdic) from:#ebcdic
     "
+
+    "Modified (format): / 17-01-2018 / 15:47:00 / stefan"
 !
 
 encode:codePoint from:oldEncodingArg into:newEncodingArg
     |oldEncoding newEncoding encoder|
 
-    oldEncoding := oldEncodingArg ? #'unicode'.
-    oldEncoding == #'iso10646-1' ifTrue:[ oldEncoding := #'unicode'].
-    newEncoding := newEncodingArg ? #'unicode'.
-    newEncoding == #'iso10646-1' ifTrue:[ newEncoding := #'unicode'].
+    oldEncodingArg == newEncodingArg ifTrue:[
+        ^ codePoint
+    ].
+    oldEncoding := oldEncodingArg.
+    newEncoding := newEncodingArg.
 
-    oldEncoding == newEncoding ifTrue:[^ codePoint].
+    (oldEncoding isNil or:[oldEncoding == #'iso10646-1' or:[oldEncoding == #'ms-default']]) ifTrue:[
+        oldEncoding := #unicode
+    ].
 
-    oldEncoding == #'unicode' ifTrue:[
-        newEncoding == #'iso8859-1' ifTrue:[
-            codePoint <= 16rFF ifTrue:[
-                ^ codePoint
-            ]
-        ]
+    (newEncoding isNil or:[newEncoding == #'iso10646-1' or:[newEncoding == #'ms-default']]) ifTrue:[
+        newEncoding := #unicode.
+    ].
+
+    oldEncoding == newEncoding ifTrue:[
+        ^ codePoint
     ].
-    newEncoding == #'unicode' ifTrue:[
-        oldEncoding == #'iso8859-1' ifTrue:[
-            codePoint <= 16rFF ifTrue:[
-                ^ codePoint
-            ]
-        ]
+
+    (oldEncoding == #unicode and:[newEncoding == #'iso8859-1' and:[codePoint <= 16rFF]]) ifTrue:[
+        ^ codePoint
     ].
+    (newEncoding == #unicode and:[oldEncoding == #'iso8859-1' and:[codePoint <= 16rFF]]) ifTrue:[
+        ^ codePoint
+    ].
+
     encoder := self encoderToEncodeFrom:oldEncoding into:newEncoding.
     ^ encoder encode:codePoint.
+
+    "Modified: / 17-01-2018 / 14:33:08 / stefan"
 !
 
 encodeString:aUnicodeString
@@ -840,53 +842,66 @@
     ^ self new encodeString:aUnicodeString
 
     "
-     ISO8859_1 decodeString:'hello'
+     CharacterEncoderImplementations::ISO8859_1 encodeString:'hello'
     "
+
+    "Modified (comment): / 16-01-2018 / 21:57:35 / stefan"
 !
 
 encodeString:aString from:oldEncodingArg into:newEncodingArg
     |oldEncoding newEncoding encoder|
 
-    "/ some hard coded aliases
-    oldEncoding := oldEncodingArg ? #'unicode'.
-    oldEncoding == #'iso10646-1' ifTrue:[ oldEncoding := #'unicode'].
-    oldEncoding == #'ms-default' ifTrue:[ oldEncoding := #'unicode'].
+    oldEncodingArg == newEncodingArg ifTrue:[
+        ^ aString
+    ].
 
-    newEncoding := newEncodingArg ? #'unicode'.
-    newEncoding == #'iso10646-1' ifTrue:[ newEncoding := #'unicode'].
-    newEncoding == #'ms-default' ifTrue:[ newEncoding := #'unicode'].
+    oldEncoding := oldEncodingArg.
+    newEncoding := newEncodingArg.
+    "/ some hard coded aliases
+    (oldEncoding isNil or:[oldEncoding == #'iso10646-1' or:[oldEncoding == #'ms-default']]) ifTrue:[
+        oldEncoding := #'unicode'
+    ].
 
-    oldEncoding == newEncoding ifTrue:[^ aString].
+    (newEncoding isNil or:[newEncoding == #'iso10646-1' or:[newEncoding == #'ms-default']]) ifTrue:[
+        newEncoding := #'unicode'
+    ].
+
+    oldEncoding == newEncoding ifTrue:[
+        ^ aString
+    ].
 
     "/ for single-byte strings, iso8859-1 and unicode (up to FF) have the same encoding
-    oldEncoding == #'unicode' ifTrue:[
-        (newEncoding == #'iso8859-1') ifTrue:[
-            aString isWideString ifFalse:[
-                ^ aString
-            ]
-        ].
+    (oldEncoding == #unicode and:[newEncoding == #'iso8859-1' and:[aString isWideString not]]) ifTrue:[
+        ^ aString
     ].
-    newEncoding == #'unicode' ifTrue:[
-        (oldEncoding == #'iso8859-1') ifTrue:[
-            aString isWideString ifFalse:[
-                ^ aString
-            ]
-        ]
+    (newEncoding == #unicode and:[oldEncoding == #'iso8859-1' and:[aString isWideString not]]) ifTrue:[
+        ^ aString
     ].
 
     encoder := self encoderToEncodeFrom:oldEncoding into:newEncoding.
     ^ encoder encodeString:aString.
+
+    "
+     self encodeString:(self encodeString:'hello' into:#ebcdic) from:#ebcdic into:#ascii
+     self encodeString:(self encodeString:'hello' into:#ebcdic) from:#ebcdic into:#unicode
+     self encodeString:(self encodeString:'Äh ... hello' into:#ebcdic) from:#ebcdic into:#utf8
+    "
+
+    "Modified (comment): / 17-01-2018 / 15:49:40 / stefan"
 !
 
 encodeString:aString into:newEncoding
-    ^ self encodeString:aString from:#'unicode' into:newEncoding
+    ^ self encodeString:aString from:#unicode into:newEncoding
 
     "
      self encodeString:'hello' into:#ebcdic
      self encodeString:(self encodeString:'hello' into:#ebcdic) from:#ebcdic into:#ascii
      self encodeString:(self encodeString:'hello' into:#ebcdic) from:#ebcdic into:#unicode
+     self encodeString:(self encodeString:'hello' into:#ebcdic) from:#ebcdic into:#utf8
     "
+
+    "Modified (comment): / 17-01-2018 / 15:48:07 / stefan"
 ! !
 
 !CharacterEncoder class methodsFor:'private'!
@@ -1103,13 +1118,13 @@
      If that's not found, use heuristics (in CharacterArray) to guess.
      Return a symbol like #utf8."
 
-    |s buffer n "{Class: SmallInteger }"|
+    |s buffer|
 
     s := aFilename asFilename readStreamOrNil.
     s isNil ifTrue:[^ nil].
 
     buffer := String new:512.
-    n := s nextBytes:buffer size into:buffer.
+    s nextBytes:buffer size into:buffer.
     s close.
 
     ^ self guessEncodingOfBuffer:buffer.
@@ -1121,6 +1136,7 @@
     "
 
     "Modified: / 31-05-2011 / 15:45:19 / cg"
+    "Modified: / 16-01-2018 / 17:12:41 / stefan"
 !
 
 guessEncodingOfStream:aStream
@@ -1131,20 +1147,24 @@
      in the first few bytes of aStream.
      Return a symbol like #utf8."
 
-    |oldPosition buffer n|
+    |oldPosition buffer|
 
     "/ must be able to position back
-    aStream isPositionable ifFalse:[^ nil].
+    aStream isPositionable ifFalse:[
+        ^ nil
+    ].
 
     buffer := String new:512.
 
     oldPosition := aStream position.
-    n := aStream nextBytes:buffer size into:buffer.
+    aStream nextBytes:buffer size into:buffer.
     aStream position:oldPosition.
 
     ^ self guessEncodingOfBuffer:buffer
 
     "Modified: / 31-05-2011 / 15:45:23 / cg"
+    "Modified: / 16-01-2018 / 17:12:57 / stefan"
+    "Modified (format): / 17-01-2018 / 15:51:09 / stefan"
 !
 
 initializeEncodingDetectors
@@ -1202,7 +1222,7 @@
     "check for an inline encoding markup (charset= / encoding=) substring"
 
     EncodingDetectors add:[:buffer |
-        |guess lcBuffer quote peek|
+        |guess lcBuffer quote|
 
         lcBuffer := buffer asLowercase.
 
@@ -1213,8 +1233,7 @@
             guess isNil ifTrue:[
                 (idx := lcBuffer findString:keyWord) ~~ 0 ifTrue:[
                     s := ReadStream on:buffer.
-                    s position:idx-1.
-                    s skip:keyWord size.
+                    s position:idx-1 + keyWord size.
                     s skipSeparators.
 
                     "do not include '=' here, otherwise
@@ -1251,13 +1270,13 @@
     "/ check for JIS7 encoding
 
     EncodingDetectors add:[:buffer |
-        (buffer findString:self jisISO2022EscapeSequence) ~~ 0 ifTrue:[
+        (buffer includesString:self jisISO2022EscapeSequence) ifTrue:[
             #'iso2020-jp'
         ] ifFalse:[
-            (buffer findString:self jis7KanjiEscapeSequence) ~~ 0 ifTrue:[
+            (buffer includesString:self jis7KanjiEscapeSequence) ifTrue:[
                 #jis7
             ] ifFalse:[
-                (buffer findString:self jis7KanjiOldEscapeSequence) ~~ 0 ifTrue:[
+                (buffer includesString:self jis7KanjiOldEscapeSequence) ifTrue:[
                     #jis7
                 ] ifFalse:[
                     nil
@@ -1287,6 +1306,8 @@
 "/ "/ look for SJIS ...
 "/ ]
 "/ ].
+
+    "Modified: / 17-01-2018 / 15:55:36 / stefan"
 !
 
 showCharacterSet
@@ -1318,98 +1339,30 @@
 
 !CharacterEncoder methodsFor:'encoding & decoding'!
 
-decode:anEncoding
-    "given an integer in my encoding, return a unicode codePoint for it"
-
-    self subclassResponsibility
-!
-
-decodeString:anEncodedString
+decodeString:anEncodedStringOrByteCollection
     "given a string in my encoding, return a unicode-string for it"
 
-    |newString myCode uniCodePoint bits|
-
-    newString := String new:(anEncodedString size).
-    bits := newString bitsPerCharacter.
+    ^ self subclassResponsibility
 
-    1 to:anEncodedString size do:[:idx |
-        uniCodePoint := (anEncodedString at:idx) codePoint.
-        myCode := self decode:uniCodePoint.
-        myCode > 16rFF ifTrue:[
-            myCode > 16rFFFF ifTrue:[
-                bits < 32 ifTrue:[
-                    newString := Unicode32String fromString:newString.
-                    bits := 32.
-                ]
-            ] ifFalse:[
-                bits < 16 ifTrue:[
-                    newString := Unicode16String fromString:newString.
-                    bits := 16.
-                ]
-            ]
-        ].
-        newString at:idx put:(Character value:myCode).
-    ].
-    ^ newString
-
-    "
-     ISO8859_1 decodeString:'hello'
-    "
+    "Modified: / 16-01-2018 / 19:54:51 / stefan"
+    "Modified (format): / 17-01-2018 / 13:45:06 / stefan"
 !
 
-encode:aCodePoint
-    "given a codePoint in unicode, return a byte in my encoding for it"
-
-    self subclassResponsibility
-!
+encodeCharacter:aUnicodeCharacterOrCodePoint
+    "encode aUnicodeCharacterOrCodePoint to a (8-bit) String or ByteArray"
 
-encodeCharacter:aUnicodeCharacter on:aStream
-    "given a character in unicode, encode it onto aStream.
-     Subclasses can redefine this to avoid allocating many new string instances."
+    ^ self encodeString:aUnicodeCharacterOrCodePoint asString.
 
-    aStream nextPutAll:(self encodeString:aUnicodeCharacter asString).
-
-    "Created: / 16-02-2017 / 16:18:33 / stefan"
-    "Modified (comment): / 16-02-2017 / 20:41:18 / stefan"
+    "Created: / 17-01-2018 / 13:59:44 / stefan"
 !
 
 encodeString:aUnicodeString
-    "given a string in unicode, return a string in my encoding for it"
-
-    |newString myCode uniCodePoint bits
-     stringSize "{ Class: SmallInteger }"|
-
-    stringSize := aUnicodeString size.
-    newString := String new:stringSize.
-    bits := newString bitsPerCharacter.
+    "given a string in unicode, return a string or ByteArray in my encoding for it"
 
-    1 to:stringSize do:[:idx |
-        uniCodePoint := (aUnicodeString at:idx) codePoint.
-        myCode := self encode:uniCodePoint.
-        myCode > 16rFF ifTrue:[
-            myCode > 16rFFFF ifTrue:[
-                bits < 32 ifTrue:[
-                    newString := Unicode32String fromString:newString.
-                    bits := 32.
-                ]
-            ] ifFalse:[
-                bits < 16 ifTrue:[
-                    newString := Unicode16String fromString:newString.
-                    bits := 16.
-                ]
-            ]
-        ].
-        newString at:idx put:(Character value:myCode).
-    ].
-    ^ newString
-!
+    ^ self subclassResponsibility
 
-encodeString:aUnicodeString on:aStream
-    "given a string in unicode, encode it onto aStream.
-     Subclasses can redefine this to avoid allocating many new string instances.
-     (but must then also redefine encodeString:aUnicodeString to collect the characters)"
-
-    aStream nextPutAll:(self encodeString:aUnicodeString).
+    "Modified: / 16-01-2018 / 19:54:44 / stefan"
+    "Modified (comment): / 17-01-2018 / 13:54:44 / stefan"
 ! !
 
 !CharacterEncoder methodsFor:'error handling'!
@@ -1519,27 +1472,45 @@
 
 !CharacterEncoder methodsFor:'stream support'!
 
-readNext:charactersToRead charactersFrom:stream
-    ^ self decodeString:(stream next:charactersToRead)
+encodeCharacter:aUnicodeCharacter on:aStream
+    "given a character in unicode, encode it onto aStream.
+     Subclasses can redefine this to avoid allocating many new string instances."
+
+    aStream nextPutAll:(self encodeCharacter:aUnicodeCharacter).
+
+    "Created: / 16-02-2017 / 16:18:33 / stefan"
+    "Modified: / 17-01-2018 / 14:00:28 / stefan"
+!
+
+encodeString:aUnicodeString on:aStream
+    "given a string in unicode, encode it onto aStream.
+     Subclasses can redefine this to avoid allocating many new string instances.
+     (but must then also redefine encodeString:aUnicodeString to collect the characters)"
+
+    aStream nextPutAll:(self encodeString:aUnicodeString).
+!
+
+readNext:countArg charactersFrom:aStream
+    |writeStream count "{ Class:SmallInteger }"|
+
+    count := countArg.
+    writeStream := CharacterWriteStream on:(String new:count).
+    count timesRepeat:[
+        writeStream nextPut:(self readNextCharacterFrom:aStream).
+    ].
+    ^ writeStream contents.
+
+    "Created: / 16-01-2018 / 20:08:10 / stefan"
+    "Modified: / 17-01-2018 / 16:44:29 / stefan"
 !
 
 readNextCharacterFrom:aStream
-
-    | c |
-
-    c := aStream next.
-
-    ^ c isNil
-        ifTrue: [nil]
-        ifFalse: [(self decode:c asInteger) asCharacter]
+    ^ self subclassResponsibility
 
     "Created: / 14-06-2005 / 17:03:21 / janfrog"
     "Modified: / 15-06-2005 / 15:27:49 / janfrog"
     "Modified: / 20-06-2005 / 13:13:52 / masca"
-!
-
-readNextInputCharacterFrom:aStream
-    ^ aStream next
+    "Modified: / 16-01-2018 / 20:12:07 / stefan"
 ! !
 
 !CharacterEncoder::CompoundEncoder class methodsFor:'documentation'!
@@ -1572,20 +1543,16 @@
 
 !CharacterEncoder::CompoundEncoder methodsFor:'encoding & decoding'!
 
-decode:aCode
-    ^ decoder encode:(encoder decode:aCode)
+decodeString:anEncodedStringOrByteCollection
+    ^ decoder encodeString:(encoder decodeString:anEncodedStringOrByteCollection)
+
+    "Modified (format): / 17-01-2018 / 13:44:08 / stefan"
 !
 
-decodeString:aString
-    ^ decoder encodeString:(encoder decodeString:aString)
-!
+encodeString:anEncodedStringOrByteCollection
+    ^ encoder encodeString:(decoder decodeString:anEncodedStringOrByteCollection)
 
-encode:aCode
-    ^ encoder encode:(decoder decode:aCode)
-!
-
-encodeString:aString
-    ^ encoder encodeString:(decoder decodeString:aString)
+    "Modified (format): / 17-01-2018 / 13:46:26 / stefan"
 ! !
 
 !CharacterEncoder::CompoundEncoder methodsFor:'printing'!
@@ -1600,20 +1567,93 @@
     encoder printOn:aStream
 ! !
 
-!CharacterEncoder::DefaultEncoder class methodsFor:'documentation'!
+!CharacterEncoder::CompoundEncoder methodsFor:'queries'!
+
+characterSize:aCharacterOrCodepoint
+    "return the number of bytes required to encode aCharacterOrCodepoint"
+
+    ^ encoder characterSize:(decoder decode:aCharacterOrCodepoint)
+
+    "Created: / 16-01-2018 / 17:58:51 / stefan"
+! !
+
+!CharacterEncoder::CompoundEncoder methodsFor:'stream support'!
+
+readNext:count charactersFrom:aStream
+    ^ decoder encodeString:(encoder readNext:count charactersFrom:aStream) asString
+
+    "Created: / 16-01-2018 / 20:50:56 / stefan"
+!
+
+readNextCharacterFrom:aStream
+    ^ (decoder encodeString:(encoder readNextCharacterFrom:aStream) asString) first
+
+    "Created: / 16-01-2018 / 21:10:28 / stefan"
+! !
+
+!CharacterEncoder::NullEncoder class methodsFor:'documentation'!
 
 documentation
 "
-    That is only a dummy for ST80 compatibility
+    A NullEncoder does nothing.
 "
 ! !
 
+!CharacterEncoder::NullEncoder methodsFor:'encoding & decoding'!
+
+decodeString:anEncodedStringOrByteCollection
+    ^ anEncodedStringOrByteCollection asString
+
+    "Modified: / 17-01-2018 / 13:43:42 / stefan"
+!
+
+encodeString:aString
+    ^ aString
+! !
+
+!CharacterEncoder::NullEncoder methodsFor:'queries'!
+
+characterSize:charOrCodePoint
+    "return the number of bytes required to encode aCharacterOrCodepoint"
+
+    ^ charOrCodePoint asCharacter bytesPerCharacter
+
+    "
+     NullEncoder basicNew characterSize:$a codePoint
+     NullEncoder basicNew characterSize:16r3fe
+     NullEncoder basicNew characterSize:16r3ffe
+    "
+
+    "Modified (comment): / 16-01-2018 / 21:15:01 / stefan"
+!
+
+isNullEncoder
+    ^ true
+! !
+
+!CharacterEncoder::NullEncoder methodsFor:'stream support'!
+
+readNext:count charactersFrom:aStream
+    ^ (aStream next:count) asString
+
+    "Created: / 16-01-2018 / 20:19:38 / stefan"
+!
+
+readNextCharacterFrom:aStream
+    ^ aStream next asCharacter
+
+    "Created: / 16-01-2018 / 20:04:01 / stefan"
+! !
+
 !CharacterEncoder::InverseEncoder class methodsFor:'documentation'!
 
 documentation
 "
-    An inverseEncoder does the inverse - i.e. encode is really a decode
+    An InverseEncoder does the inverse - i.e. encode is really a decode
     and decode is really an encode.
+
+    InverseEncoder is always used to encode to unicode and decode from unicode
+    (see CharacterEncoder class >> #encoderToEncodeFrom:into:).
 "
 ! !
 
@@ -1625,20 +1665,16 @@
 
 !CharacterEncoder::InverseEncoder methodsFor:'encoding & decoding'!
 
-decode:aCode
-    ^ decoder encode:aCode
+decodeString:anEncodedStringOrByteCollection
+    ^ decoder encodeString:anEncodedStringOrByteCollection
+
+    "Modified (format): / 17-01-2018 / 13:43:57 / stefan"
 !
 
-decodeString:aString
-    ^ decoder encodeString:aString
-!
+encodeString:anEncodedStringOrByteCollection
+    ^ decoder decodeString:anEncodedStringOrByteCollection
 
-encode:aCode
-    ^ decoder decode:aCode
-!
-
-encodeString:aString
-    ^ decoder decodeString:aString
+    "Modified (format): / 17-01-2018 / 13:46:47 / stefan"
 ! !
 
 !CharacterEncoder::InverseEncoder methodsFor:'printing'!
@@ -1660,58 +1696,47 @@
 
 !CharacterEncoder::InverseEncoder methodsFor:'stream support'!
 
-readNextInputCharacterFrom:aStream
-    ^ decoder readNextInputCharacterFrom:aStream
+readNext:count charactersFrom:aStream
+    "decode the next count bytes or characters on aStream from unicode to something else"
+
+    ^ decoder encodeString:(aStream next:count).
+
+    "Created: / 16-01-2018 / 20:53:42 / stefan"
+    "Modified (comment): / 17-01-2018 / 13:28:41 / stefan"
+!
+
+readNextCharacterFrom:aStream
+    "decode the next byte or character on aStream from unicode to something else"
+
+    ^ decoder encodeString:(Array with:aStream next).
+
+    "Created: / 16-01-2018 / 21:08:11 / stefan"
+    "Modified: / 17-01-2018 / 13:29:59 / stefan"
 ! !
 
-!CharacterEncoder::NullEncoder class methodsFor:'documentation'!
+!CharacterEncoder::DefaultEncoder class methodsFor:'documentation'!
 
 documentation
 "
-    A NullEncoder does nothing.
+    That is only a dummy for ST80 compatibility
 "
 ! !
 
-!CharacterEncoder::NullEncoder methodsFor:'encoding & decoding'!
-
-decode:aCode
-    ^ aCode
-!
-
-decodeString:aString
-    ^ aString
-!
-
-encode:aCode
-    ^ aCode
-!
-
-encodeString:aString
-    ^ aString
-! !
-
-!CharacterEncoder::NullEncoder methodsFor:'queries'!
-
-characterSize:charOrCodePoint
-    ^ charOrCodePoint asCharacter bytesPerCharacter
-
-    "
-     NullEncoder basicNew characterSize:$a codePoint
-     NullEncoder basicNew characterSize:16r3fe
-     NullEncoder basicNew characterSize:16r3ffe
-    "
-!
-
-isNullEncoder
-    ^ true
-! !
-
 !CharacterEncoder::OtherEncoding class methodsFor:'private'!
 
 flushCode
-!
+    "do nothing here"
+
+    "Modified (comment): / 16-01-2018 / 17:08:17 / stefan"
+! !
 
-generateEncoderCode
+!CharacterEncoder::OtherEncoding class methodsFor:'testing'!
+
+isAbstract
+    ^ self == CharacterEncoder::OtherEncoding
+
+    "Created: / 17-01-2018 / 16:06:13 / stefan"
+    "Modified: / 17-01-2018 / 17:50:37 / stefan"
 ! !
 
 !CharacterEncoder::TwoStepEncoder class methodsFor:'documentation'!
@@ -1737,16 +1762,10 @@
 
 !CharacterEncoder::TwoStepEncoder methodsFor:'encoding & decoding'!
 
-decode:aCode
-    ^ encoder1 decode:(encoder2 decode:aCode)
-!
+decodeString:anEncodedStringOrByteCollection
+    ^ encoder1 decodeString:(encoder2 decodeString:anEncodedStringOrByteCollection)
 
-decodeString:aString
-    ^ encoder1 decodeString:(encoder2 decodeString:aString)
-!
-
-encode:aCode
-    ^ encoder2 encode:(encoder1 encode:aCode)
+    "Modified (format): / 17-01-2018 / 13:45:20 / stefan"
 !
 
 encodeString:aString
@@ -1795,6 +1814,20 @@
 
 ! !
 
+!CharacterEncoder::TwoStepEncoder methodsFor:'stream support'!
+
+readNext:count charactersFrom:aStream
+    ^ encoder1 decodeString:(encoder2 readNext:count charactersFrom:aStream)
+
+    "Created: / 16-01-2018 / 20:47:52 / stefan"
+!
+
+readNextCharacterFrom:aStream
+    ^ (encoder1 decodeString:(encoder2 readNextCharacterFrom:aStream) asString) first
+
+    "Created: / 16-01-2018 / 21:06:48 / stefan"
+! !
+
 !CharacterEncoder class methodsFor:'documentation'!
 
 version