transforms/Xtreams__EncodeReadStream.st
author joe
Tue, 19 Mar 2013 23:58:40 -0400
changeset 116 fa5b4c9f582d
parent 111 44ac233b2f83
permissions -rw-r--r--
- Xtreams::EncodeReadStream class: Xtreams::EncodeReadStream changed: #get #read:into:at: - Xtreams::ASCIIEncoder class: Xtreams::ASCIIEncoder - Xtreams::TransformReadStream class: Xtreams::TransformReadStream - Xtreams::ObjectReadStream class: Xtreams::ObjectReadStream - Xtreams::InterpretedReadStream class: Xtreams::InterpretedReadStream - stx_goodies_xtreams_transforms class: stx_goodies_xtreams_transforms - Xtreams::DuplicateWriteStream class: Xtreams::DuplicateWriteStream - Xtreams::EncodeWriteStream class: Xtreams::EncodeWriteStream changed: #setLineEndCRLF #setLineEndLF - Xtreams::ObjectAnalyseStream class: Xtreams::ObjectAnalyseStream - Xtreams::ObjectWriteStream class: Xtreams::ObjectWriteStream - Xtreams::TransformWriteStream class: Xtreams::TransformWriteStream - Xtreams::CollectReadStream class: Xtreams::CollectReadStream - Xtreams::InterpretedWriteStream class: Xtreams::InterpretedWriteStream - Xtreams::DuplicateReadStream class: Xtreams::DuplicateReadStream - Xtreams::ObjectMarshaler class: Xtreams::ObjectMarshaler - Xtreams::ISO8859L1Encoder class: Xtreams::ISO8859L1Encoder - Xtreams::CollectWriteStream class: Xtreams::CollectWriteStream - Xtreams::MessagePackMarshaler class: Xtreams::MessagePackMarshaler - extensions ...
10
3813193bdf4e first cut
Martin Kobetic <mkobetic@gmail.com>
parents:
diff changeset
"{ Package: 'stx:goodies/xtreams/transforms' }"

"{ NameSpace: Xtreams }"

ReadStream subclass:#EncodeReadStream
	instanceVariableNames:'transparent crPreceeding encoder buffer bufferWriting
		bufferReading'
	classVariableNames:''
111
44ac233b2f83 * removed namespace from pool references and stray extension methods
joe
parents: 97
diff changeset
	poolDictionaries:'XtreamsPool'
27
2cc5a8a3ca14 added XtreamsPool to fix DefaultBufferSize; set proper category names
mkobetic
parents: 10
diff changeset
	category:'Xtreams-Transforms'
!

EncodeReadStream comment:'Converts bytes into characters using a pre-configured encoding. When set to lineEndAuto (the default), it also performs line-end translation, converting any line-end convention (CR, LF or CRLF) into a single LF. The source stream must provide bytes (0...255).

Instance Variables
	transparent	<Boolean> when true, line ends pass through untranslated; when false (the default), they are translated
	crPreceeding	<Boolean> whether the previously read character was a CR (used when not transparent)
	encoder	<Encoder> converts bytes to characters
	buffer	<Buffer on: ByteArray> used to optimize bulk reads
	bufferWriting	<WriteStream> write stream on buffer
	bufferReading	<ReadStream> read stream on buffer

'
!


!EncodeReadStream class methodsFor:'instance creation'!

on: aSource encoding: anEncoding
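	"Answer a new stream decoding the bytes of aSource into characters.
	 Usage sketch, not taken from the original sources. It assumes that the
	 Xtreams core messages #reading and #rest are loaded and that an encoder
	 is registered for #ascii; neither is shown in this file:

		| stream |
		stream := EncodeReadStream on: #[104 105 13 10] reading encoding: #ascii.
		stream rest.

	 The class comment suggests the CR LF pair at the end is read as one
	 line end. Send #setLineEndTransparent first to keep both characters."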
	^self new on: aSource encoding: anEncoding
! !

!EncodeReadStream methodsFor:'accessing'!

encoder

	^encoder
!

get
        | character |
        buffer hasDataToRead ifTrue: [^super get].
        character := encoder decodeFrom: source.
        transparent ifTrue: [ ^character ].
        character == LF
           ifTrue: [crPreceeding
                ifTrue:
                        [character := encoder decodeFrom: source.
                        crPreceeding := character = CR]
                ifFalse:
                        [crPreceeding := false.
                        character := Character cr]]
            ifFalse: [crPreceeding := character = CR].
        ^character == CR ifTrue: [ LF ] ifFalse: [ character ]
!

read: anInteger into: aSequenceableCollection at: startIndex

        | remaining position character bufferAvailable |
        remaining := anInteger.
        position := startIndex.
        [remaining > 0] whileTrue: [
                | mark |
                "Top up our buffer if we have room and we need data"
                [bufferWriting write: (buffer writeSize min: remaining) from: source] on: Incomplete do: [:incomplete |
                        (incomplete count == 0 and: [buffer hasDataToRead not]) ifTrue: [
                                (Incomplete on: aSequenceableCollection count: anInteger - remaining at: startIndex) raise]].

                "We now conduct an inner loop that iterates over the buffer data while:
                        a) we need to read more data
                        b) there is data available in the buffer
                        c) a character can successfully be decoded
                "

                "If the buffer holds too little data before we begin the decode loop, take an undo copy in case we cannot decode a character."
                buffer readSize < 10 ifTrue:
                        [mark := buffer readPosition.
                        encoder backupState ].

                [["The following may raise an Incomplete, which means we do not have enough data in the buffer to decode the full character.
                 This is handled by the Incomplete handler below."
                character := encoder decodeFrom: bufferReading.

                "If we are not transparent, convert stray LFs into CRs and CRLFs into CRs"
                transparent ifFalse: [
                        character == LF
                                ifTrue: [character := crPreceeding ifTrue: [nil] ifFalse: [CR]. crPreceeding := false]
                                ifFalse:        [crPreceeding := character = CR]].

                "If we did not filter out an LF at the tail of a CRLF, commit the character to the output."
                character == nil ifFalse: [
                        "When translating, a CR is committed as LF."
                        (transparent or: [ character ~~ CR ]) ifFalse: [ character := LF ].
                        aSequenceableCollection at: position put: character.
                        remaining := remaining - 1.
                        position := position + 1].

                "Find out how much data is left in the buffer. If it is too low, keep track of the undo record in case we cannot decode a character."
                (bufferAvailable := buffer readSize) < 10 ifTrue:
                        [mark := buffer readPosition.
                        encoder backupState ].

                remaining > 0 and: [bufferAvailable > 0]] whileTrue]
                        on: Incomplete do: [:incomplete |
                                "We failed to decode a character: we have hit the end of the buffer and need to refill it. We rewind the buffer and leave the decoding loop
                                 to return to the main loop, where more data will be fetched into our buffer."
                                buffer readPosition: mark.
                                encoder restoreState]].
        ^anInteger
! !

!EncodeReadStream methodsFor:'initialize-release'!

close
	super close.
	buffer recycle.
	buffer := nil
!

contentsSpecies
90
mkobetic
parents: 72
diff changeset
        ^encoder contentsSpecies
!

on: aSource encoding: anEncoding

	super on: aSource.
	encoder := Encoder for: anEncoding.
	buffer := RingBuffer new: DefaultBufferSize class: ByteArray.
	bufferReading := buffer reading.
	bufferWriting := buffer writing.
	transparent := false.
	crPreceeding := false.
! !

!EncodeReadStream methodsFor:'line-end'!

setLineEndAuto
	"Translate CR, LF and CRLF line ends into a single line end while reading (the default)."

	transparent := false
!

setLineEndTransparent
	"Pass line-end characters through unchanged."

	transparent := true
! !

!EncodeReadStream class methodsFor:'documentation'!

version_HG

    ^ '$Changeset: <not expanded> $'
!

version_SVN
    ^ '$Id$'
! !