HTMLUtilities.st
author Claus Gittinger <cg@exept.de>
Sat, 02 May 2020 21:40:13 +0200
changeset 5476 7355a4b11cb6
parent 5430 fa33520af010
permissions -rw-r--r--
#FEATURE by cg class: Socket class added: #newTCPclientToHost:port:domain:domainOrder:withTimeout: changed: #newTCPclientToHost:port:domain:withTimeout:
"{ Encoding: utf8 }"

"
 COPYRIGHT (c) 2007 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
"{ Package: 'stx:libbasic2' }"

"{ NameSpace: Smalltalk }"

Object subclass:#HTMLUtilities
	instanceVariableNames:''
	classVariableNames:'AmpersandEscapes EscapeControlCharacters HtmlEntityToCharacter
		MathAmpersandEscapes'
	poolDictionaries:''
	category:'Net-Communication-Support'
!

!HTMLUtilities class methodsFor:'documentation'!

copyright
"
 COPYRIGHT (c) 2007 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
!

documentation
"
    Collected support functions to deal with HTML.
    Used both by HTML generators (DocGenerator), HTMLParsers and the webServer.
    Therefore, it has been put into libbasic2.
"
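!

examples
"
    Usage sketch, added for illustration only; it is not part of the original checkin.
    It assumes that #ampersandEscapes (see the 'constants' category) answers the
    lazily built table keyed by entity-name symbols such as #amp or #copy.
    Whether the stored values are integers or Characters is not shown in this
    listing, so the sketch only tests for the presence of a key.

                                                        [exBegin]
    |escapes|

    escapes := HTMLUtilities ampersandEscapes.
    Transcript showCR:(escapes includesKey:#amp) printString.
                                                        [exEnd]
"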
! !

!HTMLUtilities class methodsFor:'common actions'!

openLauncherOnDisplay:displayName
    <resource: #obsolete>

    "obsolete - do not use"

    self obsoleteMethodWarning.
    Error handle:[:ex |
        ^ ex description
    ] do:[
        NewLauncher openLauncherOnInitializedDisplayNamed:displayName
    ]

    "Modified: / 01-06-2010 / 11:25:12 / sr"
! !

!HTMLUtilities class methodsFor:'constants'!

ampersandEscapes
    AmpersandEscapes isNil ifTrue:[
        AmpersandEscapes := IdentityDictionary new.

        #(
            #nbsp  160          "/ non-breakable space - do something magic...

            #emspace 160        "/ temporary
            #enspace 160

            #lt    $<
            #gt    $>
            #amp   $&
            #quot  $"
            #apos  $'

            #copy  169          "/ copyright
            #reg   174          "/ registered

            #cent   162
            #pound  163
            #yen    165
            #brvbar $|
            #sect   167
            #laquo  171
            #raquo  187
            #plusmn 177
            #micro  181
            #middot 183
            #frac14 188
            #frac12 189
            #frac34 190
            #iquest 191
            #iexcl  16rA1
            #div    247
            #divide 247
            #not    16rAC
            #shy    16rAD
            #para   16rB6

            #deg   176
            #sup1  185
            #sup2  178
            #sup3  179

            #ordm   16rBA
            #ordf   16rAA
            #macr   16rAF

            #cedil  16rB8
            #uml    16rA8
            #acute  16rB4
            #curren 16rA4

            #Oslash 216
            #oslash 248
            #aring  229
            #Aring  197

            #ccedil 231
            #Ccedil 199

            #thorn  16rFE
            #THORN  16rDE
            #Thorn  16rDE

            #eth  16rF0
            #ETH  16rD0
            #Eth  16rD0

            #atilde 227
            #Atilde 195
            #ntilde 241
            #Ntilde 209
            #otilde 245
            #Otilde 213

            #auml  228
            #Auml  196
            #uuml  252
            #Uuml  220
            #ouml  246
            #Ouml  214
            #euml  235
            #Euml  203
            #iuml  239
            #Iuml  207
            #yuml  255

            #acirc  226
            #Acirc  194
            #icirc  238
            #Icirc  206
            #ecirc  234
            #Ecirc  202
            #ucirc  251
            #Ucirc  219
            #ocirc  244
            #Ocirc  212

            #agrave 224
            #Agrave 192
            #egrave 232
            #Egrave 200
            #igrave 236
            #Igrave 204
            #ograve 242
            #Ograve 210
            #ugrave 249
            #Ugrave 217

            #aacute 225
            #Aacute 193
            #eacute 233
            #Eacute 201
            #iacute 237
            #Iacute 205
            #oacute 243
            #Oacute 211
            #uacute 250
            #Uacute 218
            #yacute 16rFD
            #Yacute 16rDD

            #szlig  223
            #aelig  230
            #AElig  198

            "/ unicode

            #OElig   16r0152         "/ 8859-2 (latin2)
            #oelig   16r0153         "/ 8859-2 (latin2)

            #ljlig   16r01C9         "/ 8859-2 (latin2)
            #LJlig   16r01C7         "/ 8859-2 (latin2)
            #Ljlig   16r01C8         "/ 8859-2 (latin2)

            #Scaron  16r0160         "/ 8859-2 (latin2)
            #scaron  16r0161         "/ 8859-2 (latin2)
            #Yuml    16r0178         "/ 8859-2 (latin2)

            #Alpha    16r0391    "/ greek alpha
            #Beta     16r0392
            #Gamma    16r0393
            #Delta    16r0394
            #Epsilon  16r0395
            #Zeta     16r0396
            #Eta      16r0397
            #Theta    16r0398
            #Iota     16r0399
            #Kappa    16r039A
            #Lambda   16r039B
            #Mu       16r039C
            #Nu       16r039D
            #Xi       16r039E
            #Omicron  16r039F
            #Pi       16r03A0
            #Rho      16r03A1
            #Sigma    16r03A3
            #Tau      16r03A4
            #Upsilon  16r03A5
            #Phi      16r03A6
            #Chi      16r03A7
            #Psi      16r03A8
            #Omega    16r03A9

            #alpha    16r03B1    "/ greek alpha
            #beta     16r03B2
            #gamma    16r03B3
            #delta    16r03B4
            #epsilon  16r03B5
            #zeta     16r03B6
            #eta      16r03B7
            #theta    16r03B8
            #iota     16r03B9
            #kappa    16r03BA
            #lambda   16r03BB
            #mu       16r03BC
            #nu       16r03BD
            #xi       16r03BE
            #omicron  16r03BF
            #pi       16r03C0
            #rho      16r03C1
            #sigmaf   16r03C2
            #sigma    16r03C3
            #tau      16r03C4
            #upsilon  16r03C5
            #phi      16r03C6
            #chi      16r03C7
            #psi      16r03C8
            #omega    16r03C9

            #thetasym 16r03D1
            #upsih    16r03D2
            #piv      16r03D6

            #ensp     16r2002
            #emsp     16r2003

            #thinsp   16r2009         "/ thin space
            #zwnj     16r200C         "/ zero width non-joiner
            #zwj      16r200D         "/ zero width joiner
            #lrm      16r200E         "/ left-to-right mark
            #rlm      16r200F         "/ right-to-left mark

            #ndash    16r2013
            #mdash    16r2014

            #lsquo    16r2018         "/ left single quot. mark
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   278
            #rsquo    16r2019         "/ right single quot. mark
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   279
            #sbquo    16r201A         "/ single low-9 quot. mark
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   280
            #ldquo    16r201C         "/ left double quot. mark
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   281
            #rdquo    16r201D         "/ right double quot. mark
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   282
            #bdquo    16r201E         "/ double low-9 quot. mark
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   283
            #dagger   16r2020
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   284
            #Dagger   16r2021         "/ double dagger
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   285
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   286
            #bull     16r2022
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   287
            #hellip   16r2026
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   288
            #prime    16r2032
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   289
            #Prime    16r2033
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   290
            #oline    16r203E
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   291
            #frasl    16r2044
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   292
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   293
            #euro     16r20AC         "/ 8859-16
4517
5c92422a4187 #FEATURE by sr
sr
parents: 4494
diff changeset
   294
4929
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   295
            #weierp   16r2118
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   296
            #image    16r2111
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   297
            #real     16r211C
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   298
            #trade    16r2122
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   299
            #angst    16r212B      
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   300
            #alefsym  16r2135
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   301
            #larr     16r2190
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   302
            #uarr     16r2191
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   303
            #rarr     16r2192
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   304
            #darr     16r2193
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   305
            #harr     16r2194
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   306
            #crarr    16r21B5
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   307
            #lArr     16r21D0
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   308
            #uArr     16r21D1
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   309
            #rArr     16r21D2
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   310
            #dArr     16r21D3
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   311
            #hArr     16r21D4
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   312
            #forall   16r2200
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   313
            #part     16r2202
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   314
            #exist    16r2203
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   315
            #empty    16r2205
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   316
            #nabla    16r2207
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   317
            #isin     16r2208
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   318
            #notin    16r2209
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   319
            #ni       16r220B
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   320
            #prod     16r220F
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   321
            #sum      16r2211
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   322
            #minus    16r2212
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   323
            #lowast   16r2217
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   324
            #radic    16r221A
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   325
            #prop     16r221D
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   326
            #infin    16r221E
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   327
            #ang90    16r221F      
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   328
            #ang      16r2220
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   329
            #angmsd   16r2221      
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   330
            #angsph   16r2222      
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   331
            #and      16r2227
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   332
            #or       16r2228
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   333
            #cap      16r2229
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   334
            #cup      16r222A
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   335
            #int      16r222B
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   336
            #there4   16r2234
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   337
            #sim      16r223C
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   338
            #cong     16r2245
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   339
            #asymp    16r2248
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   340
            #ne       16r2260
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   341
            #equiv    16r2261
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   342
            #le       16r2264
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   343
            #ge       16r2265
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   344
            #sub      16r2282
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   345
            #sup      16r2283
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   346
            #nsub     16r2284
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   347
            #sube     16r2286
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   348
            #supe     16r2287
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   349
            #oplus    16r2295
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   350
            #otimes   16r2297
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   351
            #perp     16r22A5
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   352
            #sdot     16r22C5
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   353
            #lceil    16r2308
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   354
            #rceil    16r2309
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   355
            #lfloor   16r230A
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   356
            #rfloor   16r230B
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   357
            #lang     16r2329
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   358
            #rang     16r232A
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   359
            #loz      16r25CA
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   360
            #spades   16r2660
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   361
            #clubs    16r2663
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   362
            #hearts   16r2665
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   363
            #diams    16r2666
4517
5c92422a4187 #FEATURE by sr
sr
parents: 4494
diff changeset
   364
4929
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   365
        ) pairWiseDo:[:key :val |
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   366
            |v|
4924
b171682381a1 #TUNING by cg
Claus Gittinger <cg@exept.de>
parents: 4737
diff changeset
   367
4929
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   368
            v := val.
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   369
            val isInteger ifTrue:[
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   370
                v := Character value:v
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   371
            ].
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   372
            AmpersandEscapes at:key put:v
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   373
        ].
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   374
    ].
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   375
    ^ AmpersandEscapes
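
    "
     Usage sketch (an illustration, not part of the original source).
     It assumes only the table entries listed above and the standard
     Dictionary protocol; #noSuchEntity is a hypothetical key.

       (HTMLUtilities ampersandEscapes at:#euro) codePoint.
            -> 8364 (16r20AC)
       HTMLUtilities ampersandEscapes at:#noSuchEntity ifAbsent:[nil].
            -> nil

     Integer table values are converted to Character instances once,
     when the table is first built; later calls return the cached table.
    "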
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   376
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   377
    "Created: / 01-04-2019 / 14:34:25 / Claus Gittinger"
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   378
!
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   379
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   380
htmlEntityToCharacter
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   381
    ^ self ampersandEscapes
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   382
6220f244a435 #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4924
diff changeset
   383
    "Modified: / 01-04-2019 / 14:36:41 / Claus Gittinger"
4930
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   384
!
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   385
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   386
mathAmpersandEscapes
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   387
    "These escapes are obsolete now: HTML4 has since added the entities that were missing."
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   388
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   389
    MathAmpersandEscapes isNil ifTrue:[
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   390
        MathAmpersandEscapes := IdentityDictionary new.
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   391
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   392
        #(
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   393
"/            #alpha    16r61      "/ greek alpha
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   394
"/            #beta     16r62      "/ greek beta
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   395
"/            #chi      16r63      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   396
"/            #delta    16r64     
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   397
"/            #epsilon  16r65      "/ symbol characterSet has no epsilon
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   398
            #vepsilon 16r65        
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   399
"/            #phi      16r66      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   400
"/            #gamma    16r67     
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   401
"/            #eta      16r68      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   402
"/            #iota     16r69      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   403
            #varphi   16r6A      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   404
"/            #kappa    16r6B      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   405
"/            #lambda   16r6C      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   406
"/            #mu       16r6D      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   407
"/            #nu       16r6E      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   408
"/            #omicron  16r6F      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   409
"/            #pi       16r70      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   410
"/            #theta    16r71      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   411
            #vtheta   16r71      "/ symbol characterSet has no vtheta  
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   412
"/            #rho      16r72      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   413
            #varrho   16r72      "/ symbol characterSet has no varrho  
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   414
"/            #sigma    16r73      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   415
            #vsigma   16r56
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   416
"/            #tau      16r74      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   417
"/            #upsilon  16r75      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   418
            #varpi    16r76     
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   419
"/            #omega    16r77      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   420
"/            #xi       16r78      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   421
"/            #psi      16r79      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   422
"/            #zeta     16r7A      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   423
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   424
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   425
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   426
"/            #Alpha    16r41      "/ greek capital alpha
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   427
"/            #Beta     16r42      "/ greek capital beta
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   428
"/            #Chi      16r43      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   429
"/            #Delta    16r44     
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   430
"/            #Epsilon  16r45     
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   431
"/            #Phi      16r46      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   432
"/            #Gamma    16r47      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   433
"/            #Eta      16r48      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   434
"/            #Iota     16r49      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   435
"/
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   436
"/            #Kappa    16r4B      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   437
"/            #Lambda   16r4C      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   438
"/            #Mu       16r4D      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   439
"/            #Nu       16r4E      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   440
"/            #Omicron  16r4F      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   441
"/            #Pi       16r50      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   442
"/            #Theta    16r51      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   443
"/            #Rho      16r52      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   444
"/            #Sigma    16r53      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   445
"/            #Tau      16r54      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   446
"/            #Upsilon  16rA1      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   447
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   448
"/            #Omega    16r57    
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   449
"/            #Xi       16r58      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   450
"/            #Psi      16r59      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   451
"/            #Zeta     16r5A      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   452
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   453
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   454
"/            #forall   16r22
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   455
            #exist    16r24
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   456
            #exists   16r24
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   457
            #aleph    16rC0      "/ no, this is not alf ;-)
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   458
            #Re       16rC2      "/ R fraktur
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   459
            #Im       16rC1      "/ I fraktur
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   460
            #infty    16rA5      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   461
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   462
            #leq      16rA3      "/ less-equal
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   463
            #geq      16rB3      "/ greater-equal
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   464
            #equiv    16rBA      "/ equivalent
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   465
            #approx   16rBB      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   466
            #cong     16r40      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   467
"/            #neq      16rB9      
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   468
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   469
"/            #plusmn   16rB1     
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   470
            #times    16rB4   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   471
"/            #div      16rB8    
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   472
            #oplus    16rC5   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   473
            #otimes   16rC4   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   474
            #oslash   16rC5      "/ same code as #oplus above; 16rC6 may have been intended (unverified)
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   475
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   476
            #sum      16rE5   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   477
            #prod     16rD5   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   478
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   479
            #uparrow         16rAD   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   480
            #leftarrow       16rAC   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   481
            #downarrow       16rAF   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   482
            #rightarrow      16rAE   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   483
            #leftrightarrow  16rAB   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   484
            #Uparrow         16rDD   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   485
            #Leftarrow       16rDC   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   486
            #Downarrow       16rDF   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   487
            #Rightarrow      16rDE   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   488
            #Leftrightarrow  16rDB   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   489
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   490
            #supset          16rC9  
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   491
            #supseteq        16rCA 
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   492
            #subset          16rCC   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   493
            #subseteq        16rCD   
7cebffa8ddaa #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4929
diff changeset
   494
            #vee             16rDA
            #wedge           16rD9
            #neg             16rD8

            #ldots           16rBC

"/            #lfloor          16rEB
"/            #rfloor          16rFB
"/            #lceil           16rE9
"/            #rceil           16rF9

        ) pairWiseDo:[:key :val |
            |v|

            v := val.
            val isInteger ifTrue:[
                v := Character value:v
            ].
            MathAmpersandEscapes at:key put:v
        ].
    ].
    ^ MathAmpersandEscapes

    "Created: / 01-04-2019 / 14:40:51 / Claus Gittinger"
! !

!HTMLUtilities class methodsFor:'helpers'!

characterFromHtmlEntityNamed:anHtmlEntityName
    ^ self ampersandEscapes
        at:anHtmlEntityName asSymbol
        ifAbsent:[
            self halt.
            "/ where to get the mapping???
            "/ Answer: It is a mess. A good start may be
            "/ https://www.w3.org/TR/html4/sgml/entities.html with 252 named entities.
            "/ I guess an actual lookup table would be adequate.
            $~
        ]

    "Modified: / 01-04-2019 / 14:36:18 / Claus Gittinger"
    "Modified: / 04-04-2019 / 10:46:22 / Maren"
!

controlCharacters

    EscapeControlCharacters isNil ifTrue:[
        EscapeControlCharacters := Dictionary new.
        EscapeControlCharacters at:$< put:'&lt;'.
        EscapeControlCharacters at:$> put:'&gt;'.
        EscapeControlCharacters at:$& put:'&amp;'.
        EscapeControlCharacters at:$" put:'&quot;'.
        "/ EscapeControlCharacters at:$' put:'&apos;'.
    ].
    ^ EscapeControlCharacters.

    "Modified (comment): / 06-05-2015 / 16:17:31 / sr"
!

copyReplaceCharactersWithHtmlEntitiesIn:aString
    "answer a copy of aString in which every character that has a named
     entity (see #htmlEntityForCharacter:) is replaced by &name;.
     Spaces, letters and digits are copied unchanged; nil is treated
     as the empty string."

    |stream htmlEntity|

    stream := '' writeStream.
    (aString ? '') do:[:eachCharacter |
        htmlEntity := self htmlEntityForCharacter:eachCharacter.
        htmlEntity isNil ifTrue:[
            stream nextPut:eachCharacter.
        ] ifFalse:[
            stream
                nextPut:$&;
                nextPutAll:htmlEntity;
                nextPut:$;.
        ].
    ].

    ^ stream contents
!

escapeCharacterEntities:aString
    "helper to escape invalid/dangerous characters in html strings.
     These are:
        the keys of #controlCharacters ('<', '>', '&' and the double quote)
              -> the mapped entity (&lt; &gt; &amp; &quot;)
        characters above 16r7F
              -> decimal character reference (&#NNN;)
     All other characters are copied unchanged.
    "
    "/ TODO: this is similar to withSpecialHTMLCharactersEscaped.
    "/ we should refactor this into one method only (can we do hex escapes always ?).
    "/ Notice that these two methods came into existence for historic reasons
    "/ and were developed independently of each other, but later moved to this common place.


    ^self escapeCharacterEntities:aString andControlCharacters:self controlCharacters

    "
     self escapeCharacterEntities:'a<b'
     self escapeCharacterEntities:'aöb'
    "

    "Modified: / 06-05-2015 / 16:30:13 / sr"
    "Modified (comment): / 26-07-2019 / 12:19:18 / Stefan Vogel"
!

escapeCharacterEntities:aString andControlCharacters:controlCharacters
    "helper to escape invalid/dangerous characters in html strings.
     These are:
        the keys of controlCharacters (a dictionary mapping characters to strings)
              -> the mapped string
        characters above 16r7F
              -> decimal character reference (&#NNN;)
     All other characters are copied unchanged.
     If controlCharacters is nil or empty, only the second rule applies.
    "
    "/ TODO: this is similar to withSpecialHTMLCharactersEscaped.
    "/ we should refactor this into one method only (can we do hex escapes always ?).
    "/ Notice that these two methods came into existence for historic reasons
    "/ and were developed independently of each other, but later moved to this common place.


    ^ String
        streamContents:[:ws |
            self escapeCharacterEntities:aString andControlCharacters:controlCharacters on:ws.
        ]

    "
     self escapeCharacterEntities:'a<b'
     self escapeCharacterEntities:'aöb'
    "

    "Created: / 06-05-2015 / 16:29:51 / sr"
    "Modified (format): / 05-02-2017 / 17:59:32 / cg"
!

escapeCharacterEntities:aString andControlCharacters:controlCharacters on:aWriteStream
    "helper to escape invalid/dangerous characters in html strings.
     The escaped text is written to aWriteStream.
     These are:
        the keys of controlCharacters (a dictionary mapping characters to strings)
              -> the mapped string
        characters above 16r7F
              -> decimal character reference (&#NNN;)
     All other characters are copied unchanged.
     If controlCharacters is nil or empty, only the second rule applies.
    "
    "/ TODO: this is similar to withSpecialHTMLCharactersEscaped.
    "/ we should refactor this into one method only (can we do hex escapes always ?).
    "/ Notice that these two methods came into existence for historic reasons
    "/ and were developed independently of each other, but later moved to this common place.


    |rs c controlString|

    rs := ReadStream on: aString.
    [ rs atEnd ] whileFalse: [
        c := rs next.
        controlString := controlCharacters notEmptyOrNil ifTrue:[controlCharacters at:c ifAbsent:nil] ifFalse:[nil].
        controlString notNil ifTrue:[
            aWriteStream nextPutAll:controlString.
        ] ifFalse:[
            c codePoint > 16r7F ifTrue:[
                aWriteStream nextPutAll:'&#'.
                c codePoint printOn:aWriteStream.
                aWriteStream nextPut:$;.
            ] ifFalse:[
                aWriteStream nextPut:c.
            ]
        ]
    ].

    "
     self escapeCharacterEntities:'a<b'
     self escapeCharacterEntities:'aöb'
    "
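    "
     A hand-traced sketch of what the loop above produces (worked out from
     the code, not copied from a test run). The dictionary d is a hypothetical
     custom mapping, used for illustration only:

     self escapeCharacterEntities:'a<b'
        -> 'a&lt;b'      ($< is a key of #controlCharacters)
     self escapeCharacterEntities:'aöb'
        -> 'a&#246;b'    (ö is code point 246, which is above 16r7F)

     |d|
     d := Dictionary new.
     d at:$< put:'[lt]'.
     String streamContents:[:ws |
         self escapeCharacterEntities:'a<b&c' andControlCharacters:d on:ws]
        -> 'a[lt]b&c'    (only the keys of d are replaced)
    "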

    "Created: / 05-02-2017 / 17:58:34 / cg"
    "Modified: / 17-02-2017 / 10:34:20 / stefan"
!

escapeCharacterEntities:aString on:aStream
    "helper to escape invalid/dangerous characters in html strings.
     The escaped text is written to aStream.
     These are:
        the keys of #controlCharacters ('<', '>', '&' and the double quote)
              -> the mapped entity (&lt; &gt; &amp; &quot;)
        characters above 16r7F
              -> decimal character reference (&#NNN;)
     All other characters are copied unchanged.
    "
    "/ TODO: this is similar to withSpecialHTMLCharactersEscaped.
    "/ we should refactor this into one method only (can we do hex escapes always ?).
    "/ Notice that these two methods came into existence for historic reasons
    "/ and were developed independently of each other, but later moved to this common place.


    ^self escapeCharacterEntities:aString andControlCharacters:self controlCharacters on:aStream

    "
     self escapeCharacterEntities:'a<b'
     self escapeCharacterEntities:'aöb'
    "

    "Created: / 05-02-2017 / 18:00:56 / cg"
    "Modified (comment): / 26-07-2019 / 12:19:09 / Stefan Vogel"
!

extractCharSetEncodingFromContentType:contentTypeLine
    "answer the value of the charset= parameter of a content-type line,
     or nil if there is no such parameter"

    |idx sepIdx semiIdx rest encoding|

    idx := contentTypeLine findString:'charset='.
    idx == 0 ifTrue:[
        ^ nil
    ].
    rest := (contentTypeLine copyFrom:idx+'charset=' size) withoutSeparators.
    "/ the value ends at the first separator or semicolon, whichever comes first.
    "/ An index of 0 means 'not found' and must not win the comparison
    "/ (a plain min: would answer 0 as soon as one of the two is missing).
    sepIdx := rest indexOfSeparator.
    semiIdx := rest indexOf:$;.
    idx := (sepIdx == 0 or:[semiIdx ~~ 0 and:[semiIdx < sepIdx]])
                ifTrue:[semiIdx]
                ifFalse:[sepIdx].
    idx == 0 ifTrue:[
        encoding := rest
    ] ifFalse:[
        encoding := rest copyTo:idx-1.
    ].
    (encoding startsWith:$") ifTrue:[
        encoding := encoding copyFrom:2 to:(encoding indexOf:$" startingAt:3)-1.
    ].
    ^ encoding.

    "
     self extractCharSetEncodingFromContentType:'text/html; charset=ascii'
     self extractCharSetEncodingFromContentType:'text/html; charset='
     self extractCharSetEncodingFromContentType:'text/html; fooBar=bla'
     self extractCharSetEncodingFromContentType:'text/xml; charset=utf-8'
     self extractCharSetEncodingFromContentType:'text/xml; charset=utf-8; bla=fasel'
     self extractCharSetEncodingFromContentType:'text/xml;charset=utf-8;bla=fasel'
    "
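    "
     Results traced by hand from the code above (a sketch, not a recorded run):

     self extractCharSetEncodingFromContentType:'text/html; charset=ascii'
        -> 'ascii'
     self extractCharSetEncodingFromContentType:'text/html; fooBar=bla'
        -> nil           (no charset= parameter present)
     self extractCharSetEncodingFromContentType:'text/xml; charset=utf-8; bla=fasel'
        -> 'utf-8'       (cut at the semicolon)
    "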
!

extractMimeTypeFromContentType:contentTypeLine
    |idx mimeAndEncoding|

    idx := contentTypeLine indexOf:$:.
    mimeAndEncoding := (contentTypeLine copyFrom:idx+1) withoutSeparators.

    (mimeAndEncoding includes:$;) ifFalse:[
        ^ mimeAndEncoding
    ].

    idx := mimeAndEncoding indexOf:$;.
    ^ mimeAndEncoding copyTo:idx-1

    "
     self extractMimeTypeFromContentType:'text/html; charset=ascii'
     self extractMimeTypeFromContentType:'text/html; '
     self extractMimeTypeFromContentType:'text/html'
     self extractMimeTypeFromContentType:'text/xml; charset=utf-8'
    "
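    "
     Results traced by hand from the code above (a sketch, not a recorded run).
     Passing a complete header line, as in the second call, is an assumed use
     case; it works because everything up to the first colon is skipped:

     self extractMimeTypeFromContentType:'text/html; charset=ascii'
        -> 'text/html'
     self extractMimeTypeFromContentType:'Content-Type: text/html; charset=utf-8'
        -> 'text/html'
     self extractMimeTypeFromContentType:'text/html'
        -> 'text/html'
    "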
!

htmlEntityForCharacter:aCharacter
    "answer the entity name (the key in #ampersandEscapes, without & and ;)
     for aCharacter, or nil if the character is a space, a letter or a digit,
     or if no named entity is known for it"

    aCharacter == Character space ifTrue:[^ nil].
    aCharacter isLetterOrDigit ifTrue:[^ nil].

    ^ self ampersandEscapes
        keyAtValue:aCharacter
        ifAbsent:nil

    "Modified: / 01-04-2019 / 14:36:25 / Claus Gittinger"
!

unEscape:aString
    "Convert escaped characters in a URL's arguments or post fields back to their proper characters.
     Undoes the effect of #urlEncoded: and #urlEncoded2:.
     These are:
        + -> space
        %XX ascii as hex digits
        %uXXXX unicode as hex digits   NOTE: %u is non-standard but implemented in MS IIS
        %% -> %
    "
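    "
     Example calls, one per rule listed above. Results are deliberately not
     spelled out here; evaluate the expressions to inspect them:

     self unEscape:'a+b'          (plus sign)
     self unEscape:'a%20b'        (two hex digits)
     self unEscape:'a%u00F6b'     (non-standard %u form with four hex digits)
     self unEscape:'100%%'        (doubled percent sign)
     self unEscape:'abc'          (nothing to unescape)
    "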

    |rs ws c peekC isUnicodeEscaped|

    aString isNil ifTrue:[
        ^ nil.
    ].

    (aString includesAny:'+%') ifFalse:[
        ^ aString
    ].

    rs := ReadStream on: aString.
    ws := CharacterWriteStream on: ''.
    isUnicodeEscaped := false.

    [rs atEnd] whileFalse:[
        c := rs next.

        isUnicodeEscaped ifTrue:[
            isUnicodeEscaped := false.
            c := (Integer readFrom:(rs nextAvailable:4) radix:16) asCharacter.
        ] ifFalse:[
            c == $+ ifTrue:[
                c := Character space.
            ] ifFalse:[
                c == $% ifTrue:[
                    peekC := rs peek.
                    (peekC notNil and:[peekC isHexDigit]) ifTrue:[
                        c := (Integer readFrom:(rs nextAvailable:2) radix:16) asCharacter.
                    ] ifFalse:[
                        (peekC notNil and:[peekC == $u]) ifTrue:[
                            isUnicodeEscaped := true.
                            c := nil.
                        ] ifFalse:[
                            c := rs next.
                        ].
                    ].
                ].
            ].
        ].
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   794
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   795
        c notNil ifTrue:[ 
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   796
            ws nextPut:c.
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   797
        ].
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   798
    ].
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   799
    ^ ws contents
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   800
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   801
    "
2087
6a7385a63ce0 *** empty log message ***
sr
parents: 2067
diff changeset
   802
     self unEscape:'a%20b'   
6a7385a63ce0 *** empty log message ***
sr
parents: 2067
diff changeset
   803
     self unEscape:'a%%b'
6a7385a63ce0 *** empty log message ***
sr
parents: 2067
diff changeset
   804
     self unEscape:'a+b' 
6a7385a63ce0 *** empty log message ***
sr
parents: 2067
diff changeset
   805
     self unEscape:'a%+b' 
2179
c1cee8bbc1e5 unescape: care for invalid escape sequence (%, %singleDigit atEnd)
sr
parents: 2144
diff changeset
   806
     self unEscape:'a%' 
c1cee8bbc1e5 unescape: care for invalid escape sequence (%, %singleDigit atEnd)
sr
parents: 2144
diff changeset
   807
     self unEscape:'a%2' 
4287
7d7b30363fa8 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4217
diff changeset
   808
     self unEscape:'/Home/a%C3%A4%C3%B6%C3%BCa'
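     (illustrative additions, not from the original history and not verified here:
      the non-standard %uXXXX form and a single-byte escape above 16r7F)
     self unEscape:'a%u00E4b'
     self unEscape:'a%E4b'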
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   809
    "
2179
c1cee8bbc1e5 unescape: care for invalid escape sequence (%, %singleDigit atEnd)
sr
parents: 2144
diff changeset
   810
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   811
    "Modified: / 09-01-2011 / 10:44:50 / cg"
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   812
    "Modified (comment): / 06-05-2015 / 15:40:04 / sr"
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   813
    "Modified (comment): / 03-02-2017 / 17:06:32 / stefan"
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   814
!
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   815
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   816
unescapeCharacterEntities:aString
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   817
    "helper to unescape character entities in a string.
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   818
     Normally, this is done by the HTMLParser when it scans text,
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   819
     but it also seems to be needed for post-data fields which contain non-ASCII characters
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   820
     (for example: the login postdata of expeccALM).
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   821
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   822
     Sequences are:
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   823
        &<specialName>;
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   824
        &#<decimal>;            
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   825
        &#x<hex>;
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   826
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   827
     From Reference:
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   828
        http://wiki.selfhtml.org/wiki/Referenz:HTML/Zeichenreferenz#HTML-eigene_Zeichen
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   829
    "
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   830
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   831
    |rs ws c 
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   832
     entity entityNumberPart
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   833
     htmlEntityMatchingFailed characterFromHtmlEntity|
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   834
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   835
    (aString includes:$&) ifFalse:[        
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   836
        ^ aString
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   837
    ].
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   838
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   839
    rs := ReadStream on:aString.
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   840
    ws := CharacterWriteStream on:''.
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   841
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   842
    [rs atEnd] whileFalse:[
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   843
        c := rs next.
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   844
        c == $& ifTrue:[
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   845
            entity := rs upToMatching:[:ch | ch == $;].
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   846
            entity notEmpty ifTrue:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   847
                rs peek == $; ifTrue:[ "/ something between & and ; 
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   848
                    rs next. "/ read over semicolon
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   849
                    htmlEntityMatchingFailed := false.
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   850
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   851
                    entity first == $# ifTrue:[ "/ entity is determined as number
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   852
                        entityNumberPart := entity copyFrom:2.
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   853
                        entityNumberPart notEmpty ifTrue:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   854
                            entityNumberPart first == $x ifTrue:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   855
                                entityNumberPart := entityNumberPart copyFrom:2.
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   856
                                entityNumberPart notEmpty ifTrue:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   857
                                    ws nextPut:(Character value:(Integer readFrom:entityNumberPart radix:16)).
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   858
                                ] ifFalse:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   859
                                    htmlEntityMatchingFailed := true. 
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   860
                                ].
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   861
                            ] ifFalse:[
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   862
                                entityNumberPart isNumeric ifTrue:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   863
                                    ws nextPut:(Character value:(Integer readFrom:entityNumberPart)).
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   864
                                ] ifFalse:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   865
                                    htmlEntityMatchingFailed := true. 
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   866
                                ].
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   867
                            ].
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   868
                        ] ifFalse:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   869
                            htmlEntityMatchingFailed := true. 
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   870
                        ].
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   871
                    ] ifFalse:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   872
                        characterFromHtmlEntity := self characterFromHtmlEntityNamed:entity.
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   873
                        characterFromHtmlEntity notNil ifTrue:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   874
                            ws nextPut:characterFromHtmlEntity.
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   875
                        ] ifFalse:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   876
                            htmlEntityMatchingFailed := true. 
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   877
                        ].
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   878
                    ].
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   879
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   880
                    htmlEntityMatchingFailed ifTrue:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   881
                        ws nextPut:c.
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   882
                        ws nextPutAll:entity.
4333
2e428045cb82 #REFACTORING by stefan
Stefan Vogel <sv@exept.de>
parents: 4302
diff changeset
   883
                        ws nextPut:$;.
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   884
                    ].
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   885
                ] ifFalse:[
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   886
                    ws nextPut:c.
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   887
                    ws nextPutAll:entity.
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   888
                ].
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   889
            ] ifFalse:[
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   890
                ws nextPut:c.
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   891
            ].
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   892
        ] ifFalse:[
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   893
            ws nextPut:c.
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   894
        ].
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   895
    ].
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   896
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   897
    ^ ws contents
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   898
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   899
    "
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   900
     self unescapeCharacterEntities:'&;'            
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   901
     self unescapeCharacterEntities:'&16368;'            
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   902
     self unescapeCharacterEntities:'&16368;&16368'            
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   903
     self unescapeCharacterEntities:'&16368;&lt;'            
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   904
     self unescapeCharacterEntities:'&16368;&lt'            
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   905
     self unescapeCharacterEntities:'&#xaffe;'    
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   906
     self unescapeCharacterEntities:'&quot;&lt;foo'      
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   907
     self unescapeCharacterEntities:'&funny;&lt;foo'     
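     (illustrative additions, not from the original history and not verified here:
      numeric references need the leading # to reach the decimal and hex branches;
      the last sample has no digits and should be passed through unchanged)
     self unescapeCharacterEntities:'&#228;'
     self unescapeCharacterEntities:'&#xE4;'
     self unescapeCharacterEntities:'&#;'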
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   908
    "
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   909
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   910
    "Created: / 06-05-2015 / 16:56:14 / sr"
3557
21e099fb879e class: HTMLUtilities
sr
parents: 3545
diff changeset
   911
    "Modified: / 18-05-2015 / 12:13:35 / sr"
4333
2e428045cb82 #REFACTORING by stefan
Stefan Vogel <sv@exept.de>
parents: 4302
diff changeset
   912
    "Modified: / 17-02-2017 / 10:18:35 / stefan"
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   913
!
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
   914
4712
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   915
urlDecoded:aString
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   916
    "Convert escaped characters in a URL's arguments or post fields back to their proper characters.
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   917
     Undoes the effect of #urlEncoded: and #urlEncoded2:.
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   918
     These are:
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   919
        + -> space
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   920
        %XX ascii as hex digits
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   921
        %uXXXX unicode as hex digits   NOTE: %u is non-standard but implemented in MS IIS
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   922
        %% -> %
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   923
    "
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   924
    ^ (self unEscape:aString) utf8Decoded
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   925
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   926
    "
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   927
     self urlDecoded:'a%20b'   
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   928
     self urlDecoded:'a%%b'
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   929
     self urlDecoded:'a+b' 
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   930
     self urlDecoded:'a%+b' 
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   931
     self urlDecoded:'a%' 
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   932
     self urlDecoded:'a%2' 
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   933
     self urlDecoded:'/Home/a%C3%A4%C3%B6%C3%BCa'
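     (illustrative addition, not from the original history and not verified here:
      the UTF-8 bytes of the Greek sample used with urlEncoded:, percent-escaped by hand)
     self urlDecoded:'%CF%87%CE%B1%CE%B9%CF%81%CE%B5'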
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   934
    "
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   935
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   936
    "Created: / 26-08-2018 / 12:49:24 / Claus Gittinger"
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   937
!
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   938
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   939
urlEncode2:aStringOrStream on:ws
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   940
    <resource: #obsolete>
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   941
    "helper to escape invalid/dangerous characters in a URL's arguments.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   942
     Similar to urlEncode, but treats '*','~' and spaces differently.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   943
     (some clients, such as BitTorrent, seem to require this - time will tell...)
2523
cae6bc936653 changed: #urlEncode2:on:
Claus Gittinger <cg@exept.de>
parents: 2522
diff changeset
   944
     Any byte not in the set 0-9, a-z, A-Z, '.', '-', '_', is encoded using 
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   945
     the '%nn' format, where nn is the hexadecimal value of the byte.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   946
        see: RFC1738"
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   947
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   948
    |rs c space|
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   949
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   950
    space := Character space.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   951
    rs := aStringOrStream readStream.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   952
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   953
    [rs atEnd] whileFalse: [
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   954
        c := rs next.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   955
2523
cae6bc936653 changed: #urlEncode2:on:
Claus Gittinger <cg@exept.de>
parents: 2522
diff changeset
   956
        (c isLetterOrDigit or:[ ('-_.' includes:c) ]) ifTrue:[
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   957
            ws nextPut:c.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   958
        ] ifFalse:[
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   959
            ws nextPut: $%.
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   960
            c codePoint > 16rFF ifTrue:[
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   961
                ws nextPut: $u.
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   962
                c codePoint printOn:ws base:16 size:4 fill:$0.
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   963
            ] ifFalse:[
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   964
                c codePoint printOn:ws base:16 size:2 fill:$0.
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   965
            ]
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   966
        ].
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   967
    ].
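
    "
     (illustrative sketch, not from the original history and not verified here;
      going by the code above, the space, '*' and '~' should all be written as %nn escapes)
     |ws|
     ws := WriteStream on:(String new).
     self urlEncode2:'a b*c~d' on:ws.
     ws contents
    "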
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   968
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   969
    "Created: / 09-01-2011 / 10:32:27 / cg"
2523
cae6bc936653 changed: #urlEncode2:on:
Claus Gittinger <cg@exept.de>
parents: 2522
diff changeset
   970
    "Modified: / 09-01-2011 / 13:11:17 / cg"
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
   971
    "Modified: / 06-05-2015 / 15:43:39 / sr"
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   972
!
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   973
2500
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
   974
urlEncode:aStringOrStream on:ws
5430
fa33520af010 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 5390
diff changeset
   975
    "helper to escape invalid/dangerous characters in a URL's arguments or post-fields.
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   976
4712
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   977
     Any byte not in the set 0-9, a-z, A-Z, '.', '-', '_' and '~', 
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
   978
     is encoded using the '%nn' format, where nn is the hexadecimal value of the byte.
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   979
     Characters outside the ASCII range are encoded into utf8 first.
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   980
     Spaces are encoded as '+'.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
   981
        see: application/x-www-form-urlencoded  
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   982
        see: https://tools.ietf.org/html/rfc3986 (obsoletes RFC1738)"
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   983
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   984
    |rs c|
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   985
2500
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
   986
    rs := aStringOrStream readStream.
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   987
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   988
    [(c := rs nextOrNil) notNil] whileTrue: [
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   989
        |cp|
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   990
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   991
        (c isLetterOrDigit or:['-_.~' includes:c]) ifTrue:[
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   992
            ws nextPut:c.
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   993
        ] ifFalse:[
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   994
            c == Character space ifTrue:[
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   995
                ws nextPut:$+.
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   996
            ] ifFalse:[
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   997
                cp := c codePoint.
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   998
                cp > 16r7F ifTrue:[
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
   999
                    c utf8Encoded do:[:eachUtf8Char|
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1000
                        ws nextPut: $%.
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1001
                        eachUtf8Char codePoint printOn:ws base:16 size:2 fill:$0.
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1002
                    ].
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1003
                ] ifFalse:[
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1004
                    ws nextPut: $%.
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1005
                    cp printOn:ws base:16 size:2 fill:$0.
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1006
                ].
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1007
            ].
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1008
        ].
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1009
    ].
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1010
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1011
    "
4712
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
  1012
     self urlEncoded:'hokus pokus fidibus*-/~'
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
  1013
     self urlEncoded:'Ützel Brötzel*-/~'
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
  1014
     self urlEncoded:'χαιρε'
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
  1015
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
  1016
     self urlDecoded:(self urlEncoded:'hokus pokus fidibus*-/~')
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
  1017
     self urlDecoded:(self urlEncoded:'Ützel Brötzel*-/~')
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
  1018
     self urlDecoded:(self urlEncoded:'χαιρε')
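
     (illustrative sketch, not from the original history and not verified here:
      encoding several values onto one caller-supplied stream; the field names are made up)
     |ws|
     ws := WriteStream on:(String new).
     ws nextPutAll:'q='.
     self urlEncode:'hokus pokus' on:ws.
     ws nextPutAll:'&lang='.
     self urlEncode:'de' on:ws.
     ws contents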
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1019
    "
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1020
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1021
    "Modified: / 09-01-2011 / 10:43:30 / cg"
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1022
    "Modified: / 06-05-2015 / 16:06:52 / sr"
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1023
    "Modified (comment): / 07-02-2017 / 14:51:42 / stefan"
4712
530912590b7f #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4517
diff changeset
  1024
    "Modified (comment): / 26-08-2018 / 12:50:04 / Claus Gittinger"
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1025
!
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1026
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1027
urlEncoded2: aString
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1028
    <resource: #obsolete>
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1029
    "helper to escape invalid/dangerous characters in a URL's arguments or post-fields.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1030
     Similar to urlEncoded, but treats '*','~' and spaces differently.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1031
     (some clients, such as BitTorrent, seem to require this - time will tell...)
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1032
     Any byte not in the set 0-9, a-z, A-Z, '.', '-', '_', is encoded using
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1033
     the '%nn' format, where nn is the hexadecimal value of the byte.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1034
        see: application/x-www-form-urlencoded  
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1035
        see: RFC1738"
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1036
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1037
    |ws|
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1038
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1039
    ws := String writeStreamWithInitialSize:aString size.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1040
    self urlEncode2:aString on:ws.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1041
    ^ ws contents
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1042
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1043
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1044
    "
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1045
      self unEscape:(self urlEncoded:'_-.*Frankfurt(Main) Hbf')
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1046
      self urlEncoded2:'_-.*Frankfurt(Main) Hbf'
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1047
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1048
      self unEscape:(self urlEncoded:'-_.*%exept;')
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1049
      self urlEncoded2:'-_.*%exept;'  
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1050
      self urlEncoded:'-_.*%exept;'    
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1051
    "
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1052
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1053
    "Created: / 09-01-2011 / 10:34:50 / cg"
2500
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1054
!
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1055
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1056
urlEncoded: aString
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1057
    "helper to escape invalid/dangerous characters in a URL's arguments or post-fields.
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1058
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1059
     Any byte not in the set 0-9, a-z, A-Z, '.', '-', '_' and '~' is encoded using
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1060
     the '%nn' format, where nn is the hexadecimal value of the byte.
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1061
     Characters outside the ASCII range are encoded into utf8 first.
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1062
     Spaces are encoded as '+'.
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1063
        see: application/x-www-form-urlencoded  
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1064
        see: https://tools.ietf.org/html/rfc3986 (updates RFC 1738)"
2500
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1065
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1066
    |ws|
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1067
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1068
    ws := WriteStream on:(String new:aString size + 20).
2500
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1069
    self urlEncode:aString on:ws.
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1070
    ^ ws contents
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1071
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1072
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1073
    "
2500
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1074
      self unEscape:(self urlEncoded:'_-.*Frankfurt(Main) Hbf')
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1075
      self urlEncoded:'_-.*Frankfurt(Main) Hbf'
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1076
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1077
      self unEscape:(self urlEncoded:'-_.*%exept;')
Stefan Vogel <sv@exept.de>
parents: 2464
diff changeset
  1078
      self urlEncoded:'-_.*%exept;'
5390
686a234472a7 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 5068
diff changeset
  1079
686a234472a7 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 5068
diff changeset
  1080
      self urlEncoded:'Не только в сервере, но и в ComSpec, чтобы дочерние КОНСОЛЬНЫЕ процессы могли пользоваться редиректами'
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1081
    "
2464
ebff59707514 Patch from CG for UBS
Stefan Vogel <sv@exept.de>
parents: 2458
diff changeset
  1082
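    "
     A few more cases, as a sketch only: the results on the right are what the
     method comment above promises, not output from an actual run
     (the real work is done in #urlEncode:on:, which is not shown here).

      self urlEncoded:'a b'          -> 'a+b'        (space becomes '+')
      self urlEncoded:'a-b_c.d~e'    -> 'a-b_c.d~e'  (unreserved characters are kept)
      self urlEncoded:'a&b'          -> 'a%26b'      ('&' is byte 16r26)
    "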
2522
Claus Gittinger <cg@exept.de>
parents: 2500
diff changeset
  1083
    "Modified: / 09-01-2011 / 10:43:37 / cg"
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1084
    "Modified: / 07-02-2017 / 14:54:12 / stefan"
2066
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1085
!
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1086
2436
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1087
withAllSpecialHTMLCharactersEscaped:aStringOrCharacter
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1088
    "replace ampersand, less-than, greater-than and both quote characters with HTML character entities"
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1089
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1090
    "/ TODO: this is similar to escapeCharacterEntities.
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1091
    "/ we should refactor this into one method only (can we always use hex escapes?).
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1092
    "/ Note that these two methods came into existence for historical reasons
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1093
    "/ and were developed independently of each other, but were later moved to this common place.
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1094
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1095
    |resultStream|
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1096
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1097
"/    orgs  := #( $&      $<     $>     $"   $').
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1098
"/    repls := #( '&amp;' '&lt;' '&gt;' '&quot;' '&apos;' ).
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1099
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1100
    (aStringOrCharacter isString
3098
2ae8f1b57bc1 class: HTMLUtilities
sr
parents: 2866
diff changeset
  1101
    and:[ (aStringOrCharacter includesAny:'&<>''"') not ]) ifTrue:[^ aStringOrCharacter].
2436
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1102
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1103
    resultStream := CharacterWriteStream on:''.
2436
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1104
    aStringOrCharacter asString do:[:eachCharacter |
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1105
        "/ huh - a switch. Sorry, but this method is used heavily.
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1106
        eachCharacter == $&
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1107
            ifTrue:[ resultStream nextPutAll:'&amp;' ]
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1108
            ifFalse:[
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1109
        eachCharacter == $<
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1110
            ifTrue:[ resultStream nextPutAll:'&lt;' ]
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1111
            ifFalse:[
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1112
        eachCharacter == $>
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1113
            ifTrue:[ resultStream nextPutAll:'&gt;' ]
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1114
            ifFalse:[
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1115
        eachCharacter == $"
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1116
            ifTrue:[ resultStream nextPutAll:'&quot;' ]
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1117
            ifFalse:[
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1118
        eachCharacter == $'
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1119
            ifTrue:[ resultStream nextPutAll:'&apos;' ]
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1120
            ifFalse:[
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1121
                resultStream nextPut:eachCharacter
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1122
            ]]]]].
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1123
    ].
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1124
    ^ resultStream contents
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1125
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1126
    "
3312
fe3d83508353 class: HTMLUtilities
sr
parents: 3098
diff changeset
  1127
     self withAllSpecialHTMLCharactersEscaped:'<>#&'     
fe3d83508353 class: HTMLUtilities
sr
parents: 3098
diff changeset
  1128
     self withAllSpecialHTMLCharactersEscaped:$<
fe3d83508353 class: HTMLUtilities
sr
parents: 3098
diff changeset
  1129
     self withAllSpecialHTMLCharactersEscaped:$#
2436
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1130
    "
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1131
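    "
     How this differs from #withSpecialHTMLCharactersEscaped: (a sketch; the results
     are derived by reading the two method bodies, not from an actual run):

      self withSpecialHTMLCharactersEscaped:'<it''s>'      -> '&lt;it''s&gt;'      (quote kept)
      self withAllSpecialHTMLCharactersEscaped:'<it''s>'   -> '&lt;it&apos;s&gt;'  (quote escaped)
      self withAllSpecialHTMLCharactersEscaped:'abc'       -> 'abc'                (returned unchanged)
    "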
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1132
    "Modified: / 05-12-2006 / 13:48:59 / cg"
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1133
    "Modified: / 06-05-2015 / 15:41:06 / sr"
2436
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1134
!
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1135
2066
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1136
withSpecialHTMLCharactersEscaped:aStringOrCharacter
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1137
    "replace ampersand, less-than and greater-than with HTML character entities; quotes are left as-is"
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1138
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1139
    "/ TODO: this is similar to escapeCharacterEntities.
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1140
    "/ we should refactor this into one method only (can we always use hex escapes?).
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1141
    "/ Note that these two methods came into existence for historical reasons
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1142
    "/ and were developed independently of each other, but were later moved to this common place.
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1143
2866
259f841e2554 class: HTMLUtilities
Stefan Vogel <sv@exept.de>
parents: 2554
diff changeset
  1144
    |resultStream|
2066
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1145
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1146
"/    orgs  := #( $&      $<     $>     ).
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1147
"/    repls := #( '&amp;' '&lt;' '&gt;' ).
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1148
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1149
    (aStringOrCharacter isString
2866
259f841e2554 class: HTMLUtilities
Stefan Vogel <sv@exept.de>
parents: 2554
diff changeset
  1150
     and:[ (aStringOrCharacter isWideString not)
259f841e2554 class: HTMLUtilities
Stefan Vogel <sv@exept.de>
parents: 2554
diff changeset
  1151
     and:[ (aStringOrCharacter includesAny:'&<>') not ]]) ifTrue:[^ aStringOrCharacter].
2066
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1152
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1153
    resultStream := CharacterWriteStream on:''.
2066
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1154
    aStringOrCharacter asString do:[:eachCharacter |
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1155
        "/ huh - a switch. Sorry, but this method is used heavily.
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1156
        eachCharacter == $&
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1157
            ifTrue:[ resultStream nextPutAll:'&amp;' ]
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1158
            ifFalse:[
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1159
        eachCharacter == $<
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1160
            ifTrue:[ resultStream nextPutAll:'&lt;' ]
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1161
            ifFalse:[
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1162
        eachCharacter == $>
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1163
            ifTrue:[ resultStream nextPutAll:'&gt;' ]
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1164
            ifFalse:[
2554
7cd0f7a16fad changed: #withSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2523
diff changeset
  1165
"/        eachCharacter codePoint > 16r7F
7cd0f7a16fad changed: #withSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2523
diff changeset
  1166
"/            ifTrue:[ 
7cd0f7a16fad changed: #withSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2523
diff changeset
  1167
"/                resultStream
7cd0f7a16fad changed: #withSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2523
diff changeset
  1168
"/                    nextPutAll:'&#';
7cd0f7a16fad changed: #withSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2523
diff changeset
  1169
"/                    nextPutAll:(eachCharacter codePoint printString);
7cd0f7a16fad changed: #withSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2523
diff changeset
  1170
"/                    nextPutAll:';']
7cd0f7a16fad changed: #withSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2523
diff changeset
  1171
"/            ifFalse:[
2066
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1172
                resultStream nextPut:eachCharacter
2554
7cd0f7a16fad changed: #withSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2523
diff changeset
  1173
"/            ]
2066
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1174
            ]]].
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1175
    ].
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1176
    ^ resultStream contents
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1177
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1178
    "
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1179
     self withSpecialHTMLCharactersEscaped:'<>#&'
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1180
     self withSpecialHTMLCharactersEscaped:$<
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1181
     self withSpecialHTMLCharactersEscaped:$#
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1182
    "
0ee2ef2d018c more common code
Claus Gittinger <cg@exept.de>
parents: 2058
diff changeset
  1183
2554
7cd0f7a16fad changed: #withSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2523
diff changeset
  1184
    "Modified: / 13-04-2011 / 23:13:32 / cg"
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1185
    "Modified: / 06-05-2015 / 15:41:16 / sr"
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1186
! !
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1187
3647
738252558e04 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3640
diff changeset
  1188
!HTMLUtilities class methodsFor:'queries'!
738252558e04 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3640
diff changeset
  1189
738252558e04 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3640
diff changeset
  1190
isUtilityClass
738252558e04 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3640
diff changeset
  1191
    ^ self == HTMLUtilities
738252558e04 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3640
diff changeset
  1192
! !
738252558e04 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3640
diff changeset
  1193
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1194
!HTMLUtilities class methodsFor:'serving-helpers'!
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1195
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1196
escape:aString
2436
a5537ae7be4a added: #withAllSpecialHTMLCharactersEscaped:
Claus Gittinger <cg@exept.de>
parents: 2434
diff changeset
  1197
    "helper to escape invalid/dangerous characters in a URL's arguments or post-fields.
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1198
     These are:
3456
8a3302fd3cce class: HTMLUtilities
Claus Gittinger <cg@exept.de>
parents: 3312
diff changeset
  1199
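        DEL and non-ASCII characters (code point >= 16r7F), dQuote, '+', ';', '?', '&' and space
              -> %XX, the code point as 2 hex digits (4 hex digits for code points above 16rFF)
        (control characters below 16r20 are currently passed through unchanged)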
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1200
        %     -> %%
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1201
    "
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1202
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1203
    | rs ws c cp|
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1204
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1205
    rs := ReadStream on: aString.
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1206
    ws := WriteStream on: ''.
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1207
    [ rs atEnd ] whileFalse: [
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1208
        c := rs next.
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1209
        c == $% ifTrue:[
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1210
            ws nextPutAll: '%%'.
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1211
        ] ifFalse:[
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1212
            (((cp := c codePoint) < 16r7F)
3456
8a3302fd3cce class: HTMLUtilities
Claus Gittinger <cg@exept.de>
parents: 3312
diff changeset
  1213
             and:[ ('+;?&" ' includes:c) not ]) ifTrue: [ 
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1214
                ws nextPut: c.
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1215
            ] ifFalse:[
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1216
                ws nextPut: $%.
4217
1dac9014b77a #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4204
diff changeset
  1217
                cp printOn:ws base:16 size:(cp > 16rFF ifTrue:[4] ifFalse:[2]) fill:$0.
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1218
            ]
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1219
        ]
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1220
    ].
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1221
    ^ ws contents
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1222
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1223
    "
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1224
     self escape:'a b'      
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1225
     self escape:'a%b'    
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1226
     self escape:'a b'      
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1227
     self escape:'a+b'      
4302
f50a1263f3ce #BUGFIX by stefan
Stefan Vogel <sv@exept.de>
parents: 4297
diff changeset
  1228
     self escape:'aäüöb'      
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1229
    "
3544
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1230
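    "
     Results one would expect from the code above (derived by hand, not from a run):

      self escape:'a b'     -> 'a%20b'   (space is code point 16r20)
      self escape:'a%b'     -> 'a%%b'    (percent is doubled, not hex-encoded)
      self escape:'a&b'     -> 'a%26b'   ('&' is code point 16r26)
      self escape:'a-b'     -> 'a-b'     (other printable ASCII is kept)
    "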
73c88216a4f2 class: HTMLUtilities
sr
parents: 3456
diff changeset
  1231
    "Modified: / 06-05-2015 / 16:07:18 / sr"
4217
1dac9014b77a #REFACTORING by cg
Claus Gittinger <cg@exept.de>
parents: 4204
diff changeset
  1232
    "Modified: / 25-11-2016 / 16:37:53 / cg"
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1233
! !
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1234
2144
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1235
!HTMLUtilities class methodsFor:'text processing helpers'!
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1236
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1237
plainTextOfHTML:htmlString
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1238
    "given some HTML, extract the raw text. 
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1239
     Can be used to search for strings in some html text."
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1240
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
  1241
    |parser doc s first|
2144
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1242
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1243
    parser := HTMLParser new.
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1244
    doc := parser parseText:htmlString.
3660
628279cf644c #REFACTORING
Stefan Vogel <sv@exept.de>
parents: 3659
diff changeset
  1245
    s := CharacterWriteStream on:(String new:100).
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
  1246
    first := true.
2144
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1247
    doc markUpElementsDo:[:el |
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1248
        |t|
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1249
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1250
        el isTextElement ifTrue:[
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1251
            t := el text withoutSeparators.
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1252
            t notEmpty ifTrue:[
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
  1253
                first ifFalse:[    
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
  1254
                    s space.
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
  1255
                ].
2144
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1256
                s nextPutAll:t.
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
  1257
                first := false    
2144
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1258
            ].
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1259
        ] ifFalse:[
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1260
            "/ ignore non-text; however, we could care for text in info-titles
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1261
            "/ or scripts as well...
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1262
        ].
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1263
    ].
3659
a226a9108bce #REFACTORING
Stefan Vogel <sv@exept.de>
parents: 3647
diff changeset
  1264
    ^ s contents
2144
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1265
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1266
    "
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1267
     self plainTextOfHTML:'
4737
610d483cb00a #DOCUMENTATION by stefan
Stefan Vogel <sv@exept.de>
parents: 4712
diff changeset
  1268
            bla1 bla2 <br>bla3 <table><tr><td>bla4</td></tr></table> bla5<p>bla6'
610d483cb00a #DOCUMENTATION by stefan
Stefan Vogel <sv@exept.de>
parents: 4712
diff changeset
  1269
     self plainTextOfHTML:'Hello World'        
2144
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1270
    "
3545
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
  1271
070008476ef8 class: HTMLUtilities
sr
parents: 3544
diff changeset
  1272
    "Modified: / 06-05-2015 / 17:02:36 / sr"
2144
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1273
! !
c89258333f4d *** empty log message ***
Claus Gittinger <cg@exept.de>
parents: 2087
diff changeset
  1274
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1275
!HTMLUtilities class methodsFor:'documentation'!
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1276
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1277
version
3640
098175b79b25 #BUGFIX
sr
parents: 3557
diff changeset
  1278
    ^ '$Header$'
2434
5625df4b6119 comment/format in: #escapeCharacterEntities:
Claus Gittinger <cg@exept.de>
parents: 2179
diff changeset
  1279
!
5625df4b6119 comment/format in: #escapeCharacterEntities:
Claus Gittinger <cg@exept.de>
parents: 2179
diff changeset
  1280
5625df4b6119 comment/format in: #escapeCharacterEntities:
Claus Gittinger <cg@exept.de>
parents: 2179
diff changeset
  1281
version_CVS
3640
098175b79b25 #BUGFIX
sr
parents: 3557
diff changeset
  1282
    ^ '$Header$'
2058
f407ff58f780 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
  1283
! !
3098
2ae8f1b57bc1 class: HTMLUtilities
sr
parents: 2866
diff changeset
  1284