#OTHER by mawalch
author mawalch
Wed, 06 Jul 2016 16:38:45 +0200
changeset 20088 9d16b5a4a6ac
parent 20087 ac9df4df5981
child 20089 2468c0711e9a
#OTHER by mawalch spelling
Character.st
--- a/Character.st	Wed Jul 06 15:28:46 2016 +0200
+++ b/Character.st	Wed Jul 06 16:38:45 2016 +0200
@@ -1,5 +1,3 @@
-"{ Encoding: utf8 }"
-
 "
  COPYRIGHT (c) 1988 by Claus Gittinger
 	      All Rights Reserved
@@ -62,15 +60,15 @@
     Always compare using #= if there is any chance of a non-ascii character being involved.
 
     Once again (because beginners sometimes make this mistake):
-	This means: you may compare characters using #== ONLY IFF you are certain,
-	that the characters ranges is 0..255.
-	Otherwise, you HAVE TO compare using #=. (if in doubt, always compare using #=).
-	Sorry for this inconvenience, but it is (practically) impossible to keep
-	the possible maximum of 2^32 characters (Unicode) around, for that convenience alone.
+        This means: you may compare characters using #== ONLY IFF you are certain,
+        that the characters ranges is 0..255.
+        Otherwise, you HAVE TO compare using #=. (if in doubt, always compare using #=).
+        Sorry for this inconvenience, but it is (practically) impossible to keep
+        the possible maximum of 2^32 characters (Unicode) around, for that convenience alone.
 
     In ST/X, N is (currently) 1024. This means that all the latin characters and some others are
     kept as singleton in the CharacterTable class variable (which is also used by the VM when characters
-    are instanciated).
+    are instantiated).
 
     Methods marked as (JS) come from the manchester Character goody
     (CharacterComparing) by Jan Steinman, which allow Characters to be used as
@@ -79,7 +77,7 @@
     Some of these have been modified a bit.
 
     WARNING: characters are known by compiler and runtime system -
-	     do not change the instance layout.
+             do not change the instance layout.
 
     Also, although you can create subclasses of Character, the compiler always
     creates instances of Character for literals ...
@@ -88,43 +86,43 @@
     Therefore, it may not make sense to create a character-subclass.
 
     Case Mapping in Unicode:
-	There are a number of complications to case mappings that occur once the repertoire
-	of characters is expanded beyond ASCII.
-
-	* Because of the inclusion of certain composite characters for compatibility,
-	  such as U+01F1 'DZ' capital dz, there is a third case, called titlecase,
-	  which is used where the first letter of a word is to be capitalized
-	  (e.g. Titlecase, vs. UPPERCASE, or lowercase).
-	  For example, the title case of the example character is U+01F2 'Dz' capital d with small z.
-
-	* Case mappings may produce strings of different length than the original.
-	  For example, the German character U+00DF small letter sharp s expands when uppercased to
-	  the sequence of two characters 'SS'.
-	  This also occurs where there is no precomposed character corresponding to a case mapping.
-	  *** This is not yet implemented (in 5.2) ***
-
-	* Characters may also have different case mappings, depending on the context.
-	  For example, U+03A3 capital sigma lowercases to U+03C3 small sigma if it is not followed
-	  by another letter, but lowercases to 03C2 small final sigma if it is.
-	  *** This is not yet implemented (in 5.2) ***
-
-	* Characters may have case mappings that depend on the locale.
-	  For example, in Turkish the letter 0049 'I' capital letter i lowercases to 0131 small dotless i.
-	  *** This is not yet implemented (in 5.2) ***
-
-	* Case mappings are not, in general, reversible.
-	  For example, once the string 'McGowan' has been uppercased, lowercased or titlecased,
-	  the original cannot be recovered by applying another uppercase, lowercase, or titlecase operation.
+        There are a number of complications to case mappings that occur once the repertoire
+        of characters is expanded beyond ASCII.
+
+        * Because of the inclusion of certain composite characters for compatibility,
+          such as U+01F1 'DZ' capital dz, there is a third case, called titlecase,
+          which is used where the first letter of a word is to be capitalized
+          (e.g. Titlecase, vs. UPPERCASE, or lowercase).
+          For example, the title case of the example character is U+01F2 'Dz' capital d with small z.
+
+        * Case mappings may produce strings of different length than the original.
+          For example, the German character U+00DF small letter sharp s expands when uppercased to
+          the sequence of two characters 'SS'.
+          This also occurs where there is no precomposed character corresponding to a case mapping.
+          *** This is not yet implemented (in 5.2) ***
+
+        * Characters may also have different case mappings, depending on the context.
+          For example, U+03A3 capital sigma lowercases to U+03C3 small sigma if it is not followed
+          by another letter, but lowercases to 03C2 small final sigma if it is.
+          *** This is not yet implemented (in 5.2) ***
+
+        * Characters may have case mappings that depend on the locale.
+          For example, in Turkish the letter 0049 'I' capital letter i lowercases to 0131 small dotless i.
+          *** This is not yet implemented (in 5.2) ***
+
+        * Case mappings are not, in general, reversible.
+          For example, once the string 'McGowan' has been uppercased, lowercased or titlecased,
+          the original cannot be recovered by applying another uppercase, lowercase, or titlecase operation.
 
     Collation Sequence:
-	*** This is not yet implemented (in 5.2) ***
+        *** This is not yet implemented (in 5.2) ***
 
     [author:]
-	Claus Gittinger
+        Claus Gittinger
 
     [see also:]
-	String TwoByteString Unicode16String Unicode32String
-	StringCollection Text
+        String TwoByteString Unicode16String Unicode32String
+        StringCollection Text
 "
 ! !
 
@@ -313,6 +311,7 @@
     ^ self codePoint:anInteger
 ! !
 
+
 !Character class methodsFor:'accessing untypeable characters'!
 
 controlCharacter:char
@@ -359,6 +358,7 @@
     ^ self codePoint:41
 ! !
 
+
 !Character class methodsFor:'constants'!
 
 backspace
@@ -601,6 +601,9 @@
     "
 ! !
 
+
+
+
 !Character methodsFor:'Compatibility-Dolphin'!
 
 isAlphaNumeric
@@ -648,6 +651,8 @@
       or:[ (asciivalue == 247 ) ]]]]]
 ! !
 
+
+
 !Character methodsFor:'accessing'!
 
 codePoint
@@ -1498,7 +1503,7 @@
     ^ s contents
 
     "
-     'ä' utf8Encoded 
+     'ä' utf8Encoded 
      'a' utf8Encoded 
     "
 ! !
@@ -2637,9 +2642,9 @@
 
     "
      $e asNonDiacritical
-     $é asNonDiacritical
-     $ä asNonDiacritical
-     $Ã¥ asNonDiacritical
+     $é asNonDiacritical
+     $ä asNonDiacritical
+     $å asNonDiacritical
     "
 !