hg/stx-libbasic: LimitedPrecisionReal.st@b12f8693fe6f


"{ Encoding: utf8 }"

"
 COPYRIGHT (c) 1994 by Claus Gittinger
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
"{ Package: 'stx:libbasic' }"

"{ NameSpace: Smalltalk }"

Number subclass:#LimitedPrecisionReal
	instanceVariableNames:''
	classVariableNames:''
	poolDictionaries:''
	category:'Magnitude-Numbers'
!

!LimitedPrecisionReal class methodsFor:'documentation'!

copyright
"
 COPYRIGHT (c) 1994 by Claus Gittinger
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
!

documentation
"
    Abstract superclass for any-precision floating point numbers (i.e. IEEE floats and doubles).

    Short summary for beginners (find details in wikipedia):
    ========================================================
    
    Floating point numbers are represented with a mantissa and an exponent, and the number's value is: 
        mantissa * (2 raisedTo: exponent) 
    with (1 > mantissa >= 0) and exponent adjusted as required for the mantissa to be in that range
    (so called ''normalized'')
    
    therefore,
        13 asFloat mantissa -> 0.8125
        13 asFloat exponent ->  4  
        0.8125 * (2 raisedTo:4) -> 13

    and:    
        104 asFloat mantissa -> 0.8125
        104 asFloat exponent -> 7  
        0.8125 * (2 raisedTo:7) -> 104

    and:    
        0.1 mantissa -> 0.8
        0.1 exponent -> -3  
        0.8 * (2 raisedTo:-3) -> 0.1

    however:    
        (1 / 3.0) mantissa -> 0.666666666666667
        (1 / 3.0) exponent -> -1  
        0.666666666666667 * (2 raisedTo:-3) -> 0.1


    Danger in using Floats:
    =======================

    Beginners seem to forget (or never learn?) that flt. point numbers are always APPROXIMATIONs of some value.
    You may never ever use them when exact results are neeed (i.e. when computing money!!)
    Take a look at the FixedPoint class for that.
    See also 'Float comparison' below.
    
    
    The Float/Double confusion in ST/X:
    ===================================
    
    Due to historic reasons, ST/X's Floats are what Doubles are in VisualWorks.
    
    The reason is that in some Smalltalks, double floats are called Float, and no single float exists (VSE, V'Age),
    whereas in others, there are both Float and Double classes (VisualWorks).
    In order to allow code from both families to be loaded into ST/X without a missing class error, and without
    loosing precision, we decided to use IEEE doubles as the internal representation of Float 
    and make Double an alias to it.
    This should work for either family (except for the unexpected additional precision in some cases).
    
    If you really only want single precision floating point numbers, use ShortFloat instances.
    But be aware that there is usually no advantage (neither in memory usage, due to memory alignment restrictions,
    nor in speed), as these days, the CPUs are just as fast doing double precision operations.
    (There might be a noticable difference when doing bulk operations, and you should consider using FloatArray for those).

    
    Hardware supported precisions
    =============================

    The only really portable sizes are IEEE-single and IEEE-double floats (i.e. ShortFloat and Float instances).
    These are supported on all architectures.
    Some do provide an extended precision floating pnt. number,
    however, the downside is that CPU-architects did not agree on a common format and precision: 
    some use 80 bits, others 96 and others even 128.
    See the comments in the LongFloat class for more details.
    We recommend using Float (i.e. IEEE doubles) unless absolutely required,
    and care for machine dependencies in the code otherwise.
    For higher precision needs, you may also try the new QDouble class, which gives you >200bits (60digits) 
    of precision on all machines (at a noticable performance price, though).

    
    Range and Precision of Storage Formats:
    =======================================

      Format |   Class    |   Array Class   | Bits / Significant  | Smallest Pos Number | Largest Pos Number | Significant Digits
             |            |                 |      (Binary)       |                     |                    |     (Decimal)
      -------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
      half   |     --     | HalfFloatArray  |    16 / 11          |  6.10.... x 10−5    | 6.55...  x 10+5    |      3.3
      -------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
      single | ShortFloat | FloatArray      |    32 / 24          |  1.175... x 10-38   | 3.402... x 10+38   |      6-9
      -------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
      double | Float      | DoubleArray     |    64 / 53          |  2.225... x 10-308  | 1.797... x 10+308  |     15-17
      -------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
      double | LongFloat  |     --          |   128 / 113         |  3.362... x 10-4932 | 1.189... x 10+4932 |     33-36
      ext    |            |                 |                     |                     |                    |
      (SPARC)|            |                 |                     |                     |                    |
      -------+            |                 |---------------------+---------------------+--------------------+--------------------
      double |            |                 |    96 / 64          |  3.362... x 10-4932 | 1.189... x 10+4932 |     18-21
      ext    |            |                 |                     |                     |                    |
      (x86)  |            |                 |                     |                     |                    |
      -------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
        --   | QDouble    |     --          |   256 / 212         |  2.225... x 10-308  | 1.797... x 10+308  |     >=60
      -------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
        --   | LargeFloat |     --          |     arbitrary       |  arbitrarily small  |  arbitrarily large |     arbitrary
      -------+------------+-----------------+---------------------+---------------------+--------------------+--------------------

    HalfFloats are only supported in fixed array containers. 
    This was added for OpenGL and other graphic libraries which allow for texture, 
    and vertex data to be passed quickly in that format (see http://www.opengl.org/wiki/Small_Float_Formats).
    
    Long- and LargeFloat are not supported as array containers.
    These formats are seldom used for bulk data.

    QDoubles are special soft floats; slower in performance, but providing 4 times the precision of regular doubles.

    To see the differences in precision:
        
        '%60.58f' printf:{ 1 asShortFloat exp } -> '2.718281828459045*090795598298427648842334747314453125'          (32 bits)
        '%60.58f' printf:{ 1 asFloat exp }      -> '2.718281828459045*090795598298427648842334747314453125'          (64 bits)
        '%60.58f' printf:{ 1 asLongFloat exp }  -> '2.718281828459045235*4281681079939403389289509505033493041992'   (only 80 valid bits on x86)
        
        '%60.58f' printf:{ 1 asQDouble exp }    -> '2.71828182845904523536028747135266249775724709369995957496698'   (>200 bits)

        correct value is:                           2.71828182845904523536028747135266249775724709369995957496696762772407663035354759457138217852516642742746

    Bulk Containers:
    ================
    If you have a vector or matrix (and especially: large ones) of floating point numbers, the well known
    Array is a very inperformant choice. The reason is that it keeps pointers to each of its elements, and each element
    (if it is a float) is itself stored somewhere in the object memory.
    Thus, there is both a space overhead (every float object has an object header, for class and other information), and
    also a performance overhead (extra indirection, cache misses and alignment inefficiencies).
    For this, the bulk numeric containers are provided, which keep the elements unboxed and properly aligned.
    Use them for matrices and large numeric vectors. They also provide some optimized bulk operation methods,
    such as adding, multiplying etc.
    Take a look at FloatArray, DoubleArray, HalfFloatArray etc.

    
    Comparing Floats:
    =================
    Due to rounding errors (usually on the last bit(s)), you shalt not compare two floating point numbers
    using the #= operator. For example, the value 0.1 cannot be represented as a sum of powers-of-two fractions,
    and will therefore always be an approximation with a half bit error in the last bit of the mantissa.
    Usually, the print functions take this into consideration and return a (faked) '0.1'.
    However, this half bit error may accumulate, for example, when multiplying that by 0.1 then by 100, 
    the error may get large enough to be no longer pushed under the rug by the print function, 
    and you will get '0.9999999999999' from it.

    Also, comparing against a proper 1.0 (which is representable as an exact power of 2), 
    you will get a false result.
    i.e. (0.1 * 0.1 * 100 ~= 1.0) and (0.1 * 0.1 * 100 - 1.0) ~= 0.0
    This often confuses non-computer scientists (and occasionally even some of those).

    For this, you should always provide an epsilon value, when comparing two non-integer numbers. 
    The epsilon value is the distance you accept two number to be apart to be still considered equal. 
    Effectively the epsilon says are those nearer than this epsilon?.

    Now we could say is the delta between two numbers smaller than 0.00001,
    and get a reasonable answer for big numbers. But what if we compare two tiny numbers?
    Then a reasonable epsilon must also be much smaller!!

    Actually, the epsilon should always be computed dynamically depending on the two values compared.
    That is what the #isAlmostEqualTo:nEpsilon: method does for you. It does not take an absolute epsilon,
    but instead the number of distinct floating point numbers that the two compared floats may be apart.
    That is: the number of actually representable numbers between those two. 
    Effectively, that is the difference between the two mantissas, 
    when the numbers are scaled to the same exponent, taking the number of mantissa bits into account.

    [author:]
        Claus Gittinger

    [see also:]
        Fraction FixedPoint
"
! !

!LimitedPrecisionReal class methodsFor:'instance creation'!

fromInteger:anInteger
    "return a float with anInteger's value.
     Since floats have a limited precision, you usually loose bits when doing this
     with a large integer 
     (i.e. when numDigits is above the flt. pnt number's precision)
     (see Float decimalPrecision, LongFloat decimalPrecision."

    |newFloat sign absVal completeAbsVal delta mask|

    newFloat := self zero.
    sign := anInteger sign.
    (sign ~~ 0) ifTrue:[
        "be careful to round to nearest even floating point value -
         thanks to Nicolas Cellier for this code"

        absVal := anInteger abs.
        delta := absVal highBit - self precision.
        delta > 0 ifTrue: [
            completeAbsVal := absVal.
            "eliminate insignificant/trailing bits"
            absVal := absVal rightShift: delta.
            "inexact := trailingBits ~= 0.
             Round to nearest even"
            (completeAbsVal bitAt:delta) ~= 0 ifTrue:[
                mask := (1 bitShift:delta-1) - 1.
                ((completeAbsVal bitAnd:mask) ~= 0 or:[absVal odd]) ifTrue:[
                    absVal := absVal + 1
                ].
            ].
        ] ifFalse: [
            delta := 0
        ].
        absVal digitLength to:1 by:-1 do:[:i |
            newFloat := (newFloat * 256) + (absVal digitByteAt:i).
        ].
        delta ~~ 0 ifTrue: [
            newFloat := newFloat timesTwoPower:delta.
        ].
        (sign < 0) ifTrue:[
            newFloat := newFloat negated
        ]
    ].
    ^ newFloat


    "
     ShortFloat fromInteger:2
     12345678901234567890 asShortFloat            

     1234567890 asFloat                     
     1234567890 asFloat asInteger                    
     -1234567890 asFloat asInteger                    

     12345678901234567890 asFloat storeString            
     12345678901234567890 asFloat asInteger   
     -12345678901234567890 asFloat asInteger   

     12345678901234567890 asLongFloat           
     12345678901234567890 asLongFloat asInteger 
     -12345678901234567890 asLongFloat asInteger 

     123456789012345678901234567890 asLongFloat           
     123456789012345678901234567890 asLongFloat asInteger  
     -123456789012345678901234567890 asLongFloat asInteger  

     1234567890123456789012345678901234567890 asLongFloat           
     1234567890123456789012345678901234567890 asLongFloat asInteger  
     -1234567890123456789012345678901234567890 asLongFloat asInteger

     'this test is on 65 bits'.
     self assert: 16r1FFFFFFFFFFFF0801 asDouble ~= 16r1FFFFFFFFFFFF0800 asDouble.
     'this test is on 64 bits'.
     self assert: 16r1FFFFFFFFFFFF0802 asDouble ~= 16r1FFFFFFFFFFFF0800 asDouble.
     'nearest even is upper'.
     self assert: 16r1FFFFFFFFFFF1F800 asDouble = 16r1FFFFFFFFFFF20000 asDouble.
     'nearest even is lower'.
     self assert: 16r1FFFFFFFFFFFF0800 asDouble = 16r1FFFFFFFFFFFF0000 asDouble.
    "

    "Modified: / 25-08-2017 / 12:34:44 / cg"
    "Modified (format): / 08-06-2019 / 02:17:43 / Claus Gittinger"
!

fromLimitedPrecisionReal:anLPReal
    "return a float with anLPReals value.
     You might loose bits when doing this.
     Slow fallback."

    |fract|

    anLPReal isFinite ifFalse:[
        anLPReal isNaN ifTrue:[^ self NaN].
        anLPReal negative ifTrue:[^ self negativeInfinity].
        ^ self infinity
    ].

    fract := anLPReal asTrueFraction.
    ^ (self fromInteger:fract numerator) / (self fromInteger:fract denominator)
!

fromNumerator:numerator denominator:denominator
    "Create a limited precision real from a Rational.
     This version will answer the nearest flotaing point value,
     according to IEEE 754 round to nearest even default mode"

    |a b q r exponent floatExponent n ha hb hq q1 eMin precision nSubHq|

    a := numerator abs.
    b := denominator abs.
    ha := a highBit.
    hb := b highBit.    

    precision := self precision.

    "If both numerator and denominator are represented exactly in floating point number,
     then fastest thing to do is to use hardwired float division"
    (ha <= precision and:[hb <= precision]) ifTrue: [
        ^ (self coerce:numerator) / (self coerce:denominator)
    ].

    "Try and obtain a mantissa with 1 bit excess precision.
     First guess is rough, we might get one more bit or one less"
    exponent := ha - hb - precision - 1.
    exponent > 0
        ifTrue: [b := b bitShift: exponent]
        ifFalse: [a := a rightShift: exponent].
    q := a quo: b.
    r := a - (q * b).
    hq := q highBit.

    "check for gradual underflow, in which case we should use less bits"
    floatExponent := exponent + hq - 1.
    eMin := 2 - (1 bitShift: (self numBitsInExponent - 1)).
    n := floatExponent >= eMin
        ifTrue: [precision + 1]
        ifFalse: [precision + 1 + floatExponent - eMin].

    nSubHq := n - hq.

    nSubHq < 0 ifTrue: [
        exponent := exponent + hq - n.
        r := (q bitAnd: (1 bitShift:nSubHq) - 1) * b + r.
        q := q bitShift:nSubHq
    ] ifFalse:[nSubHq > 0 ifTrue: [
        exponent := exponent + hq - n.
        q1 := (r bitShift:nSubHq) quo: b.
        q := (q bitShift:nSubHq) bitAnd: q1.
        r := r - (q1 * b)
    ]].

    "check if we should round upward.
     The case of exact half (q bitAnd: 1) isZero not & (r isZero)
     will be handled by self fromInteger: conversion"
    ((q bitAnd:1) ~~ 0 and:[r ~~ 0]) ifTrue:[
        q := q + 1
    ].

    (numerator negative xor:denominator negative) ifTrue: [
        q := q negated.
    ].

    ^ (self coerce:q) timesTwoPower:exponent

    "
        Time millisecondsToRun:[
            1000000  timesRepeat:[
                Float fromNumerator:12345678901234567890 denominator:987654321
            ].
        ]

        |fraction|
        fraction := 12345678901234567890//987654321.
        Time millisecondsToRun:[
            1000000  timesRepeat:[
                fraction asFloat
            ].
        ]
    "

    "Modified: / 25-08-2017 / 12:34:55 / cg"
!

new:aNumber
    "catch this message - not allowed for floats/doubles"

    self error:'Floats/Doubles cannot be created with new:'
!

readFrom:aStringOrStream onError:exceptionBlock
    "read a float from a string"

    |num|

    num := super readFrom:aStringOrStream onError:[^ exceptionBlock value].
    ^ self coerce:num

    "
     Float readFrom:'.1'
     Float readFrom:'0.1'
     Float readFrom:'0'

     ShortFloat readFrom:'.1'
     ShortFloat readFrom:'0.1'
     ShortFloat readFrom:'0'

     LongFloat readFrom:'.1'
     LongFloat readFrom:'0.1'
     LongFloat readFrom:'0'

     LimitedPrecisionReal readFrom:'bla' onError:nil
     Float readFrom:'bla' onError:nil
     ShortFloat readFrom:'bla' onError:nil
    "

    "Created: / 07-01-1998 / 16:17:19 / cg"
    "Modified: / 23-11-2010 / 14:35:41 / cg"
! !

!LimitedPrecisionReal class methodsFor:'class initialization'!

initialize
    "initialize ANSI compliant float globals"

    Smalltalk at:#FloatE put:ShortFloat.
    Smalltalk at:#FloatD put:Float.
    Smalltalk at:#FloatQ put:LongFloat.

    "/ not defined in ANSI, but consistent in its naming
    Smalltalk at:#FloatQD put:QDouble.

    "
     self initialize
    "

    "Created: / 07-09-2001 / 14:02:45 / cg"
    "Modified (comment): / 21-06-2017 / 13:59:08 / cg"
! !

!LimitedPrecisionReal class methodsFor:'constants & defaults'!

NaN
    "return the constant NaN (not a Number) in my representation.
     Here, based on the assumption that division of zero by zero generates a NaN
     (which is defined as such in the IEEE standard).
     If a subclass does not, it has to redefine this method and generate a NaN differently"

    |zero|

    zero := self zero.
    ^ zero uncheckedDivide:zero.

    "
      ShortFloat NaN  
      Float NaN       
      LongFloat NaN   
      LargeFloat NaN   
      IEEEFloat NaN   
    "

    "Modified (comment): / 16-06-2017 / 11:03:28 / cg"
    "Modified (comment): / 06-06-2019 / 16:52:43 / Claus Gittinger"
!

computeEpsilon
    "compute the maximum relative spacing of instances of mySelf
     (i.e. the value-delta of the least significant bit).
     See https://en.wikipedia.org/wiki/Machine_epsilon"

    ^ (self coerce:self radix) raisedToInteger:(self precision - 1) negated.

    "
      Float radix 
      Float precision
      
      ShortFloat computeEpsilon -> 1.192093e-07 
      Float computeEpsilon      -> 2.22044604925031E-16
      LongFloat computeEpsilon  -> 1.084202172485504434E-19
      QDouble computeEpsilon    -> 7.77876909732643E-62    

      QuadFloat computeEpsilon   
      QuadFloat radix
      (QuadFloat coerce:QuadFloat radix)
      2 asQuadFloat
    "

    "Modified (comment): / 03-07-2017 / 12:12:06 / cg"
    "Modified: / 10-05-2018 / 01:19:25 / stefan"
    "Modified (comment): / 19-07-2019 / 17:26:30 / Claus Gittinger"
!

eBias
    "Answer the exponent's bias; 
     that is the offset of the zero exponent when stored.
     The computation below assumes standard IEEE format"

    ^ (1 << (self numBitsInExponent-1)) - 1

    "
     Float eBias       -> 1023
     ShortFloat eBias  -> 127
     LongFloat eBias   -> 16383
     QuadFloat eBias   -> 16383
     OctaFloat eBias   -> 262143
     QDouble eBias     -> 1023
    "
!

emax
    "return the largest exponent.
     The computation below assumes standard IEEE format"

    ^ (1 << (self numBitsInExponent-1))-1

    "
     Float emax       -> 1023
     ShortFloat emax  -> 127
     LongFloat emax   -> 16383
     QuadFloat emax   -> 16383
     OctaFloat emax   -> 262143
     QDouble emax     -> 1023
    "
!

emin
    "return the smallest exponent.
     The computation below assumes standard IEEE format"

    ^ 2 - (1 << (self numBitsInExponent-1))

    "
     Float emin       -> -1022
     ShortFloat emin  -> -126
     LongFloat emin   -> -16382
     QuadFloat emin   -> -16382
     OctaFloat emin   -> -262142
     QDouble emin     -> -1022
    "
!

epsilon
    "return the maximum relative spacing of instances of mySelf
     (i.e. the value-delta of the least significant bit)"

    ^ self radix asFloat raisedToInteger:(1 - self precision)

    "
     Float epsilon       -> 2.22044604925031E-16
     ShortFloat epsilon  -> 1.192093e-07
     LongFloat epsilon   -> 1.084202172485504434E-19
     QDouble epsilon     -> 7.778769097326426826491248689356e-62
    "

    "Modified (comment): / 12-06-2017 / 18:52:31 / cg"
    "Modified (comment): / 10-05-2018 / 01:08:48 / stefan"
!

fmax
    "The largest value allowed by instances of this class."

    |radix radixReal|

    radix := self radix.
    radixReal := (self fromInteger: radix).
    ^ ((self fromInteger: 1) -
            (radixReal timesTwoPower: self precision negated - 1)) * radix
            * (radixReal timesTwoPower: self emax - 1)

    "
     Float fmax       -> 1.79769313486232E+308
     ShortFloat fmax  -> 3.402823e+38
     LongFloat fmax   -> 1.189731495357231765E+4932
     QDouble fmax 
    "

    "Modified (comment): / 14-06-2017 / 19:17:39 / cg"
!

fmin
    "The smallest value allowed by instances of this class."

    ^ self subclassResponsibility
!

infinity
    "return an instance of myself which represents positive infinity (for my instances).
     Warning: do not compare equal against infinities;
     instead, check using isFinite or isInfinite"

    ^ self unity uncheckedDivide:self zero

    "
      ShortFloat infinity  
      Float infinity       
      LongFloat infinity   
      LargeFloat infinity   
      IEEEFloat infinity   
    "

    "Modified (comment): / 20-06-2017 / 13:46:52 / cg"
    "Modified (comment): / 09-06-2019 / 12:52:52 / Claus Gittinger"
!

maxSmallInteger
    "answer the largest possible SmallInteger value as instance of myself"

    ^ self fromInteger:SmallInteger maxVal.

    "
     Float maxSmallInteger.
     LongFloat maxSmallInteger.
     ShortFloat maxSmallInteger.
     QDouble maxSmallInteger.
    "

    "Modified (comment): / 20-06-2017 / 14:07:27 / cg"
    "Modified (comment): / 14-03-2018 / 20:11:28 / mawalch"
!

negativeInfinity
    "return an instance of myself which represents negative infinity (for my instances).
     Warning: do not compare equal against infinities;
     instead, check using isFinite or isInfinite"

    ^ self unity negated uncheckedDivide:self zero

    "
      ShortFloat negativeInfinity   
      Float negativeInfinity       
      LongFloat negativeInfinity   
      LargeFloat negativeInfinity   
      QDouble negativeInfinity   
      IEEEFloat negativeInfinity   
    "

    "Modified (comment): / 20-06-2017 / 13:46:42 / cg"
    "Modified (comment): / 09-06-2019 / 12:55:35 / Claus Gittinger"
! !

!LimitedPrecisionReal class methodsFor:'queries'!

decimalEmax
    "Answer how many digits of accuracy this class supports"

    ^ (self emax / (10.0 log:self radix))

    "
     ShortFloat emax   
     ShortFloat decimalEmax  

     Float emax  
     Float emin
     Float decimalEmax  

     LongFloat emax        
     LongFloat emin        
     LongFloat decimalEmax  
    "
!

decimalPrecision
    "return the number of valid decimal digits"

    ^ (self precision / (10.0 log:self radix)) 
        floor   "use floor, since the last decimal digit is usually wrong"

    "
     HalfFloat decimalPrecision  -> 3
     ShortFloat decimalPrecision -> 7
     Float decimalPrecision      -> 16 
     LongFloat decimalPrecision  -> 19
     QuadFloat decimalPrecision  -> 34
     OctaFloat decimalPrecision  -> 71
     QDouble decimalPrecision    -> 61
    "

    "Modified (comment): / 06-06-2019 / 13:21:04 / Claus Gittinger"
!

defaultPrintPrecision
    "return the number of decimal digits printed by default"

    ^ 6

    "
     ShortFloat defaultPrintPrecision -> 5 
     Float defaultPrintPrecision      -> 6 
     LongFloat defaultPrintPrecision  -> 8 
     QDouble defaultPrintPrecision    -> 10
     LargeFloat defaultPrintPrecision -> 12
    "

    "Created: / 17-06-2017 / 02:56:43 / cg"
!

denormalized
    "Return whether the instances of this class can
     represent values in denormalized format."

    ^ true
!

exactDecimalPrecision
    "return the exact number of decimal digits"

    ^ ((self precision) / (10.0 log:self radix))

    "
     HalfFloat exactDecimalPrecision  -> 3.612359947967774002
     ShortFloat exactDecimalPrecision -> 7.224719895935548004
     Float exactDecimalPrecision      -> 15.95458977019100184      
     LongFloat exactDecimalPrecision  -> 19.26591972249479468
     QuadFloat exactDecimalPrecision  -> 34.01638951002987185
     OctaFloat exactDecimalPrecision  -> 71.34410897236353654
     QDouble exactDecimalPrecision    -> 61.41011911545215804
    "

    "Created: / 06-06-2019 / 13:19:17 / Claus Gittinger"
!

hasSharedInstances
    "return true if this class can share instances when stored binary, 
     that is, instances with the same value can be stored by reference.
     Although not really shared, floats should be treated
     so, to be independent of the implementation of the arithmetic methods."

    ^ true
!

isAbstract
    "Return if this class is an abstract class.
     True is returned for LimitedPrecisionReal here; false for subclasses."

    ^ self == LimitedPrecisionReal

    "
     1.0 class isAbstract
    "
!

isIEEEFormat
    "return true, if this machine represents floats in IEEE format.
     Currently, no support is provided for non-ieee machines
     to convert their floats into this (which is only relevant,
     if such a machine wants to send floats as binary to some other
     machine).
     Machines with non-IEEE format are VAXen and IBM370-type systems
     (among others). Today, every system uses IEEE format floats."

%{
#ifdef __s390__
    RETURN(false);
#endif
%}.

    ^ true "/ this may be a lie
!

numBitsInExponent
    "return the number of bits in the exponent"

    ^ self subclassResponsibility
!

numBitsInIntegerPart
    "answer the number of bits in the integer part of the mantissa.
     I.e. 0 is returned if there is a hidden bit, 1 if not.
     Most floating point formats are normalized to get rid of the extra bit."

    ^ 0

    "Modified (comment): / 28-05-2019 / 08:58:11 / Claus Gittinger"
!

numBitsInMantissa
    "return the number of bits in the mantissa (the significant)
     Typically the precision is 1 more than the significant due to the hidden bit"

    ^ self subclassResponsibility
!

precision
    "return the number of valid mantissa bits"

    ^ self numBitsInMantissa + 1 - self numBitsInIntegerPart 

    "
      HalfFloatArray precision  
      ShortFloat precision  
      Float precision       
      LongFloat precision   
      QDouble precision  
    "

    "Modified (comment): / 12-06-2017 / 18:47:54 / cg"
    "Modified (comment): / 06-06-2019 / 13:25:00 / Claus Gittinger"
!

radix
    "answer the radix of my instance's exponent"

    self subclassResponsibility

    "Created: / 7.9.2001 / 14:08:20 / cg"
! !

!LimitedPrecisionReal methodsFor:'accessing'!

at:index
    "redefined to prevent access to individual bytes in a real."

    self error:'not allowed for floats/doubles'
!

at:index put:aValue
    "redefined to prevent access to individual bytes in a real"

    self error:'not allowed for floats/doubles'
! !

!LimitedPrecisionReal methodsFor:'arithmetic'!

* aNumber
    "return the product of the receiver and the argument."

"/ as soon as Float are float & Double are doubles,
"/ use:
"/    ^ aNumber productFromDouble:self asDouble

    ^ aNumber productFromFloat:self asFloat

    "Modified: 17.4.1996 / 12:35:36 / cg"
!

+ aNumber
    "return the sum of the receiver and the argument, aNumber"

"/ as soon as Float are float & Double are doubles,
"/ use:
"/    ^ aNumber sumFromDouble:self asDouble

    ^ aNumber sumFromFloat:self asFloat

    "Modified: 17.4.1996 / 12:35:55 / cg"
!

- aNumber
    "return the difference of the receiver and the argument, aNumber"

"/ as soon as Float are float & Double are doubles,
"/ use:
"/    ^ aNumber differenceFromDouble:self asDouble

    ^ aNumber differenceFromFloat:self asFloat

    "Modified: 17.4.1996 / 12:36:07 / cg"
!

/ aNumber
    "return the quotient of the receiver and the argument, aNumber"

    ((aNumber == 0) or:[aNumber = 0.0]) ifTrue:[
        ^ ZeroDivide raiseRequestWith:thisContext.
    ].
"/ as soon as Float are float & Double are doubles,
"/ use:
"/    ^ aNumber quotientFromDouble:self asDouble

    ^ aNumber quotientFromFloat:self asFloat

    "Modified: / 17.4.1996 / 12:36:21 / cg"
    "Modified: / 26.7.1999 / 10:46:11 / stefan"
!

// aNumber
    "return the integer quotient of dividing the receiver by aNumber with
     truncation towards negative infinity."

    ^ (self / aNumber) floor asInteger

    "Modified: 5.11.1996 / 11:45:37 / cg"
!

ceiling
    ^ self asTrueFraction ceiling
!

floor
    ^ self asTrueFraction floor
!

timesTwoPower:anInteger
    "multiply self by a power of two.
     Implementation takes care of preserving class and avoiding overflow/underflow
     Thanks to Nicolas Cellier for this code"

    |half two h1 h2|

    two := self coerce:2.0.

    anInteger abs highBit >= (self numBitsInExponent - 2) ifTrue:[
        half := anInteger // 2.

        anInteger even ifTrue:[
            "/ halves are equal
            h2 := h1 := (two raisedToInteger: half).
        ] ifFalse:[
            h1 := (two raisedToInteger: half).
            anInteger > 0 ifTrue:[
                h2 := h1 * two.
            ] ifFalse:[
                h2 := (two raisedToInteger: anInteger - half).
            ].
        ].
        ^ (self * h1) * h2
    ] ifFalse: [
        ^ self * (two raisedToInteger: anInteger).
    ]

    "
     (1 asShortFloat timesTwoPower: 3) class = ShortFloat.
     (1 asLongFloat timesTwoPower: 1024).
     (1 asFloat timesTwoPower: -1024) timesTwoPower: 1024.
     (1 asLongFloat timesTwoPower: -1024) timesTwoPower: 1024.

     (2.0 asShortFloat timesTwoPower: -150) timesTwoPower: 150    
     (2.0 asLongFloat timesTwoPower: -150) timesTwoPower: 150   
     (2.0 asFloat timesTwoPower: -150) timesTwoPower: 150       

     (2.0 asShortFloat timesTwoPower: -149) timesTwoPower: 149  
     (2.0 asLongFloat timesTwoPower: -149) timesTwoPower: 149    
     (2.0 asFloat timesTwoPower: -149) timesTwoPower: 149        

     Time millisecondsToRun:[
        1000000 timesRepeat:[
            (2.0 timesTwoPower: 150)
        ]
     ]  
    "

    "Modified (comment): / 28-05-2019 / 09:08:25 / Claus Gittinger"
! !

!LimitedPrecisionReal methodsFor:'bytes access'!

digitBytes
    "answer the float's digit bytes imnIEEE format.
     Use the native machine byte ordering."

    |sz "{ Class:SmallInteger }"
     bytes|

    sz := self basicSize.
    bytes := ByteArray new:sz.
    1 to:sz do:[:i|
        bytes at:i put:(self basicAt:i).
    ].
    ^ bytes

    "
        Float pi digitBytes
        ShortFloat pi digitBytes
    "
!

digitBytesMSB:msb
    "answer the float's digit bytes im IEEE format.
     If msb == true, use MSB byte order, otherwise LSB byte order."

    |sz "{ Class:SmallInteger }"
     bytes|

    self class isIEEEFormat ifFalse:[self error:'unsupported operation'].

    sz := self basicSize.
    bytes := ByteArray new:sz.

    (UninterpretedBytes isBigEndian == msb) ifTrue:[
        1 to:sz do:[:i|
            bytes at:i put:(self basicAt:i).
        ].
    ] ifFalse:[
        "reverse the bytes"
        1 to:sz do:[:i|
            bytes at:sz-i+1 put:(self basicAt:i).
        ].
    ].
    ^ bytes

    "
        Float pi digitBytesMSB:false
        Float pi digitBytesMSB:true
        ShortFloat pi digitBytesMSB:false
        ShortFloat pi digitBytesMSB:true
    "
! !

!LimitedPrecisionReal methodsFor:'coercing & converting'!

asFloat
    ^ Float fromLimitedPrecisionReal:self
!

asFraction
    "Answer a rational number (Integer or Fraction) representing the receiver.
     This conversion uses the continued fraction method to approximate 
     a floating point number.
     In contrast to #asTrueFraction, which returns exactly the value of the float,
     this rounds in the last significant bit of the floating point number.
    "

    |num1 denom1 num2 denom2 int frac newD temp limit|

    self isFinite ifFalse:[
"/        ^ self asMetaNumber.
        ^ self class
            raise:#domainErrorSignal
            receiver:self
            selector:#asFraction
            arguments:#()
            errorString:'Cannot represent non-finite number as fraction'
    ].

    limit := 10 raisedToInteger:self defaultNumberOfDigits.
    num1 := self truncated. 
    denom1 := 1.            "The first of two alternating denominators"
    num2 := 1.              "The second numerator"
    denom2 := 0.            "The second denominator--will update"
    int := num1.            "The integer part of self"
    frac := self fractionPart.             
    [frac = 0] whileFalse:[                
        newD := 1.0 / frac.                   
        int := newD truncated.        
        frac := newD fractionPart.      "save the fractional part for next time"
        temp := num2.                   "old numerator and save it"
        num2 := num1.                         
        num1 := num1 * int + temp.      "Update first numerator"
        temp := denom2.                 "old denominator and save it"
        denom2 := denom1.                    
        denom1 := int * denom1 + temp.  "Update first denominator"
        limit < denom1 ifTrue:[
            "Is ratio past float precision?  If so, pick which of the two ratios to use"
            num2 = 0.0 ifTrue:[
                "Is second denominator 0?"
                ^ Fraction numerator:num1 denominator:denom1
            ].
            ^ Fraction numerator:num2 denominator:denom2
        ]
    ].

    "If fractional part is zero, return the first ratio"
    denom1 = 1 ifTrue:[
        "Am i really an Integer?"
        ^ num1 "Yes, return Integer result"
    ].
    "Otherwise return Fraction result"
    ^ Fraction numerator:num1 denominator:denom1

    "
     1.1 asFraction      
     1.2 asFraction      
     0.3 asFraction   
     0.5 asFraction  
     (1/5) asFloat asFraction  
     (1/8) asFloat asFraction  
     (1/13) asFloat asFraction 
     (1/10) asFloat asFraction 
     (1/10) asFloat asTrueFraction asFixedPoint scale:20 
     3.14159 asFixedPoint scale:20        
     3.14159 storeString       
     3.14159 asFraction asFloat storeString       
     1.3 asFraction            
     1.0 asFraction            
     1E6 asFraction            
     1E-6 asFraction            
    "

    "Modified: / 25.10.1997 / 16:41:19 / cg"
!

asInteger
    "return an integer with same value - might truncate"

    |max maxF|

    self isFinite ifFalse:[
"/        ^ self asMetaNumber.
        ^ self class
            raise:#domainErrorSignal
            receiver:self
            selector:#asInteger
            arguments:#()
            errorString:'Cannot represent non-finite number as integer'
    ].

    self abs < 2e16 ifTrue:[
        "/ NOTICE: this must be redefined in float
        "/ subclasses to handle the smallinteger range;
        "/ i.e. this may only be invoked for reals
        "/ which are NOT within the smallInt range.
        "/ otherwise, endless recursion is the consequence.

        max := SmallInteger maxVal // 2 + 1.
        maxF := max asFloat.

        ^ (self quo:maxF) * max + (self rem:maxF) truncated
    ].
    ^ self asTrueFraction

    "
     12345.0 asInteger     
     1e15 asInteger        
     1e33 asInteger asFloat
     1e303 asInteger asFloat
    "

    "Modified: / 16.11.2001 / 14:15:33 / cg"
!

asLargeFloat
    "return a large float with (approximately) my value.
     If the largeFloat class is not present, a regular float is returned"

    ^ (LargeFloat ? LongFloat ? Float) fromLimitedPrecisionReal:self

    "Modified (comment): / 12-06-2017 / 20:57:05 / cg"
    "Modified: / 07-06-2018 / 16:28:03 / Claus Gittinger"
!

asLargeFloatPrecision:n
    "return a large float with (approximately) my value.
     If the largeFloat class is not present, a regular float is returned"

    LargeFloat notNil ifTrue:[
        ^ (LargeFloat fromLimitedPrecisionReal:self) precision:n
    ].
    ^ (LongFloat ? Float) fromLimitedPrecisionReal:self

    "
     1.0 asLargeFloatPrecision:10
    "

    "Created: / 26-05-2019 / 04:01:22 / Claus Gittinger"
    "Modified (format): / 26-05-2019 / 11:02:53 / Claus Gittinger"
!

asLimitedPrecisionReal
    "return a float of any precision with same value"

   ^ self
!

asLongFloat
    ^ LongFloat fromLimitedPrecisionReal:self
!

asRational
    "Answer a Rational number--Integer or Fraction--representing the receiver.
     Same as asFraction fro st-80 compatibility."

    ^ self asFraction

    "
     1.1 asRational      
     1.2 asRational      
     0.3 asRational   
     0.5 asRational 
     (1/5) asFloat asRational
     (1/8) asFloat asRational  
     (1/13) asFloat asRational 
     3.14159 asRational        
     3.14159 asRational asFloat       
     1.3 asRational  
     1.0 asRational  
    "
!

asShortFloat
    ^ ShortFloat fromLimitedPrecisionReal:self
!

asTrueFraction
    "Answer a fraction or integer that EXACTLY represents self,
     an any-precision IEEE floating point number, consisting of:
        numMantissaBits bits of normalized mantissa (i.e. with hidden leading 1-bit)
        optional numExtraBits between mantissa and exponent (normalized flag for ext-real)
        numExponentBits bits of 2s complement exponent
        1 sign bit.
     Taken from Float's asTrueFraction"

    |shifty sign expPart exp fraction fractionPart result zeroBitsCount
     numBytes numBits numBitsInMantissa maskMantissa numBitsInExponent maskExponent
     biasExponent numIntegerBits|

    self isFinite ifFalse:[
        ^ self asMetaNumber
"/        ^ self class
"/            raise:#domainErrorSignal
"/            receiver:self
"/            selector:#asTrueFraction
"/            arguments:#()
"/            errorString:'Cannot represent non-finite float as a fraction'.
    ].

    "Extract the bits of an IEEE anySize float "
    numBytes := self basicSize.
    numIntegerBits := self numBitsInIntegerPart.
    numBitsInMantissa := self numBitsInMantissa. maskMantissa := (1 bitShift:numBitsInMantissa) - 1.
    numBitsInExponent := self numBitsInExponent. maskExponent := (1 bitShift:numBitsInExponent) - 1.
    numBits := numBitsInMantissa + numBitsInExponent. "not (numBytes * 8 - 1) -- there could be 80 float bits in 96 overall bits"
    biasExponent := maskExponent bitShift:-1.

    shifty := LargeInteger basicNew numberOfDigits:numBytes.
    UninterpretedBytes isBigEndian ifTrue:[
        1 to:numBytes do:[:i | shifty digitAt:(numBytes+1-i) put:(self basicAt:i)].
    ] ifFalse:[
        1 to:numBytes do:[:i | shifty digitAt:i put:(self basicAt:i)].
    ].

    " Extract the sign and the biased exponent "
    sign := (shifty bitAt:numBits+1) == 0 ifTrue: [1] ifFalse: [-1].
    expPart := (shifty rightShift:numBitsInMantissa) bitAnd: maskExponent.

    " Extract fractional part; answer 0 if this is a true 0.0 value "
    fractionPart := shifty bitAnd:maskMantissa.
    ( expPart=0 and: [ fractionPart=0 ] ) ifTrue: [ ^ 0  ].

    numIntegerBits == 0 ifTrue:[
        " Replace omitted leading 1 in fraction (Notice: quadIEEE format does not do this)"
        fraction := fractionPart bitOr: (maskMantissa + 1).
    ] ifFalse:[
        fraction := fractionPart.
    ].

    "Unbias exponent"
    exp := biasExponent - expPart + (numBitsInMantissa - numIntegerBits).

    " Form the result. When exp>fractionalPrecision, the exponent is adjusted by
      the number of trailing zero bits in the fraction to minimize
      the (huge) time otherwise spent in #gcd: of fraction handling code."
    exp negative ifTrue: [
        result := sign * (fraction rightShift: exp) 
    ] ifFalse:[
        zeroBitsCount := fraction lowBit - 1.
        exp := exp - zeroBitsCount.
        exp <= 0 ifTrue: [
            zeroBitsCount := zeroBitsCount + exp.
            result := sign * (fraction rightShift:zeroBitsCount) 
        ] ifFalse: [
            result := Fraction
                    numerator: (sign * (fraction rightShift: zeroBitsCount))
                    denominator: (1 bitShift:exp) 
        ] 
    ].

    "Low cost validation omitted after extensive testing"
    "(result asFloat = self) ifFalse: [self error: 'asTrueFraction validation failed']."

    ^ result 

    "
     0.3 asFloat asTrueFraction   
     0.3 asShortFloat asTrueFraction  
     0.3 asLongFloat asTrueFraction   

     1.25 asTrueFraction     
     1.25 asShortFloat asTrueFraction     
     0.25 asTrueFraction     
     -0.25 asTrueFraction    
     3e37 asTrueFraction     

     LongFloat NaN asTrueFraction              
     LongFloat infinity asTrueFraction          
     LongFloat negativeInfinity asTrueFraction 
    "

    "Modified: / 25-08-2017 / 12:34:08 / cg"
    "Modified: / 28-05-2019 / 09:04:52 / Claus Gittinger"
    "Modified (comment): / 08-06-2019 / 13:59:40 / Claus Gittinger"
! !

!LimitedPrecisionReal methodsFor:'comparing'!

< aNumber
    "return true, if the argument is greater"

    ^ aNumber lessFromFloat:self asFloat

    "Modified: 17.4.1996 / 13:34:50 / cg"
! !

!LimitedPrecisionReal methodsFor:'double dispatching'!

differenceFromFraction:aFraction
    "sent when a fraction does not know how to subtract the receiver"

    |d|

    d := aFraction denominator.
    ^ (aFraction numerator - (self * d)) / d

    "Modified (comment): / 07-06-2019 / 02:19:52 / Claus Gittinger"
!

productFromFraction:aFraction
    "sent when a fraction does not know how to multiply the receiver"

    ^ self * aFraction numerator / aFraction denominator

    "Modified (comment): / 07-06-2019 / 02:19:47 / Claus Gittinger"
!

quotientFromFraction:aFraction
    "Return the quotient of the argument, aFraction and the receiver.
     Sent when aFraction does not know how to divide by the receiver."

    ^ aFraction numerator / (self * aFraction denominator)
!

sumFromFraction:aFraction
    "sent when a fraction does not know how to add the receiver"

    |d|

    d := aFraction denominator.
    ^ (self * d + aFraction numerator) / d

    "Modified (comment): / 07-06-2019 / 02:19:35 / Claus Gittinger"
!

sumFromTimestamp:aTimestamp
    "I am to be interpreted as seconds, return the timestamp this number of seconds
     after aTimestamp"

    ^ aTimestamp addMilliseconds:(self * 1000) truncated.

    "
     Timestamp now sumFromTimestamp:aTimestamp   
     100.0 sumFromTimestamp:Timestamp now 

     |t1 t2|
     t1 := Timestamp now. 
     t2 := 1.5 sumFromTimestamp:t1.
     t1 inspect. t2 inspect.
    "
! !


!LimitedPrecisionReal methodsFor:'printing'!

printStringScientific
    "return a 'user friendly' scientific printString.
     Notice: this returns a Text object with superscript digits,
     which requires a font capapble of displaying it correctly.
     Also: the returned string is not meant to be read back - purely for GUIs"

    |s idx|

    s := PrintfScanf printf:'%e' argument:self.
    idx := s indexOfAny:'eE'.
    idx == 0 ifTrue:[^ s].
    ^ (s copyTo:idx-1) , '×10' , ((s copyFrom:idx+1) asText emphasisAllAdd:#superscript)

    "
     1.23456 printString            -> '1.23456'
     1.23456 printStringScientific  1.23456×100 (with superscript zero at end)
     1.23e14 printStringScientific  1.23×1014
     PrintfScanf printf:'%e' argument:1.23456
    "

    "Created: / 05-10-2018 / 13:38:44 / Claus Gittinger"
! !

!LimitedPrecisionReal methodsFor:'queries'!

decimalEmax
    "Answer how many digits of exponent-accuracy this class supports"

    ^ self class decimalEmax

    "
     1.0 asShortFloat emax   
     1.0 asShortFloat decimalEmax  

     1.0 asFloat emax  
     1.0 asFloat emin
     1.0 asFloat decimalEmax  

     1.0 asLongFloat emax        
     1.0 asLongFloat emin        
     1.0 asLongFloat decimalEmax  
    "

    "Created: / 10-10-2017 / 15:46:43 / cg"
!

decimalPrecision
    "Answer how many significant decimal digits (accuracy) this class supports"

    ^ (self precision / (10.0 log:self radix)) 
        floor   "use floor, since the last decimal digit is usually wrong"

    "
     1.0 asShortFloat decimalPrecision
     1.0 asFloat decimalPrecision   
     1.0 asLongFloat decimalPrecision     
     1.0 asQDouble decimalPrecision
     1.0 asLargeFloat decimalPrecision
    "

    "Created: / 10-10-2017 / 15:46:43 / cg"
!

defaultNumberOfDigits
    "Answer how many digits of accuracy this class supports"

    ^ self class decimalPrecision 

    "
        Float new defaultNumberOfDigits
        LongFloat new defaultNumberOfDigits
        ShortFloat new defaultNumberOfDigits
        QDouble new defaultNumberOfDigits
    "

    "Modified (comment): / 20-06-2017 / 12:58:12 / cg"
!

eBias
    "Answer the exponent's bias;
     that is the offset of the zero exponent when stored"

    ^ (1 << (self numBitsInExponent-1)) - 1
!

emax
    "return the largest exponent.
     The computation below assumes standard IEEE format"

    ^ (1 << (self numBitsInExponent-1)) - 1

    "
     Float emax       -> 1023
     ShortFloat emax  -> 127
     LongFloat emax   -> 16383
     QuadFloat emax   -> 16383
     OctaFloat emax   -> 262143
     QDouble emax     -> 1023
    "
!

emin
    ^ 1 - self emax

!

exponent
    "extract a normalized float's (unbiased) exponent.
     This is a fallback for systems which do not provide frexp in their math lib,
     and also for error reporting (NaN or Inf).
     The returned value depends on the float-representation of
     the underlying machine and is therefore highly unportable.
     This is not for general use.
     This assumes that the mantissa is normalized to
     0.5 .. 1.0 and the float's value is mantissa * 2^exp"

    |shifty expPart exp numBytes numBitsInMantissa maskMantissa numBitsInExponent maskExponent
     biasExponent numIntegerBits fractionPart|

    (self isFinite) ifFalse:[
        ^ self class
            raise:#domainErrorSignal
            receiver:self
            selector:#exponent
            arguments:#()
            errorString:'bad receiver (Inf or NaN) in exponent'
    ].    

    "Extract the bits of an IEEE anySize float "
    numBytes := self basicSize.
    numBitsInMantissa := self numBitsInMantissa. maskMantissa := (1 bitShift:numBitsInMantissa) - 1.
    numBitsInExponent := self numBitsInExponent. maskExponent := (1 bitShift:numBitsInExponent) - 1.
    numIntegerBits := self numBitsInIntegerPart.
    biasExponent := maskExponent bitShift:-1.

    shifty := LargeInteger basicNew numberOfDigits:numBytes.
    UninterpretedBytes isBigEndian ifTrue:[
        1 to:numBytes do:[:i | shifty digitAt:(numBytes+1-i) put:(self basicAt:i)].
    ] ifFalse:[
        1 to:numBytes do:[:i | shifty digitAt:i put:(self basicAt:i)].
    ].

    " Extract the sign and the biased exponent "
    expPart := (shifty rightShift:numBitsInMantissa) bitAnd: maskExponent.

    " Extract fractional part; answer 0 if this is a true 0.0 value "
    fractionPart := shifty bitAnd:maskMantissa.
    ( expPart=0 and: [ fractionPart=0 ] ) ifTrue: [ ^ 0  ].

    exp := expPart - biasExponent + 1.
    ^ exp

    "
     0.3 asFloat exponent  
     0.3 asShortFloat exponent  
     0.3 asLongFloat exponent  

     0.0 exponent2      0
     1.0 exponent2      1
     2.0 exponent2      2
     3.0 exponent2      2
     4.0 exponent2      3
     0.5 exponent2      0
     0.4 exponent2      -1
     0.25 exponent2     -1
     0.00000011111 exponent2  -23 
    "

    "Modified: / 25-08-2017 / 12:34:23 / cg"
    "Modified: / 28-05-2019 / 09:05:02 / Claus Gittinger"
!

exponentBits
    "extract the biased exponentBits"
     
    |shifty numBytes numBitsInMantissa exponent anyBit|

    (self isNaN or:[self isInfinite]) ifTrue:[
        ^ self class
            raise:#domainErrorSignal
            receiver:self
            selector:#mantissa
            arguments:#()
            errorString:'bad receiver (Inf or NaN) in mantissa'
    ].    

    "Extract the bits of an IEEE anySize float "
    numBytes := self basicSize.
    shifty := LargeInteger basicNew numberOfDigits:numBytes.
    anyBit := 0.
    UninterpretedBytes isBigEndian ifTrue:[
        1 to:numBytes do:[:i | |byte| shifty digitAt:(numBytes+1-i) put:(byte := self basicAt:i). anyBit := anyBit bitOr:byte].
    ] ifFalse:[
        1 to:numBytes do:[:i | |byte| shifty digitAt:i put:(byte := self basicAt:i). anyBit := anyBit bitOr:byte].
    ].
    anyBit == 0 ifTrue:[^ 0].

    numBitsInMantissa := self numBitsInMantissa. 
    exponent := shifty rightShift:numBitsInMantissa.
    exponent := exponent bitAnd:((1 bitShift:self numBitsInExponent)-1).
    ^ exponent
    
    "
     0.0 mantissaBits    
     0.0 exponentBits 

     1.0 mantissaBits  hexPrintString '10000000000000' 
     1.0 exponentBits                 1023 16r3FF 

     2.0 mantissaBits  hexPrintString '10000000000000' 
     2.0 exponentBits                 1024 16r400

     3.0 mantissaBits  hexPrintString '18000000000000'
     3.0 exponentBits                 1024 16r400

     4.0 mantissaBits  hexPrintString '10000000000000'
     4.0 exponentBits                 1025 16r401

     5.0 mantissaBits  hexPrintString '14000000000000'
     5.0 exponentBits                 1025 16r401

    -5.0 mantissaBits  hexPrintString '14000000000000'
    -5.0 exponentBits                 1025 16r401

     0.1 mantissaBits  hexPrintString '1999999999999A'
     0.1 exponentBits                 1019 16r3FB

     0.3 mantissaBits  hexPrintString '13333333333333'
     0.3 exponentBits                 1021 16r3FD

     0.3 asShortFloat mantissaBits    10066330 16r99999A
     0.3 asShortFloat exponentBits    125 16r7D

     0.3 asLongFloat mantissaBits     11068046444225730560 16r9999999999999800
     0.3 asLongFloat exponentBits     16381 16r3FFD

     0.3 asQDouble mantissaBits  
    "
    "Modified: / 28-05-2019 / 09:05:10 / Claus Gittinger"
!

fmax
    ^ self class fmax
!

fmin
    ^ self class fmin
!

fractionalPart
    "This has been renamed to #fractionPart for ST80 compatibility.

     extract the after-decimal fraction part.
     the floats value is 
        float truncated + float fractionalPart"

    <resource:#obsolete>

    self obsoleteMethodWarning:'please use #fractionPart'.
    ^ self fractionPart

    "Modified: / 28.10.1998 / 17:10:12 / cg"
    "Created: / 28.10.1998 / 17:10:32 / cg"
!

mantissa
    "extract a normalized float's mantissa.
     That is a float of the same type as the receiver,
     such that:
        (2 ^ f exponent) * f mantissa = f
     This is a fallback for systems which do not provide frexp in their math lib,
     and also for error reporting (NaN or Inf)."
     
    |m numBitsInMantissa numBitsInExponent maskExponent numIntegerBits
     bits "maskMantissa shifty expPart numBytes 
     biasExponent  fractionPart"|

    (self isNaN or:[self isInfinite]) ifTrue:[
        ^ self class
            raise:#domainErrorSignal
            receiver:self
            selector:#mantissa
            arguments:#()
            errorString:'bad receiver (Inf or NaN) in mantissa'
    ].
    numBitsInExponent := self numBitsInExponent. 
    numBitsInMantissa := self numBitsInMantissa.
    numIntegerBits := self numBitsInIntegerPart.

    maskExponent := (1 bitShift:numBitsInExponent) - 1.

    "/ Extract the bits of the IEEE float, mask out everything except the characteristic
    "/ and the sign; set the exponent to eBias
    m := self copy.
    bits := LargeInteger digitBytes:self digitBytes.
    "/ mask out the exponent
    bits := bits bitClear:(maskExponent bitShift:numBitsInMantissa + numIntegerBits).
    "/ set to zero
    bits := bits bitOr:(self eBias - 1 bitShift:numBitsInMantissa + numIntegerBits).
    1 to:self basicSize do:[:i | m basicAt:i put:(bits digitAt:i)].
    ^ m.

"/    "Extract the bits of an IEEE anySize float "
"/    numBytes := self basicSize.
"/    numBitsInMantissa := maskMantissa := (1 bitShift:numBitsInMantissa) - 1.
"/    biasExponent := maskExponent bitShift:-1.
"/
"/    shifty := LargeInteger basicNew numberOfDigits:numBytes.
"/    UninterpretedBytes isBigEndian ifTrue:[
"/        1 to:numBytes do:[:i | shifty digitAt:(numBytes+1-i) put:(self basicAt:i)].
"/    ] ifFalse:[
"/        1 to:numBytes do:[:i | shifty digitAt:i put:(self basicAt:i)].
"/    ].
"/
"/    expPart := (shifty rightShift:numBitsInMantissa) bitAnd: maskExponent.
"/
"/    "/ Extract fractional part; answer 0 if this is a true 0.0 value
"/    fractionPart := shifty bitAnd:maskMantissa.
"/    fractionPart == 0 ifTrue:[
"/        expPart == 0 ifTrue:[
"/            ^ 0
"/        ]
"/    ].
"/
"/    "/ or in any hidden bit
"/    numIntegerBits == 0 ifTrue:[
"/        fractionPart := fractionPart bitOr:(1 bitShift:numBitsInMantissa).
"/    ].
"/    "/ should I make it a limited precision real?
"/    ^ fractionPart / (1 bitShift:numBitsInMantissa).
    
    "
     0.3 asFloat mantissa  
     0.3 asShortFloat mantissa  
     0.3 asLongFloat mantissa  
     0.3 asQDouble mantissa  
    "

    "Created: / 20-06-2017 / 11:36:51 / cg"
    "Modified: / 25-08-2017 / 12:34:34 / cg"
    "Modified: / 28-05-2019 / 09:05:10 / Claus Gittinger"
!

mantissaBits
    "extract a normalized float's mantissaBits (incl. any hidden bit)."
     
    |shifty numBytes numBitsInMantissa maskMantissa fractionPart hiddenBit anyBit|

    (self isNaN or:[self isInfinite]) ifTrue:[
        ^ self class
            raise:#domainErrorSignal
            receiver:self
            selector:#mantissa
            arguments:#()
            errorString:'bad receiver (Inf or NaN) in mantissa'
    ].    

    "Extract the bits of an IEEE anySize float "
    numBytes := self basicSize.
    shifty := LargeInteger basicNew numberOfDigits:numBytes.
    anyBit := 0.
    UninterpretedBytes isBigEndian ifTrue:[
        1 to:numBytes do:[:i | |byte| shifty digitAt:(numBytes+1-i) put:(byte := self basicAt:i). anyBit := anyBit bitOr:byte].
    ] ifFalse:[
        1 to:numBytes do:[:i | |byte| shifty digitAt:i put:(byte := self basicAt:i). anyBit := anyBit bitOr:byte].
    ].
    anyBit == 0 ifTrue:[^ 0].

    numBitsInMantissa := self numBitsInMantissa. maskMantissa := (1 bitShift:numBitsInMantissa) - 1.
    (self numBitsInIntegerPart == 0) ifTrue:[
        hiddenBit := (1 bitShift:numBitsInMantissa).
    ] ifFalse:[
        hiddenBit := 0.
    ].

    " Extract fractional part; answer 0 if this is a true 0.0 value "
    fractionPart := shifty bitAnd:maskMantissa.
    ^ fractionPart bitOr:hiddenBit.
    
    "
     0.0 mantissaBits    
     1.0 mantissaBits  hexPrintString '10000000000000' 
     2.0 mantissaBits  hexPrintString '10000000000000' 
     3.0 mantissaBits  hexPrintString '18000000000000'
     4.0 mantissaBits  hexPrintString '10000000000000'
     5.0 mantissaBits  hexPrintString '14000000000000'
     0.1 mantissaBits  hexPrintString '1999999999999A'
     0.3 mantissaBits  hexPrintString '13333333333333'
     0.3 asShortFloat mantissaBits    10066330              16r99999A
     0.3 asLongFloat mantissaBits     11068046444225730560  16r9999999999999800
    "
    "Modified: / 28-05-2019 / 09:05:10 / Claus Gittinger"
!

numBitsInExponent
    "answer the number of bits in the exponent
     11 for double precision:
        seeeeeee eeeemmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm
     8 for single precision:
        seeeeeee emmmmmmm mmmmmmmm mmmmmmmm
     15 for long floats (x86):
        00000000 00000000 seeeeeee eeeeeeee immmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm
     15 for long floats (sparc):
        seeeeeee eeeeeeee mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...
     15 for quad floats:
        seeeeeee eeeeeeee mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...
     15 for octuple floats:
        seeeeeee eeeeeeee eeeemmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...
     other for LargeFloats"

    ^ self class numBitsInExponent

    "Created: / 28-05-2019 / 08:52:44 / Claus Gittinger"
!

numBitsInIntegerPart
    "answer the number of bits in the integer part of the mantissa.
     Most floating point formats are normalized to get rid of the extra bit.
     (i.e. except for LongFloats and LargeFloats,
      instances are normalized to exclude any integer bit"

    ^ self class numBitsInIntegerPart

    "Created: / 28-05-2019 / 08:52:52 / Claus Gittinger"
!

numBitsInMantissa
    "answer the number of bits in the mantissa (the significant)
     any hidden bits are not counted.
     11 for half precision:
        seeeemmm mmmmmmmm
     23 for single precision:
        seeeeeee emmmmmmm mmmmmmmm mmmmmmmm
     52 for double precision:
        seeeeeee eeeemmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm
     64 for longfloat precision (x86):
        00000000 00000000 seeeeeee eeeeeeee immmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm
     112 for longfloat precision (sparc):
        seeeeeee eeeeeeee mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...
     112 for quadfloat precision:
        seeeeeee eeeeeeee mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...
    "

    ^ self class numBitsInMantissa

    "
     1.0 numBitsInMantissa
     1.0 asShortFloat numBitsInMantissa
     1.0 asLongFloat numBitsInMantissa
    "

    "Created: / 28-05-2019 / 08:52:34 / Claus Gittinger"
!

precision
    "return the number of valid mantissa bits.
     Should be redefined in classes which allow per-instance precision specification"

    ^ self class precision
!

radix
    "answer the radix of the exponent
     Typically, but not required to be, this will be 2 
     (as floats ary usually represented as IEEE binary floats)"

    ^ self class radix
!

size
   "redefined since reals are kludgy (ByteArry)"

   ^ 0
! !


!LimitedPrecisionReal methodsFor:'special access'!

partValues:aBlock
    "invoke aBlock with sign, exponent and abs(mantissa)"

    ^ aBlock 
        value:self sign
        value:self exponent
        value:self mantissa abs

    "
     1.0 partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
     2.0 partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
     -1.0 partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
     -2.0 partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].

     1.0 asShortFloat partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
     1.0 asLongFloat partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
     1.0 asLargeFloat partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
    "

    "Created: / 26-05-2019 / 03:11:28 / Claus Gittinger"
    "Modified (comment): / 29-05-2019 / 03:49:20 / Claus Gittinger"
! !

!LimitedPrecisionReal methodsFor:'testing'!

isFinite
    "return true, if the receiver is a finite float (not NaN and not +/-INF)"

   ^ self subclassResponsibility

    "Created: / 7.1.1998 / 12:02:06 / stefan"
!

isFloat
    "return true, if the receiver is some kind of floating point number;
     false is returned here.
     Same as #isLimitedPrecisionReal, but a better name ;-)"

    ^ true

    "Created: / 14.11.2001 / 14:57:55 / cg"
!

isInfinite
    "return true, if the receiver is an infinite float (+Inf or -Inf).
     These are not created by ST/X float operations (they raise an exception);
     however, inline C-code could produce them."

    ^ (self isFinite or:[self isNaN]) not.

    "
        1.0 isInfinite
        (0.0 uncheckedDivide: 0.0) isInfinite
        (1.0 uncheckedDivide: 0.0) isInfinite
    "

    "Modified: / 7.1.1998 / 12:01:30 / stefan"
!

isLimitedPrecisionReal
    "return true, if the receiver is some kind of limited precision real (i.e. floating point) number;
     true is returned here - the method is redefined from Object."

    ^ true
!

isNaN
    "return true, if the receiver is an invalid float (NaN - not a number).
     These are not created by ST/X float operations (they raise an exception);
     however, inline C-code could produce them."

   ^ self subclassResponsibility

    "Modified: 12.2.1997 / 16:45:27 / cg"
!

isNegativeZero
    "many systems have two float.Pnt zeros"

    ^ (self isZero) and:[self printString first == $-]

    "
     0.0 asLongFloat isNegativeZero     
     -0.0 asLongFloat isNegativeZero       
     -1.0 asLongFloat isNegativeZero       
     1.0 asLongFloat isNegativeZero       

     0.0 asLargeFloat isNegativeZero     
     -0.0 asLargeFloat isNegativeZero       
    "
    "
     0.0 asLongFloat isZero                      
     -0.0 asLongFloat isZero       

     0.0 > 0   
     -0.0 < 0       
    "
!

isZero
    "return true, if the receiver is zero"

    ^ self = self class zero

    "Created: / 10-06-2019 / 21:58:45 / Claus Gittinger"
!

numberOfBits
    "return the size (in bits) of the real;
     typically, this is 64 for Floats and 32 for ShortFloats,
     but who knows ..."

    self subclassResponsibility
!

positive
    "return true if the receiver is greater or equal to zero (not negative)"

    ^ self sign >= 0

    "Modified: 17.4.1996 / 13:35:10 / cg"
!

sign
    "return the sign of the receiver (-1, 0 or 1)"

    self strictlyPositive  ifTrue:[^ 1].
    self negative ifTrue:[^ -1].
    ^ 0

    "
     -1.0 sign
     -0.0 sign
     1.0 sign
     0.0 sign
     Infinity infinity sign
     Infinity infinity negated sign
    "
! !

!LimitedPrecisionReal methodsFor:'truncation & rounding'!

ceilingAsFloat
    ^ self coerce:(self ceiling).

    "
     0.4 asLongFloat ceilingAsFloat
    "
!

floorAsFloat
    ^ self coerce:(self floor).

    "
     0.4 asLongFloat floorAsFloat
    "
!

integerAndFractionParts
    "return the integer and the fraction part of the receiver as a pair
     of floats (i.e. the result of the modf function).
     Adding the parts gives the original value"

    |iPart|

    iPart := self truncatedAsFloat.
    ^ { iPart . (self - iPart) } 
!

integerPart
    "return a float with value from digits before the decimal point
     (i.e. the truncated value)"

    ^ self truncatedAsFloat
    "/ ^ self truncated asFloat

    "
     1234.56789 integerPart  
     1.2345e6 integerPart
     12.5 integerPart
     -12.5 integerPart
     (5/3) integerPart
     (-5/3) integerPart
     (5/3) truncated
     (-5/3) truncated
    "

    "Created: / 28.10.1998 / 17:14:56 / cg"
    "Modified: / 5.11.2001 / 17:54:22 / cg"
!

roundedAsFloat
    self negative ifTrue:[ ^ (self - 0.5) ceilingAsFloat ].
    ^ (self + 0.5) floorAsFloat
!

truncatedAsFloat
    self positive ifTrue:[
        ^ self floorAsFloat
    ].
    ^ self ceilingAsFloat
    "/ ^ self coerce:(self truncated).

    "
     0.4 asLongFloat truncatedAsFloat
    "
!

truncatedToPrecision
    "truncates to the precision of the float.
     This is slightly different from truncated.
     Taking for example 1e32,
     the printed representation will be 1e32,
     but the actual value, when truncating to an integer
     would be 100000003318135351409612647563264.

     This is due to the inaccuracy in the least significant bits,
     and the way the print-converter compensates for this.
     This method tries to generate an integer value which corresponds
     to what is seen in the float's printString.

     Here, a slow fallback (generating and rescanning the printString)
     is provided, which should work on any float number.
     Specialized versions in subclasses may be added for more performance
     (however, this is probably only used rarely)"

    |pow10 printString intVal s sign nFract ch expSign exp pI|

    pow10 := #(10 100 1000 10000 100000 1000000 10000000 100000000 1000000000).

    printString := self printString.
    intVal := 0.
    sign := 1.
    nFract := 0.
    exp := 0.

    "/ fetch the mantissa
    s := printString readStream.
    s peek == $- ifTrue:[
        sign := -1.
        s next.
    ].
    intVal := s nextDecimalInteger.
    s peek == $. ifTrue:[
        s next.
        [(ch := s peek) notNil and:[ch isDigit]] whileTrue:[
            intVal := intVal * 10 + (s next digitValue).
            nFract := nFract + 1.
        ].
    ].
    ('eEdDqQ' includes:s peek) ifTrue:[
        expSign := 1.
        s next.
        (ch := s peek) == $- ifTrue:[
            expSign := -1.
            s next.
        ] ifFalse:[
            ch == $+ ifTrue:[
                s next.
            ].
        ].
        exp := s nextDecimalInteger.
        exp := exp * expSign.
    ].
    exp := exp - nFract.
    exp < 0 ifTrue:[
        [exp < 0] whileTrue:[
            pI := (exp negated) min:pow10 size.
            intVal := intVal / (pow10 at:pI).
            exp := exp + pI.
        ].
        intVal := intVal asInteger.
    ] ifFalse:[
        [exp > 0] whileTrue:[
            pI := exp min:pow10 size.
            intVal := intVal * (pow10 at:pI).
            exp := exp - pI.
        ].
    ].
    ^ intVal

    "
     1e32 asShortFloat truncated
     1e32 asShortFloat truncatedToPrecision
     1.234e10 asShortFloat truncatedToPrecision
     1234e-1 asShortFloat truncatedToPrecision

     1e32 truncated
     1e32 truncatedToPrecision
     1.234e10 truncatedToPrecision
     1234e-1 truncatedToPrecision

     1e32 asLongFloat truncated
     1e32 asLongFloat truncatedToPrecision
     1.234e10 asLongFloat truncatedToPrecision
     1234e-1 asLongFloat truncatedToPrecision
    "
! !

!LimitedPrecisionReal methodsFor:'visiting'!

acceptVisitor:aVisitor with:aParameter
    "dispatch for visitor pattern; send #visitFloat:with: to aVisitor."

    ^ aVisitor visitFloat:self with:aParameter
! !

!LimitedPrecisionReal class methodsFor:'documentation'!

version
    ^ '$Header$'
!

version_CVS
    ^ '$Header$'
! !


LimitedPrecisionReal initialize!
author	Claus Gittinger <cg@exept.de>
	Mon, 27 Jan 2020 13:47:24 +0100
changeset 25204	b12f8693fe6f
parent 25068	54fe36ff8729
permissions	-rw-r--r--