FuzzyMatcher.st
author Claus Gittinger <cg@exept.de>
Fri, 14 Jul 2017 13:42:18 +0200
changeset 4475 2e19c5a7452a
parent 4474 98208d107b52
child 4476 65686f14ebf4
permissions -rw-r--r--
#DOCUMENTATION by cg class: FuzzyMatcher comment/format in: #indexScore #initialize #pattern: changed: #match:ifScored: class: FuzzyMatcher class added: #allWithScoresSortedByScoreMatching:in:by: comment/format in: #allMatching:in:by: #allSortedByScoreMatching:in: changed: #allSortedByScoreMatching:in:by: category of: #pattern:
"{ Package: 'stx:libbasic2' }"

"{ NameSpace: Smalltalk }"

Object subclass:#FuzzyMatcher
	instanceVariableNames:'pattern lowercasePattern indexes'
	classVariableNames:''
	poolDictionaries:''
	category:'Collections-Text-Support'
!

!FuzzyMatcher class methodsFor:'documentation'!

documentation
"
    FuzzyMatcher is an approximate string matching algorithm that determines whether a string
    contains the characters of a given pattern in the same order, though not necessarily adjacent.
    For example, the string 'axby' matches both the patterns 'ab' and 'ay', but not 'ba'.

    The algorithm is based on lib_fts[1], and includes an optional scoring algorithm
    that can be used to sort all the matches based on their similarity to the pattern.
    lib_fts is a reverse-engineered approximation of the fuzzy matching found in the Sublime Text editor.
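
    A minimal usage sketch of the example above (an illustration, not part of the
    original documentation). It only uses #pattern: and #matches: as they are called
    elsewhere in this class; the expected answers are inferred from the description
    above, not taken from a recorded run:

        (FuzzyMatcher pattern:'ab') matches:'axby'.     - should answer true
        (FuzzyMatcher pattern:'ay') matches:'axby'.     - should answer true
        (FuzzyMatcher pattern:'ba') matches:'axby'.     - should answer false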

    [see also:]
        https://blog.forrestthewoods.com/reverse-engineering-sublime-text-s-fuzzy-match-4cffeed33fdb
        https://github.com/forrestthewoods/lib_fts

"
!

example
"
    |top lv list field patternHolder names|

    patternHolder := '' asValue.
    list := List new.

    top := StandardSystemView new.
    lv := ListView origin:(0.0@30) corner:(1.0@1.0) in:top.
    lv model:list.

    field := EditField origin:(0.0@0.0) corner:(1.0@30) in:top.
    field model:patternHolder.
    field immediateAccept:true.

    names := Smalltalk allClasses collect:#name.

    patternHolder
        onChangeEvaluate:[
            |matcher pattern matches|

            pattern := patternHolder value.
            pattern notEmpty ifTrue:[
                matcher := FuzzyMatcher pattern:pattern.

                matches := OrderedCollection new.

                names do:[:eachClassName |
                    matcher
                        match:eachClassName
                        ifScored: [ :score | matches add: eachClassName -> score ]
                ].
                matches sort:[:a :b |
                        a value < b value
                        or:[ a value = b value and:[ a key > b key]]
                ].

                list removeAll.
                list addAllReversed:(matches
                                collect:[:nameScoreAssoc |
                                    '[%1] %2' bindWith:nameScoreAssoc value with:nameScoreAssoc key])
            ].
        ].
    top open.
    patternHolder value:'mph'.
"
! !

!FuzzyMatcher class methodsFor:'instance creation'!

new
    "return an initialized instance"

    ^ self basicNew initialize.
!

pattern: aString

	^self new pattern: aString
! !

!FuzzyMatcher class methodsFor:'utilities api'!

allMatching: aPattern in: aCollection
    "Assumes that the collection is a collection of Strings"

    | matcher |

    matcher := self pattern: aPattern.

    ^ aCollection select: [ :each | matcher matches: each ]

    "
     self
        allMatching:'clu'
        in:(Smalltalk allClasses collect:#name)
    "

    "Modified (comment): / 14-07-2017 / 12:19:05 / cg"
!

allMatching: aPattern in: aCollection by: aBlockReturningString
    "selects matching elements from aCollection.
     aBlockReturningString is applied to elements to get the string representation
     (can be used e.g. to select classes by their name)"

    | matcher |

    matcher := self pattern: aPattern.

    ^ aCollection select: [ :each | matcher matches: (aBlockReturningString value: each) ]

        "
         self
            allMatching:'clu'
            in:(Smalltalk allClasses)
            by:[:cls | cls name]
        "
        "
         self
            allMatching:'clu'
            in:(Smalltalk allClasses)
            by:#name
        "

    "Modified (comment): / 14-07-2017 / 12:21:40 / cg"
!

allSortedByScoreMatching: aPattern in: aCollection
    "Assumes that the collection is a collection of Strings;
     returns matching strings sorted by score (level of similarity)"

    ^ self allSortedByScoreMatching: aPattern in: aCollection by: [ :each | each ]

    "
     self
        allSortedByScoreMatching:'clu'
        in:(Smalltalk allClasses collect:#name)
    "
    "
     self
        allSortedByScoreMatching:'nary'
        in:(Smalltalk allClasses collect:#name)
    "

    "Modified (comment): / 14-07-2017 / 12:22:14 / cg"
!

allSortedByScoreMatching: aPattern in: aCollection by: aBlockReturningString
    "selects matching elements from aCollection.
     aBlockReturningString is applied to elements to get the string representation.
     Returns them sorted by score (i.e. similarity), best match first.
     (can be used e.g. to sort classes)"

    | matchesAndScores |

    matchesAndScores := self allWithScoresSortedByScoreMatching: aPattern in: aCollection by: aBlockReturningString.
    ^ matchesAndScores collect: [ :each | each value ]

    "
     self
        allSortedByScoreMatching:''
        in:(Smalltalk allClasses)
        by:[:cls | cls name]
    "
    "
     self
        allSortedByScoreMatching:'nary'
        in:(Smalltalk allClasses)
        by:[:cls | cls name]
    "
    "
     self
        allSortedByScoreMatching:'nary'
        in:(Smalltalk allClasses)
        by:#name
    "

    "Modified: / 14-07-2017 / 12:43:14 / cg"
!

allWithScoresSortedByScoreMatching: aPattern in: aCollection by: aBlockReturningString
    "selects matching elements from aCollection.
     aBlockReturningString is applied to elements to get the string representation.
     Returns them as (score -> element) associations,
     sorted by descending score (i.e. best match first).
     (can be used e.g. to sort classes)"

    |matcher matches|


    matcher := self pattern: aPattern.
    matches := OrderedCollection new: aCollection size // 2.

    aCollection do: [ :each |
        matcher
            match: (aBlockReturningString value: each)
            ifScored: [ :score | matches add: score -> each ]
    ].
    matches sort: [ :a :b | a key > b key].
    ^ matches asArray

    "
     self
        allWithScoresSortedByScoreMatching:''
        in:(Smalltalk allClasses)
        by:[:cls | cls name]
    "
    "
     self
        allWithScoresSortedByScoreMatching:'OC'
        in:(Smalltalk allClasses)
        by:[:cls | cls name]
    "
    "
     self
        allWithScoresSortedByScoreMatching:'nary'
        in:(Smalltalk allClasses)
        by:[:cls | cls name]
    "
    "
     self
        allWithScoresSortedByScoreMatching:'nary'
        in:(Smalltalk allClasses)
        by:#name
    "
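    "
     a sketch (not from the original source) of consuming the result,
     assuming it is an Array of (score -> element) associations as built above;
     Transcript showCR: is used here only for illustration:

     (self
        allWithScoresSortedByScoreMatching:'nary'
        in:(Smalltalk allClasses)
        by:#name)
            do:[:scoreAndClass |
                Transcript showCR:(scoreAndClass key printString , ' ' , scoreAndClass value name)
            ]
    "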

    "Created: / 14-07-2017 / 12:25:19 / cg"
! !

!FuzzyMatcher methodsFor:'accessing'!

pattern

	^ pattern
!

pattern: aString

        pattern := aString.
        lowercasePattern := pattern asLowercase.
        indexes := Array new: pattern size.

    "Modified (format): / 14-07-2017 / 12:59:15 / cg"
! !

!FuzzyMatcher methodsFor:'comparing'!

match: aString ifScored: aBlock
        "if aString matches the pattern, evaluate aBlock with the match score;
         otherwise do nothing. For an empty pattern, every string is scored
         with the negated size of aString"

        | score |

        pattern ifEmpty: [ aBlock value: "0" aString size negated. ^ self ].
        (self matches: aString) ifFalse: [ ^ self ].

        score := self firstScore: aString at: indexes first.

        2 to: pattern size do: [ :pix |
                score := score + (self score: aString at: (indexes at: pix) patternAt: pix)
        ].

        score := score + self indexScore + ((aString size - pattern size) * self unmatchedLetterPenalty).
parents: 4474
diff changeset
   271
                
2e19c5a7452a #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4474
diff changeset
   272
        aBlock value: score.
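
    "Usage sketch (not part of the original source; assumes the receiver is
     set up via new and pattern:, and uses Transcript showCR: for output).
     The expected values were worked out by hand from the bonus and penalty
     constants in this class:

        |m|
        m := FuzzyMatcher new pattern: 'abc'; yourself.
        m match: 'abc' ifScored: [:s | Transcript showCR: s printString].
            - first letter 7 + 12, then 2 * (5 + 3 + 7), run bonus 5: 54
        m match: 'a_b_c' ifScored: [:s | Transcript showCR: s printString].
            - 19 + 2 * (5 + 7) + 0, minus 2 unmatched letters: 41
        m match: 'xyz' ifScored: [:s | Transcript showCR: s printString].
            - no match, so the block is not evaluated
    "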

    "Modified: / 14-07-2017 / 12:44:50 / cg"
!

matches: aString
    "return true if all pattern characters occur in aString, in order
     and ignoring case. As a side effect, remember in indexes the position
     at which each pattern character was found (leftmost possible match)."

    | idx |

    pattern size > aString size ifTrue: [ ^ false ].

    idx := 0.
    pattern withIndexDo: [ :each :i |
        idx := aString
            findString: each asString
            startingAt: idx + 1
            caseSensitive: false.

        idx == 0 ifTrue: [ ^ false ].
        indexes at: i put: idx.
    ].

    ^ true
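
    "Example sketch (an illustration, not from the original source; assumes
     the receiver is created with new and configured with pattern:):

        |m|
        m := FuzzyMatcher new pattern: 'abc'; yourself.
        m matches: 'aXbXc'.     - true, characters found at 1, 3 and 5
        m matches: 'ABC'.       - true, the search ignores case
        m matches: 'acbd'.      - false, no $c after the $b
        m matches: 'ab'.        - false, shorter than the pattern
    "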
! !

!FuzzyMatcher methodsFor:'initialization'!

initialize
    super initialize.

    pattern := lowercasePattern := ''.
    indexes := #().

    "Modified (format): / 14-07-2017 / 13:23:26 / cg"
! !

!FuzzyMatcher methodsFor:'private'!

firstScore: aString at: anIndex
    "score for the first pattern character, found at anIndex in aString"

    | score |

    score := (aString at: anIndex) = pattern first
        ifTrue: [ self caseEqualBonus ]
        ifFalse: [ 0 ].

    anIndex = 1 ifTrue: [ ^ score + self firstLetterBonus ].

    "both penalties are negative, so max: caps the total leading penalty"
    score := score + (((anIndex - 1) * self leadingLetterPenalty) max: self maxLeadingLetterPenalty).

    ^ score
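
    "Worked arithmetic (a sketch derived by hand from the constants in this
     class, for a first pattern character that matches in case):

        anIndex = 1:   7 + 12                   =  19
        anIndex = 3:   7 + ((2 * -3) max: -9)   =   1
        anIndex = 6:   7 + ((5 * -3) max: -9)   =  -2
    "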
!

indexScore
    "bonus for runs of adjacent matches; ramp grows by the factor
     (1 + adjacencyIncrease) for each consecutive adjacent pair and
     resets to 1 when a gap is found"

    | sum ramp |

    ramp := 1.
    sum := 0.

    1 to: indexes size - 1 do: [ :ix |
        ramp := (indexes at: ix) + 1 = (indexes at: ix + 1)
            ifTrue: [ ramp + (ramp * self adjacencyIncrease) ]
            ifFalse: [ 1 ].

        sum := sum + ramp - 1
    ].

    ^ sum rounded
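
    "Worked arithmetic (a sketch, computed by hand with adjacencyIncrease = 1.2):

        indexes = #(1 2 3):  ramp 2.2, then 4.84; sum 1.2 + 3.84 = 5.04, rounded 5
        indexes = #(1 3 5):  no adjacent pair; ramp stays 1, sum 0
        indexes = #(1 2 5):  ramp 2.2, then reset to 1; sum 1.2, rounded 1
    "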

    "Modified (format): / 14-07-2017 / 13:24:07 / cg"
!

isSeparator: aCharacter
    "return true for the characters treated as word separators ($_ and $:)"

    ^ aCharacter = $_ or: [ aCharacter = $: ]

    "Created: / 13-07-2017 / 13:30:34 / cg"
!

isSeperator: aCharacter
    "obsolete (misspelled) alias for isSeparator:"

    <resource: #obsolete>
    ^ self isSeparator: aCharacter

    "Modified: / 13-07-2017 / 13:31:18 / cg"
!

score: aString at: stringIndex patternAt: patternIndex
    "score for a pattern character other than the first, found at
     stringIndex in aString; looks at the preceding character of aString"

    | score prev |

    prev := (aString at: stringIndex - 1).

    score := (self isSeparator: prev)
        ifTrue: [ self separatorBonus ]
        ifFalse: [ (prev asLowercase = (lowercasePattern at: patternIndex - 1))
            ifTrue: [
                self adjacencyBonus +
                ((prev = (pattern at: patternIndex - 1)) ifTrue: [ self adjacentCaseEqualBonus ] ifFalse: [ 0 ])
            ]
            ifFalse: [ 0 ]
        ].

    (aString at: stringIndex) = (pattern at: patternIndex) ifTrue: [
        score := score + self caseEqualBonus.
    ].

    ^ score
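
    "Worked arithmetic (a sketch derived by hand from the constants in this
     class; the pattern is 'ab' and its second character is being scored):

        'ab'  at: 2    prev $a equals pattern $a: 5 + 3, same case 7       = 15
        'Ab'  at: 2    prev $A equals only ignoring case: 5, same case 7   = 12
        'a_b' at: 3    prev is a separator: 5, same case 7                 = 12
        'axb' at: 3    prev $x matches nothing: 0, same case 7             =  7
        'axB' at: 3    prev $x matches nothing: 0, case differs 0          =  0
    "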

    "Modified: / 13-07-2017 / 13:30:57 / cg"
! !

!FuzzyMatcher methodsFor:'scoring-bonus'!

adjacencyBonus
    "awarded when the character before a match equals the previous
     pattern character (ignoring case)"

    ^ 5
!

adjacencyIncrease
    "growth rate of the run bonus computed in indexScore"

    ^ 1.2
!

adjacentCaseEqualBonus
    "extra bonus when that preceding character also matches in case"

    ^ 3
!

caseEqualBonus
    "awarded when a matched character has the same case as the pattern character"

    ^ 7
!

firstLetterBonus
    "awarded when the first pattern character matches the first character of the string"

    ^ 12
!

separatorBonus
    "awarded when a match directly follows a separator (see isSeparator:)"

    ^ 5
! !

!FuzzyMatcher methodsFor:'scoring-penalty'!

leadingLetterPenalty
    "penalty per string character that precedes the first match"

    ^ -3
!

maxLeadingLetterPenalty
    "lower bound for the total leading-letter penalty"

    ^ -9
!

unmatchedLetterPenalty
    "penalty per string character not covered by the pattern"

    ^ -1
! !

!FuzzyMatcher class methodsFor:'documentation'!

version
    ^ '$Header$'
!

version_CVS
    ^ '$Header$'
! !