FuzzyMatcher.st
author Claus Gittinger <cg@exept.de>
Sat, 15 Jul 2017 15:28:25 +0200
changeset 4477 99941fe21a09
parent 4476 65686f14ebf4
child 4492 05def04efc34
permissions -rw-r--r--
#FEATURE by cg class: FuzzyMatcher added: #indexes

"{ Package: 'stx:libbasic2' }"

"{ NameSpace: Smalltalk }"

Object subclass:#FuzzyMatcher
	instanceVariableNames:'pattern lowercasePattern indexes'
	classVariableNames:''
	poolDictionaries:''
	category:'Collections-Text-Support'
!

!FuzzyMatcher class methodsFor:'documentation'!

documentation
"
    FuzzyMatcher is an approximate string matcher that determines whether a string contains
    the characters of a given pattern in the same order, though not necessarily adjacent.
    For example, the string 'axby' matches both the patterns 'ab' and 'ay', but not 'ba'.

    The algorithm is based on lib_fts (see the links below), and includes an optional scoring
    algorithm that can be used to sort all the matches based on their similarity to the pattern.
    It is modeled on the fuzzy match of the Sublime Text editor.
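
    A minimal usage sketch of the example above (it assumes the matches: query
    that the class-side examples of this class also use; per the description,
    the first two expressions should answer true and the last one false):

        (FuzzyMatcher pattern:'ab') matches:'axby'.
        (FuzzyMatcher pattern:'ay') matches:'axby'.
        (FuzzyMatcher pattern:'ba') matches:'axby'.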

    [see also:]
        https://blog.forrestthewoods.com/reverse-engineering-sublime-text-s-fuzzy-match-4cffeed33fdb
        https://github.com/forrestthewoods/lib_fts

"
!

example
"
    |top lv list field patternHolder names|

    patternHolder := '' asValue.
    list := List new.

    top := StandardSystemView new.
    lv := ListView origin:(0.0@30) corner:(1.0@1.0) in:top.
    lv model:list.

    field := EditField origin:(0.0@0.0) corner:(1.0@30) in:top.
    field model:patternHolder.
    field immediateAccept:true.

    names := Smalltalk allClasses collect:#name.

    patternHolder
        onChangeEvaluate:[
            |matcher pattern matches|

            pattern := patternHolder value.
            pattern notEmpty ifTrue:[
                matcher := FuzzyMatcher pattern:pattern.

                matches := OrderedCollection new.

                names do:[:eachClassName |
                    matcher
                        match:eachClassName
                        ifScored: [ :score | matches add: eachClassName -> score ]
                ].
                matches sort:[:a :b |
                        a value < b value
                        or:[ a value = b value and:[ a key > b key]]
                ].

                list removeAll.
                list addAllReversed:(matches
                                collect:[:nameScoreAssoc |
                                    '[%1] %2' bindWith:nameScoreAssoc value with:nameScoreAssoc key])
            ].
        ].
    top open.
    patternHolder value:'mph'.
"
! !

!FuzzyMatcher class methodsFor:'instance creation'!

new
    "return an initialized instance"

    ^ self basicNew initialize.
!

pattern: aString
    "return a new matcher for the given pattern string"

    ^ self new pattern: aString

    "
     (self pattern:'mrp') matches:'ButtonMorph'
    "

    "Modified (comment): / 14-07-2017 / 15:02:43 / cg"
! !

!FuzzyMatcher class methodsFor:'utilities api'!

allMatching: aPattern in: aCollection
    "selects the elements of aCollection which match aPattern.
     Assumes that the collection is a collection of Strings"

    | matcher |

    matcher := self pattern: aPattern.

    ^ aCollection select: [ :each | matcher matches: each ]

    "
     self
        allMatching:'clu'
        in:(Smalltalk allClasses collect:#name)
    "

    "Modified (comment): / 14-07-2017 / 12:19:05 / cg"
!

allMatching: aPattern in: aCollection by: aBlockReturningString
    "selects matching elements from aCollection.
     aBlockReturningString is applied to each element to get its string representation
     (can be used e.g. to select classes by name)"

    | matcher |

    matcher := self pattern: aPattern.

    ^ aCollection select: [ :each | matcher matches: (aBlockReturningString value: each) ]

        "
         self
            allMatching:'clu'
            in:(Smalltalk allClasses)
            by:[:cls | cls name]
        "
        "
         self
            allMatching:'clu'
            in:(Smalltalk allClasses)
            by:#name
        "

    "Modified (comment): / 14-07-2017 / 12:21:40 / cg"
!

allSortedByScoreMatching: aPattern in: aCollection
    "Assumes that the collection is a collection of Strings;
     returns the matching strings sorted by score (level of similarity), highest score first"

    ^ self allSortedByScoreMatching: aPattern in: aCollection by: [ :each | each ]

    "
     self
        allSortedByScoreMatching:'clu'
        in:(Smalltalk allClasses collect:#name)
    "
    "
     self
        allSortedByScoreMatching:'nary'
        in:(Smalltalk allClasses collect:#name)
    "

    "Modified (comment): / 14-07-2017 / 12:22:14 / cg"
!

allSortedByScoreMatching: aPattern in: aCollection by: aBlockReturningString
    "selects matching elements from aCollection.
     aBlockReturningString is applied to each element to get its string representation.
     Returns them sorted by score (i.e. similarity), highest score first.
     (can be used e.g. to sort classes)"

    | matchesAndScores |

    matchesAndScores := self allWithScoresSortedByScoreMatching: aPattern in: aCollection by: aBlockReturningString.
    ^ matchesAndScores collect: [ :each | each value ]

    "
     self
        allSortedByScoreMatching:''
        in:(Smalltalk allClasses)
        by:[:cls | cls name]
    "
    "
     self
        allSortedByScoreMatching:'nary'
        in:(Smalltalk allClasses)
        by:[:cls | cls name]
    "
    "
     self
        allSortedByScoreMatching:'nary'
        in:(Smalltalk allClasses)
        by:#name
    "

    "Modified: / 14-07-2017 / 12:43:14 / cg"
!

allWithScoresSortedByScoreMatching: aPattern in: aCollection by: aBlockReturningString
    "selects matching elements from aCollection.
     aBlockReturningString is applied to each element to get its string representation.
     Returns an Array of (score -> element) associations,
     sorted by score (i.e. similarity), highest score first.
     (can be used e.g. to sort classes)"

    |matcher matches|

    matcher := self pattern: aPattern.
    matches := OrderedCollection new: aCollection size // 2.

    aCollection do: [ :each |
        matcher
            match: (aBlockReturningString value: each)
            ifScored: [ :score | matches add: score -> each ]
    ].
    matches sort: [ :a :b | a key > b key].
    ^ matches asArray
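
    "
     Sketch (not from the original source): prints the score and name of at most
     the five best matches. Each result is a (score -> element) association,
     so the score is the key and the class is the value.

     |results|
     results := self
                    allWithScoresSortedByScoreMatching:'nary'
                    in:(Smalltalk allClasses)
                    by:#name.
     results from:1 to:(5 min:results size) do:[:assoc |
         Transcript showCR:('[%1] %2' bindWith:assoc key with:assoc value name)
     ].
    "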

    "
     self
        allWithScoresSortedByScoreMatching:''
        in:(Smalltalk allClasses)
        by:[:cls | cls name]
    "
    "
     self
        allWithScoresSortedByScoreMatching:'OC'
        in:(Smalltalk allClasses)
        by:[:cls | cls name]
    "
    "
     self
        allWithScoresSortedByScoreMatching:'nary'
        in:(Smalltalk allClasses)
        by:[:cls | cls name]
    "
    "
     self
        allWithScoresSortedByScoreMatching:'nary'
        in:(Smalltalk allClasses)
        by:#name
    "

    "Created: / 14-07-2017 / 12:25:19 / cg"
! !

!FuzzyMatcher methodsFor:'accessing'!

indexes
    "only valid inside the match callback block"

    ^ indexes
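
    "
     Usage sketch (an assumption, not from the original source): it presumes that
     match:ifScored: fills indexes with the positions in the candidate string at
     which the pattern characters were found; that code is not part of this excerpt.

     |m|
     m := FuzzyMatcher pattern:'mrp'.
     m match:'ButtonMorph' ifScored:[:score |
         Transcript showCR:('score %1 indexes %2' bindWith:score with:m indexes printString)
     ].
    "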

    "Created: / 15-07-2017 / 14:57:10 / cg"
!

pattern
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   256
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   257
	^ pattern 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   258
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   259
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   260
pattern: aString
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   261
4475
2e19c5a7452a #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4474
diff changeset
   262
        pattern := aString.
2e19c5a7452a #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4474
diff changeset
   263
        lowercasePattern := pattern asLowercase.
2e19c5a7452a #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4474
diff changeset
   264
        indexes := Array new: pattern size.
2e19c5a7452a #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4474
diff changeset
   265
2e19c5a7452a #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4474
diff changeset
   266
    "Modified (format): / 14-07-2017 / 12:59:15 / cg"
4459
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   267
! !

!FuzzyMatcher methodsFor:'comparing'!

match: aString ifScored: aBlock
    "if aString matches the pattern, evaluate aBlock with the numeric score (higher is better).
     aBlock is not evaluated if aString does not match.
     An empty pattern matches every string; the score is then the negated string size"

    | score |

    pattern ifEmpty: [ aBlock value: "0" aString size negated. ^ self ].
    (self matches: aString) ifFalse: [ ^ self ].

    score := self firstScore: aString at: indexes first.

    2 to: pattern size do: [ :pix |
        score := score + (self score: aString at: (indexes at: pix) patternAt: pix)
    ].

    score := score + self indexScore + ((aString size - pattern size) * self unmatchedLetterPenalty).

    aBlock value: score.

    "Modified: / 14-07-2017 / 12:44:50 / cg"
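
    "
     Usage sketch (added for illustration; it is not part of the original checkin
     and relies only on #pattern:, #match:ifScored: and #indexes from this class):

        |matcher|

        matcher := FuzzyMatcher new pattern:'abc'.
        matcher
            match:'xaxbxc'
            ifScored:[:score |
                Transcript showCR:score printString.
                Transcript showCR:matcher indexes printString.
            ].

     Evaluating the constants of this class by hand, the matched positions
     should be 2, 4 and 6, and the score should be 15:
        first letter:   7 (same case) + (2 - 1) * -3 (leading penalty)  =  4
        letters 2, 3:   7 + 7 (same case, no adjacency or separator)    = 14
        index score:    0 (no adjacent matches)
        unmatched:      (6 - 3) * -1                                    = -3
    "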
!

matches: aString
    "answer true if all pattern characters occur in aString in the same order
     (ignoring case, not necessarily adjacent).
     As a side effect, the position of each matched character is stored in indexes"

    | idx |

    pattern size > aString size ifTrue: [ ^ false ].

    idx := 0.
    pattern withIndexDo: [ :each :i |
        idx := aString
            findString: each asString
            startingAt: idx + 1
            caseSensitive: false.

        idx == 0 ifTrue: [ ^ false ].
        indexes at: i put: idx.
    ].

    ^ true
! !

!FuzzyMatcher methodsFor:'initialization'!

initialize

    super initialize.

    pattern := lowercasePattern := ''.
    indexes := #().

    "Modified (format): / 14-07-2017 / 13:23:26 / cg"
! !

!FuzzyMatcher methodsFor:'private'!

firstScore: aString at: anIndex
    "answer the score contribution of the first pattern character,
     which was matched at anIndex in aString:
     a bonus for equal case, plus either the first-letter bonus (match at position 1)
     or a capped penalty for the skipped leading characters"

    | score |

    score := (aString at: anIndex) = pattern first
        ifTrue: [ self caseEqualBonus ]
        ifFalse: [ 0 ].

    anIndex = 1 ifTrue: [ ^ score + self firstLetterBonus ].

    score := score + (((anIndex - 1) * self leadingLetterPenalty) max: self maxLeadingLetterPenalty).

	^ score 
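
    "
     Worked values (added for illustration, computed by hand from the constants
     in this class, assuming the first character also matches in case):

        anIndex = 1:    7 + 12                      = 19
        anIndex = 3:    7 + ((2 * -3) max: -9)      =  1
        anIndex = 10:   7 + ((9 * -3) max: -9)      = -2

     Because the penalties are negative, max: caps the leading penalty at -9,
     so a late first match never costs more than three skipped characters.
    "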
!

indexScore
    "answer a bonus for runs of adjacent matched characters.
     The ramp grows with every further adjacent match
     and is reset to 1 as soon as a gap is found"

    | sum ramp |

    ramp := 1.
    sum := 0.

    1 to: indexes size - 1 do: [ :ix |
        ramp := (indexes at: ix) + 1 = (indexes at: ix + 1)
            ifTrue: [ ramp + (ramp * self adjacencyIncrease) ]
            ifFalse: [ 1 ].

        sum := sum + ramp - 1
    ].

    ^ sum rounded

    "Modified (format): / 14-07-2017 / 13:24:07 / cg"
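
    "
     Worked values (added for illustration, computed by hand with adjacencyIncrease = 1.2):

        indexes = #(1 2 3), two adjacent steps:
            ramp = 1 + (1 * 1.2)     = 2.2      sum = 0 + 2.2 - 1     = 1.2
            ramp = 2.2 + (2.2 * 1.2) = 4.84     sum = 1.2 + 4.84 - 1  = 5.04
            answer: 5

        indexes = #(1 2 4), one adjacent step followed by a gap:
            ramp = 2.2                          sum = 1.2
            ramp = 1 (reset)                    sum = 1.2 + 1 - 1     = 1.2
            answer: 1
    "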
!

isSeparator: aCharacter
    "answer true if aCharacter separates words inside an identifier or selector"

    ^ aCharacter = $_ or: [ aCharacter = $: ]

    "Created: / 13-07-2017 / 13:30:34 / cg"
!

isSeperator: aCharacter
    <resource: #obsolete>
    "misspelled selector, marked obsolete; use #isSeparator: instead"

    ^ self isSeparator: aCharacter

    "Modified: / 13-07-2017 / 13:31:18 / cg"
!

score: aString at: stringIndex patternAt: patternIndex
    "answer the score contribution of the pattern character at patternIndex (2 or higher),
     which was matched at stringIndex in aString"

    | score prev |

    prev := (aString at: stringIndex - 1).

    score := (self isSeparator: prev)
        ifTrue: [ self separatorBonus ]
        ifFalse: [ (prev asLowercase = (lowercasePattern at: patternIndex - 1))
            ifTrue: [
                self adjacencyBonus +
                ((prev = (pattern at: patternIndex - 1)) ifTrue: [ self adjacentCaseEqualBonus ] ifFalse: [ 0 ])
            ]
            ifFalse: [ 0 ]
        ].

    (aString at: stringIndex) = (pattern at: patternIndex) ifTrue: [
        score := score + self caseEqualBonus.
    ].

    ^ score

    "Modified: / 13-07-2017 / 13:30:57 / cg"
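
    "
     Worked values (added for illustration, computed by hand from the constants
     in this class) for the second character of the pattern 'ab':

        against 'a_b' (b at 3, preceded by a separator):
            5 (separatorBonus) + 7 (caseEqualBonus)                 = 12
        against 'ab'  (b at 2, preceded by the previous pattern character):
            5 (adjacencyBonus) + 3 (adjacentCaseEqualBonus) + 7     = 15
        against 'Ab'  (as before, but the previous character differs in case):
            5 (adjacencyBonus) + 0 + 7                              = 12
    "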
! !

!FuzzyMatcher methodsFor:'scoring-bonus'!

adjacencyBonus
    "bonus if the character before the match equals the previous pattern character (ignoring case)"

    ^ 5
!

adjacencyIncrease
    "growth factor applied to the ramp in #indexScore for every further adjacent match"

    ^ 1.2
!

adjacentCaseEqualBonus
    "additional bonus if the character before the match also equals the previous pattern character in case"

    ^ 3
!

caseEqualBonus
    "bonus if a matched character also matches in case"

    ^ 7
!

firstLetterBonus
    "bonus if the first pattern character is matched at the first position of the string"

    ^ 12
!

separatorBonus
    "bonus if the matched character directly follows a separator (see #isSeparator:)"

    ^ 5
! !

!FuzzyMatcher methodsFor:'scoring-penalty'!

leadingLetterPenalty
    "penalty for each string character in front of the first matched character"

    ^ -3
!

maxLeadingLetterPenalty
    "cap for the accumulated leading-letter penalty"

    ^ -9
!

unmatchedLetterPenalty
    "penalty for each string character in excess of the pattern size"

    ^ -1
! !

!FuzzyMatcher class methodsFor:'documentation'!

version
    ^ '$Header$'
!

version_CVS
    ^ '$Header$'
! !