FuzzyMatcher.st
author Claus Gittinger <cg@exept.de>
Thu, 13 Jul 2017 20:48:51 +0200
changeset 4469 b6f641dce0ac
parent 4468 4da827f70608
child 4470 5825ccc0dabf
permissions -rw-r--r--
#FEATURE by cg class: FuzzyMatcher class removed: #examples
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
4459
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     1
"{ Package: 'stx:libbasic2' }"
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     2
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     3
"{ NameSpace: Smalltalk }"
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     4
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     5
Object subclass:#FuzzyMatcher
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     6
	instanceVariableNames:'pattern lowercasePattern indexes'
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     7
	classVariableNames:''
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     8
	poolDictionaries:''
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     9
	category:'Collections-Text-Support'
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    10
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    11
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    12
!FuzzyMatcher class methodsFor:'documentation'!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    13
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    14
documentation
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    15
"
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    16
    FuzzyMatcher is an approximate string matching algroithm that can determine if a string includes a given pattern.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    17
    For example, the string 'axby' matches both the pattern 'ab' and, 'ay', but not 'ba'. 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    18
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    19
    The algorithm is based on lib_fts[1], and includes an optional scoring algorithm 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    20
    that can be used to sort all the matches based on their similarity to the pattern.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    21
    It is used in the sublime text editor.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    22
    
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    23
    [see also:]
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    24
        https://blog.forrestthewoods.com/reverse-engineering-sublime-text-s-fuzzy-match-4cffeed33fdb
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    25
        https://github.com/forrestthewoods/lib_fts
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    26
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    27
"
4468
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    28
!
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    29
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    30
example
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    31
"
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    32
    |top lv list field patternHolder names|
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    33
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    34
    patternHolder := '' asValue.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    35
    list := List new.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    36
    
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    37
    top := StandardSystemView new.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    38
    lv := ListView origin:(0.0@30) corner:(1.0@1.0) in:top.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    39
    lv model:list.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    40
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    41
    field := EditField origin:(0.0@0.0) corner:(1.0@30) in:top.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    42
    field model:patternHolder.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    43
    field immediateAccept:true.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    44
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    45
    names := Smalltalk allClasses collect:#name.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    46
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    47
    patternHolder 
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    48
        onChangeEvaluate:[
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    49
            |matcher pattern matches|
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    50
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    51
            pattern := patternHolder value.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    52
            pattern notEmpty ifTrue:[
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    53
                matcher := FuzzyMatcher pattern:pattern.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    54
                
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    55
                matches := OrderedCollection new.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    56
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    57
                names do:[:eachClassName | 
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    58
                    matcher 
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    59
                        match:eachClassName
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    60
                        ifScored: [ :score | matches add: eachClassName -> score ] 
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    61
                ].
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    62
                matches sortBySelector:#value.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    63
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    64
                list removeAll.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    65
                list addAllReversed:(matches 
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    66
                                collect:[:nameScoreAssoc | 
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    67
                                    '[%1] %2' bindWith:nameScoreAssoc value with:nameScoreAssoc key])
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    68
            ].    
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    69
        ].    
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    70
    top open.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    71
    patternHolder value:'mph'.
4da827f70608 #FEATURE by cg
Claus Gittinger <cg@exept.de>
parents: 4459
diff changeset
    72
"
4459
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    73
! !
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    74
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    75
!FuzzyMatcher class methodsFor:'instance creation'!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    76
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    77
new
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    78
    "return an initialized instance"
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    79
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    80
    ^ self basicNew initialize.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    81
! !
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    82
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    83
!FuzzyMatcher class methodsFor:'construction'!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    84
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    85
pattern: aString
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    86
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    87
	^self new pattern: aString
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    88
! !
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    89
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    90
!FuzzyMatcher class methodsFor:'utilities api'!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    91
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    92
allMatching: aPattern in: aCollection
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    93
        "Assumes that the collection is a collection of Strings"
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    94
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    95
        | matcher |
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    96
        
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    97
        matcher := self pattern: aPattern.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    98
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    99
        ^ aCollection select: [ :each | matcher matches: each ]
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   100
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   101
        "
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   102
         self 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   103
            allMatching:'clu' 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   104
            in:(Smalltalk allClasses collect:#name)
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   105
        "
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   106
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   107
    "Modified (comment): / 13-07-2017 / 13:29:11 / cg"
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   108
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   109
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   110
allMatching: aPattern in: aCollection by: aBlockReturningString
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   111
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   112
	| matcher |
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   113
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   114
	matcher := self pattern: aPattern.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   115
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   116
	^ aCollection select: [ :each | matcher matches: (aBlockReturningString value: each) ]
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   117
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   118
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   119
allSortedByScoreMatching: aPattern in: aCollection
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   120
	"Assumes that the collection is a collection of Strings"
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   121
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   122
	^ self allSortedByScoreMatching: aPattern in: aCollection by: [ :each | each ]
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   123
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   124
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   125
allSortedByScoreMatching: aPattern in: aCollection by: aBlockReturningString
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   126
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   127
	| matcher matches |
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   128
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   129
	aPattern isEmpty ifTrue: [ ^ aCollection asArray ].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   130
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   131
	matcher := self pattern: aPattern.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   132
	matches := OrderedCollection new: aCollection size // 2.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   133
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   134
	aCollection do: [ :each | 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   135
		matcher 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   136
			match: (aBlockReturningString value: each) 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   137
			ifScored: [ :score | matches add: score -> each ] 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   138
	].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   139
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   140
	matches sort: [ :a :b | a key >= b key ].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   141
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   142
	^ matches collect: [ :each | each value ] as: Array
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   143
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   144
! !
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   145
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   146
!FuzzyMatcher methodsFor:'accessing'!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   147
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   148
pattern
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   149
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   150
	^ pattern 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   151
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   152
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   153
pattern: aString
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   154
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   155
	pattern := aString.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   156
	lowercasePattern := pattern asLowercase.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   157
	indexes := Array new: pattern size.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   158
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   159
! !
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   160
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   161
!FuzzyMatcher methodsFor:'comparing'!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   162
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   163
match: aString ifScored: aBlock
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   164
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   165
	| score |
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   166
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   167
	score := 0.	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   168
		
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   169
	pattern ifEmpty: [ aBlock value: score. ^ self ].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   170
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   171
	(self matches: aString) ifFalse: [ ^ self ].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   172
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   173
	score := self firstScore: aString at: indexes first.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   174
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   175
	2 to: pattern size do: [ :pix | 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   176
		score := score + (self score: aString at: (indexes at: pix) patternAt: pix)
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   177
	].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   178
		
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   179
	score := score + self indexScore + ((aString size - pattern size) * self unmatchedLetterPenalty).
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   180
		
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   181
	aBlock value: score.	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   182
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   183
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   184
matches: aString
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   185
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   186
	| idx |
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   187
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   188
	pattern size > aString size ifTrue: [ ^ false ].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   189
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   190
	idx := 0.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   191
	pattern withIndexDo: [ :each :i |
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   192
		idx := aString 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   193
			findString: each asString 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   194
			startingAt: idx + 1 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   195
			caseSensitive: false. 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   196
		
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   197
		idx == 0 ifTrue: [ ^ false ].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   198
		indexes at: i put: idx.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   199
	].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   200
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   201
	^ true
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   202
! !
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   203
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   204
!FuzzyMatcher methodsFor:'initialization'!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   205
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   206
initialize
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   207
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   208
	super initialize.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   209
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   210
	pattern := lowercasePattern := ''.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   211
	indexes := #().
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   212
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   213
! !
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   214
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   215
!FuzzyMatcher methodsFor:'private'!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   216
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   217
firstScore: aString at: anIndex
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   218
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   219
	| score |
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   220
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   221
	score := (aString at: anIndex) = pattern first 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   222
		ifTrue: [ self caseEqualBonus ]
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   223
		ifFalse: [ 0 ].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   224
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   225
	anIndex = 1 	ifTrue: [ ^ score + self firstLetterBonus ].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   226
		
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   227
	score := score + (((anIndex - 1) * self leadingLetterPenalty) max: self maxLeadingLetterPenalty).
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   228
				
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   229
	^ score 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   230
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   231
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   232
indexScore 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   233
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   234
	| sum ramp |
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   235
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   236
	ramp := 1.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   237
	sum := 0.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   238
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   239
	1 to: indexes size - 1 do: [ :ix |
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   240
		ramp := (indexes at: ix) + 1 = (indexes at: ix + 1) 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   241
			ifTrue: [ ramp + (ramp * self adjacencyIncrease) ]
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   242
			ifFalse: [ 1 ].		
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   243
		
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   244
		sum := sum + ramp - 1
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   245
	].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   246
	
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   247
	^ sum rounded
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   248
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   249
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   250
isSeparator: aCharacter
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   251
        
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   252
        ^  aCharacter = $_ or: [ aCharacter = $: ]
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   253
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   254
    "Created: / 13-07-2017 / 13:30:34 / cg"
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   255
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   256
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   257
isSeperator: aCharacter
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   258
        <resource: #obsolete>
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   259
        ^ self isSeparator: aCharacter
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   260
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   261
    "Modified: / 13-07-2017 / 13:31:18 / cg"
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   262
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   263
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   264
score: aString at: stringIndex patternAt: patternIndex
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   265
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   266
        | score prev |
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   267
        
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   268
        prev := (aString at: stringIndex - 1).
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   269
        
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   270
        score := (self isSeparator: prev) 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   271
                ifTrue: [ self separatorBonus ]
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   272
                ifFalse: [ (prev asLowercase = (lowercasePattern at: patternIndex - 1))
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   273
                        ifTrue: [ 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   274
                                self adjacencyBonus + 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   275
                                ((prev = (pattern at: patternIndex - 1)) ifTrue: [ self adjacentCaseEqualBonus ] ifFalse: [ 0 ]) 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   276
                        ] 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   277
                        ifFalse: [ 0 ] 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   278
                ].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   279
        
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   280
        (aString at: stringIndex) = (pattern at: patternIndex) ifTrue: [ 
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   281
                score := score + self caseEqualBonus.
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   282
        ].
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   283
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   284
        ^ score
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   285
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   286
    "Modified: / 13-07-2017 / 13:30:57 / cg"
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   287
! !
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   288
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   289
!FuzzyMatcher methodsFor:'scoring-bonus'!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   290
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   291
adjacencyBonus
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   292
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   293
	^ 5
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   294
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   295
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   296
adjacencyIncrease
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   297
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   298
	^ 1.2
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   299
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   300
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   301
adjacentCaseEqualBonus
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   302
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   303
	^ 3
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   304
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   305
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   306
caseEqualBonus
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   307
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   308
	^ 7
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   309
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   310
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   311
firstLetterBonus
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   312
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   313
	^ 12
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   314
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   315
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   316
separatorBonus
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   317
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   318
	^ 5
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   319
! !
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   320
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   321
!FuzzyMatcher methodsFor:'scoring-penalty'!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   322
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   323
leadingLetterPenalty
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   324
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   325
	^ -3
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   326
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   327
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   328
maxLeadingLetterPenalty
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   329
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   330
	^ -9
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   331
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   332
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   333
unmatchedLetterPenalty
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   334
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   335
	^ -1
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   336
! !
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   337
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   338
!FuzzyMatcher class methodsFor:'documentation'!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   339
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   340
version
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   341
    ^ '$Header$'
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   342
!
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   343
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   344
version_CVS
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   345
    ^ '$Header$'
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   346
! !
cfbde361fe34 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   347