TSTree.st
author Claus Gittinger <cg@exept.de>
Sat, 02 May 2020 21:40:13 +0200
changeset 5476 7355a4b11cb6
parent 4734 5f69ef333e9d
permissions -rw-r--r--
#FEATURE by cg class: Socket class added: #newTCPclientToHost:port:domain:domainOrder:withTimeout: changed: #newTCPclientToHost:port:domain:withTimeout:
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     1
"{ Package: 'stx:libbasic2' }"
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     2
3668
cd42dff33054 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3253
diff changeset
     3
"{ NameSpace: Smalltalk }"
cd42dff33054 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3253
diff changeset
     4
4364
928db7fbeeb8 #REFACTORING by stefan
Stefan Vogel <sv@exept.de>
parents: 4121
diff changeset
     5
KeyedCollection subclass:#TSTree
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     6
	instanceVariableNames:'root'
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     7
	classVariableNames:''
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
     8
	poolDictionaries:''
2762
0532634862eb category changes
Claus Gittinger <cg@exept.de>
parents: 2468
diff changeset
     9
	category:'Collections-Ordered-Trees'
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    10
!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    11
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    12
!TSTree class methodsFor:'documentation'!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    13
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    14
documentation
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    15
"
4121
cf558f5c22f9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3842
diff changeset
    16
    BTree and TSTree
cf558f5c22f9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3842
diff changeset
    17
3668
cd42dff33054 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3253
diff changeset
    18
    A bunch of collection classes that are useful for building large indices of things. 
cd42dff33054 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3253
diff changeset
    19
    It's especially geared towards people using OODBs like GOODS, but can be used it in the image too: 
cd42dff33054 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3253
diff changeset
    20
    the BTree class is great for when you need to select numeric keys by range, 
cd42dff33054 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3253
diff changeset
    21
    and TSTree makes a solid basis for full-text search. 
cd42dff33054 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3253
diff changeset
    22
    TreeSet has an interesting optimized #intersection: that lets you compare two collections without 
cd42dff33054 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3253
diff changeset
    23
    looking at every item of either. 
cd42dff33054 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3253
diff changeset
    24
3253
5c005be2975e Added TSMultiTree, a variant of TSTree that allows for multiple values to be stored
Jan Vrany <jan.vrany@fit.cvut.cz>
parents: 3042
diff changeset
    25
    TSTree has similar properties as BTree for String  keys. 
5c005be2975e Added TSMultiTree, a variant of TSTree that allows for multiple values to be stored
Jan Vrany <jan.vrany@fit.cvut.cz>
parents: 3042
diff changeset
    26
    So as well as keeping them sorted, you can do efficient lookups of all 
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    27
    the keys with a given prefix.  One other neat trick TSTree can do is a certain 
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    28
    amount of fuzzy matching (eg find all keys with an edit distance of 3 from 
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    29
    'foo') which makes it especially useful for spell checking and similar 
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    30
    applications.
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    31
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    32
    [author:]
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    33
        Avi Bryant
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    34
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    35
    [license:]
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    36
        Dual licensed under both SqueakL and MIT. 
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    37
        This enables both base Squeak inclusion and 100% reuse.
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    38
"
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    39
! !
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    40
3812
2b4268814831 #REFACTORING by stefan
Stefan Vogel <sv@exept.de>
parents: 3668
diff changeset
    41
!TSTree class methodsFor:'queries'!
2b4268814831 #REFACTORING by stefan
Stefan Vogel <sv@exept.de>
parents: 3668
diff changeset
    42
2b4268814831 #REFACTORING by stefan
Stefan Vogel <sv@exept.de>
parents: 3668
diff changeset
    43
treeNodeClass
2b4268814831 #REFACTORING by stefan
Stefan Vogel <sv@exept.de>
parents: 3668
diff changeset
    44
    ^ TSTreeNode
2b4268814831 #REFACTORING by stefan
Stefan Vogel <sv@exept.de>
parents: 3668
diff changeset
    45
! !
2b4268814831 #REFACTORING by stefan
Stefan Vogel <sv@exept.de>
parents: 3668
diff changeset
    46
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    47
!TSTree methodsFor:'accessing'!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    48
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    49
at: aString ifAbsent: exceptionBlock
3842
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    50
        |ret|
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    51
        
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    52
        aString isEmpty ifTrue: [^ exceptionBlock value].
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    53
        root isNil ifTrue: [^ exceptionBlock value].
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    54
        
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    55
        ret := root lookupString: aString startingAt: 1.
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    56
        ret notNil ifTrue:[^ ret].
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    57
        ^ exceptionBlock value
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    58
!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    59
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    60
at: aString ifAbsentPut: exceptionBlock
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    61
	^ self at: aString ifAbsent: [self at: aString put: exceptionBlock value]
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    62
!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    63
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    64
at: aString put: anObject
3253
5c005be2975e Added TSMultiTree, a variant of TSTree that allows for multiple values to be stored
Jan Vrany <jan.vrany@fit.cvut.cz>
parents: 3042
diff changeset
    65
        aString isEmpty ifTrue: [self error: 'Keys cannot be empty'].
3812
2b4268814831 #REFACTORING by stefan
Stefan Vogel <sv@exept.de>
parents: 3668
diff changeset
    66
        root isNil ifTrue: [root := self class treeNodeClass key: aString first].
3253
5c005be2975e Added TSMultiTree, a variant of TSTree that allows for multiple values to be stored
Jan Vrany <jan.vrany@fit.cvut.cz>
parents: 3042
diff changeset
    67
        root lookupString: aString startingAt: 1 insert: anObject.
5c005be2975e Added TSMultiTree, a variant of TSTree that allows for multiple values to be stored
Jan Vrany <jan.vrany@fit.cvut.cz>
parents: 3042
diff changeset
    68
        ^ anObject
5c005be2975e Added TSMultiTree, a variant of TSTree that allows for multiple values to be stored
Jan Vrany <jan.vrany@fit.cvut.cz>
parents: 3042
diff changeset
    69
5c005be2975e Added TSMultiTree, a variant of TSTree that allows for multiple values to be stored
Jan Vrany <jan.vrany@fit.cvut.cz>
parents: 3042
diff changeset
    70
    "Modified: / 26-04-2014 / 11:23:02 / Jan Vrany <jan.vrany@fit.cvut.cz>"
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    71
!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    72
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    73
removeKey: aString
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    74
	^ self removeKey: aString ifAbsent: [self error: 'Could not find key ', aString printString]
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    75
!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    76
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    77
removeKey: aString ifAbsent: errorBlock
3842
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    78
        | val |
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    79
        val := root removeString:aString startingAt:1.
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    80
        root canBeCulled ifTrue: [root := nil].
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    81
        val notNil ifTrue:[^ val].
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
    82
        ^ errorBlock value
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    83
!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    84
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    85
values
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    86
	^ Array streamContents: [:s | self do: [:ea | s nextPut: ea]]
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    87
! !
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    88
3012
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
    89
!TSTree methodsFor:'enumerating'!
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
    90
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
    91
do: aBlock
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
    92
        root notNil ifTrue: [root do: aBlock]
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
    93
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
    94
    "Modified: / 08-08-2010 / 15:16:53 / cg"
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
    95
! !
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
    96
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
    97
!TSTree methodsFor:'searching'!
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    98
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
    99
matchesForPrefix: aString
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   100
	^ Array streamContents:
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   101
		[:s |
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   102
		self matchesForPrefix: aString do:
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   103
			[:match |
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   104
			s nextPut: match]]
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   105
!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   106
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   107
matchesForPrefix: aString do: aBlock
2468
36c571b96063 changed:
Claus Gittinger <cg@exept.de>
parents: 2282
diff changeset
   108
        aString isEmpty
36c571b96063 changed:
Claus Gittinger <cg@exept.de>
parents: 2282
diff changeset
   109
                ifTrue: [self do: aBlock]
36c571b96063 changed:
Claus Gittinger <cg@exept.de>
parents: 2282
diff changeset
   110
                ifFalse: [root notNil ifTrue: [root matchesForPrefix: aString startingAt: 1 do: aBlock]]
36c571b96063 changed:
Claus Gittinger <cg@exept.de>
parents: 2282
diff changeset
   111
36c571b96063 changed:
Claus Gittinger <cg@exept.de>
parents: 2282
diff changeset
   112
    "Modified: / 08-08-2010 / 15:17:01 / cg"
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   113
!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   114
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   115
matchesForString: aString distance: aNumber
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   116
	^ Array streamContents:
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   117
		[:s |
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   118
		self matchesForString: aString distance: aNumber do:
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   119
			[:match |
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   120
			s nextPut: match]]
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   121
!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   122
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   123
matchesForString: aString distance: aNumber do: aBlock
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   124
	self matchesForString: aString distance: aNumber limitNodes: nil do: aBlock
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   125
!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   126
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   127
matchesForString: aString distance: aNumber limitNodes: maxNodes do: aBlock
3842
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
   128
        | nodeCount |
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
   129
        nodeCount := 0.
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
   130
        aString isEmpty ifTrue: [^ self].
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
   131
        root isNil ifTrue: [^ self].
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
   132
        root matchesForString: aString startingAt: 1 distance: aNumber do: aBlock nodesDo:
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
   133
                [:ea |
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
   134
                nodeCount := nodeCount + 1.
11e10753a2a9 #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 3812
diff changeset
   135
                nodeCount = maxNodes ifTrue: [^ self]]
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   136
! !
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   137
3012
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
   138
!TSTree methodsFor:'testing'!
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   139
3012
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
   140
isFixedSize
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
   141
    "return true if the receiver cannot grow"
2468
36c571b96063 changed:
Claus Gittinger <cg@exept.de>
parents: 2282
diff changeset
   142
3012
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
   143
    ^ false
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   144
! !
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   145
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   146
!TSTree class methodsFor:'documentation'!
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   147
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   148
version_CVS
3668
cd42dff33054 #DOCUMENTATION
Claus Gittinger <cg@exept.de>
parents: 3253
diff changeset
   149
    ^ '$Header$'
2282
a723f74eaae4 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   150
! !
3012
bc770831871d added isFixedSize query
Claus Gittinger <cg@exept.de>
parents: 2762
diff changeset
   151