KeywordInContextIndexBuilder.st
author Claus Gittinger <cg@exept.de>
Mon, 14 Feb 2011 18:39:30 +0100
changeset 2536 8907a20de2dc
parent 1375 e034d3e027f2
child 3184 27271594c7d8
permissions -rw-r--r--
changed: #examples
1375
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
"
 COPYRIGHT (c) 2003 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
"{ Package: 'stx:libbasic2' }"

Object subclass:#KeywordInContextIndexBuilder
	instanceVariableNames:'keywordToLinesMapping excluded separatorAlgorithm'
	classVariableNames:''
	poolDictionaries:''
	category:'Collections-Support'
!

!KeywordInContextIndexBuilder class methodsFor:'documentation'!

copyright
"
 COPYRIGHT (c) 2003 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
!

documentation
"
    A support class for building KWIC (Keyword in Context) indices
    (for example, to build a KWIC index on HTML pages or class documentation).

    [author:]
        Claus Gittinger (cg@alan)

    [instance variables:]
        keywordToLinesMapping   maps each keyword to the set of (line -> reference) associations added for it
        excluded                set of words which are not indexed
        separatorAlgorithm      one-argument block which splits a line into its words

    [class variables:]

    [see also:]

"
!

examples
"
                                                                [exBegin]
    |kwic|

    kwic := KeywordInContextIndexBuilder new.
    kwic excluded:#('the' 'and' 'a' 'an').

    kwic addLine:'bla bla bla' reference:1.
    kwic addLine:'one two three' reference:2.
    kwic addLine:'a cat and a dog' reference:3.
    kwic addLine:'the man in the middle' reference:4.
    kwic addLine:'the man with the dog' reference:5.

    kwic
        entriesDo:[:word :left :right :ref |
            Transcript
                show:((left contractTo:20) leftPaddedTo:20);
                space;
                show:((word contractTo:10) leftPaddedTo:10);
                space;
                show:((right contractTo:20) leftPaddedTo:20);
                space;
                show:'['; show:ref; show:']';
                cr
        ].
                                                                [exEnd]


  KWIC index over method selector components:
                                                                [exBegin]
    |kwic|

    kwic := KeywordInContextIndexBuilder new.
    Smalltalk allClassesDo:[:eachClass |
        eachClass instAndClassSelectorsAndMethodsDo:[:sel :mthd |
            kwic addLine:sel reference:mthd.
        ]
    ].
    kwic
                                                                [exEnd]

  KWIC index over method selector components, with word separation:
                                                                [exBegin]
    |kwic|

    kwic := KeywordInContextIndexBuilder forMethodSelectorIndex.

    Smalltalk allClassesDo:[:eachClass |
        eachClass instAndClassSelectorsAndMethodsDo:[:sel :mthd |
            kwic addLine:sel reference:mthd.
        ]
    ].
    kwic
                                                                [exEnd]
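
  Splitting a single selector at case boundaries (a minimal sketch; it assumes
  that entriesDo: passes the keyword as the first block argument, as in the
  first example above, and makes no claim about the order or number of entries;
  the sample selector and the reference 1 are illustrative only):
                                                                [exBegin]
    |kwic|

    kwic := KeywordInContextIndexBuilder forMethodSelectorIndex.
    kwic addLine:'addLine:reference:' reference:1.
    kwic
        entriesDo:[:word :left :right :ref |
            Transcript show:word; cr
        ].
                                                                [exEnd]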

  KWIC index over method comments:
                                                                [exBegin]
    |kwic|

    kwic := KeywordInContextIndexBuilder forMethodComments.

    Smalltalk allClassesDo:[:eachClass |
        eachClass instAndClassSelectorsAndMethodsDo:[:sel :mthd |
            |comment|

            (sel == #documentation) ifTrue:[
                comment := mthd comment.
                comment notNil ifTrue:[
                    kwic addLine:comment reference:mthd mclass ignoreCase:true.
                ]
            ] ifFalse:[
                (sel ~~ #examples
                and:[ sel ~~ #copyright
                and:[ sel ~~ #version]]) ifTrue:[
                    comment := mthd comment.
                    comment notNil ifTrue:[
                        kwic addLine:comment reference:mthd ignoreCase:true.
                    ]
                ]
            ]
        ]
    ].
    kwic
                                                                [exEnd]

  KWIC index over class comments:
                                                                [exBegin]
    |kwic|

    kwic := KeywordInContextIndexBuilder forMethodComments.

    Smalltalk allClassesDo:[:eachClass |
        |mthd comment|

        mthd := eachClass theMetaclass compiledMethodAt:#documentation.
        mthd notNil ifTrue:[
            comment := mthd comment.
            comment notNil ifTrue:[
                kwic addLine:comment reference:eachClass theNonMetaclass ignoreCase:true.
            ]
        ]
    ].
    kwic
                                                                [exEnd]
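
  KWIC index with a custom separator block (a minimal sketch; the comma-only
  splitting, the excluded word and the sample line are illustrative assumptions,
  not part of the class; no claim is made about the order of the entries):
                                                                [exBegin]
    |kwic|

    kwic := KeywordInContextIndexBuilder new.
    kwic separatorAlgorithm:[:line | line asCollectionOfSubstringsSeparatedByAny:','].
    kwic excluded:#('and').
    kwic addLine:'red,green,and,blue' reference:1.
    kwic
        entriesDo:[:word :left :right :ref |
            Transcript show:word; cr
        ].
                                                                [exEnd]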
"
! !

!KeywordInContextIndexBuilder class methodsFor:'instance creation'!

forMethodComments
    "return an indexer for method comments"

    |sepChars sep kwic|

    sepChars := '.,;:_ !![]()''"#?<>|' , Character return, Character lf, Character tab.

    sep := [:lines | lines asString asCollectionOfSubstringsSeparatedByAny:sepChars].

    kwic := self new.
    kwic separatorAlgorithm:sep.
    kwic excluded:#('the' 'and' 'a' 'an' 'for' 'with' 'no').
    ^ kwic
!

forMethodSelectorIndex
    "return an indexer for method selector components, with word separation at case boundaries"

    |sep kwic sepUCWords|

    sepUCWords := [:word :keyWords|
                    |s w c lastC last2C frag|

                    word asLowercase = word ifTrue:[
                        keyWords add:word.
                    ] ifFalse:[
                        s := word readStream.
                        w := '' writeStream.
                        [s atEnd] whileFalse:[
                            c := s next.
                            (c isUppercase) ifTrue:[
                                "lower-to-upper transition: a new word starts here"
                                (lastC notNil and:[lastC isUppercase not]) ifTrue:[
                                    keyWords add:w contents.
                                    w := '' writeStream.
                                ].
                            ] ifFalse:[
                                (last2C notNil and:[last2C isUppercase and:[lastC isUppercase]]) ifTrue:[
                                    c isLetter ifTrue:[
                                        "end of an uppercase run: its last character starts the next word"
                                        frag := w contents.
                                        w := '' writeStream.
                                        w nextPut:(frag last).
                                        keyWords add:(frag allButLast).
                                    ] ifFalse:[
                                        "disabled:
                                         frag := w contents.
                                         w := '' writeStream.
                                         keyWords add:frag."
                                    ].
                                ].
                            ].
                            w nextPut:c.
                            last2C := lastC.
                            lastC := c.
                        ].
                        "add the fragment which remains after the last boundary"
                        keyWords add:w contents.
                    ].
                  ].

    sep := [:line |
                |words keyWords|

                words := line asCollectionOfSubstringsSeparatedByAny:'.,;:_ '.
                keyWords := OrderedCollection new.
                words do:[:eachWord | sepUCWords value:eachWord value:keyWords].
                keyWords
            ].

    kwic := self new.
    kwic separatorAlgorithm:sep.
    ^ kwic
!

new
    "return a new, initialized index builder"

    ^ self basicNew initialize
! !

!KeywordInContextIndexBuilder methodsFor:'accessing'!

excluded:something
    "define the words which are not to be indexed; the collection is converted to a set"

    excluded := something asSet.
!

separatorAlgorithm:something
    "define the one-argument block which splits a line into a collection of words"

    separatorAlgorithm := something.
! !
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   247
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   248
!KeywordInContextIndexBuilder methodsFor:'building'!
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   249
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   250
addLine:aLine reference:opaqueReference
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   251
    self addLine:aLine reference:opaqueReference ignoreCase:false
!

addLine:aLine reference:opaqueReference ignoreCase:ignoreCase
    "split aLine into words and remember (aLine -> opaqueReference) under each
     word which is not excluded; with ignoreCase, words are indexed in lowercase"

    (separatorAlgorithm value:aLine) do:[:eachWord |
        |set word|

        ignoreCase ifTrue:[
            word := eachWord asLowercase.
        ] ifFalse:[
            word := eachWord.
        ].
        (excluded includes:word) ifFalse:[
            set := keywordToLinesMapping at:word ifAbsent:nil.
            set isNil ifTrue:[
                set := Set new.
                keywordToLinesMapping at:word put:set
            ].
            set add:(aLine -> opaqueReference).
        ]
    ].
! !

!KeywordInContextIndexBuilder methodsFor:'enumerating'!

entriesDo:aBlock
    keywordToLinesMapping keys asSortedCollection do:[:eachKey |
        |setOfMatches lcKey|

        setOfMatches := keywordToLinesMapping at:eachKey.
        lcKey := eachKey asLowercase.
        setOfMatches do:[:eachAssoc |
            |text ref lines idx lIdx context left right word prevLine nextLine|

            text := eachAssoc key.
            ref := eachAssoc value.

            lines := text asCollectionOfLines.
            idx := lines findFirst:[:line | line asLowercase includesString:lcKey].
            idx ~~ 0 ifTrue:[
                context := lines at:idx.
                idx > 1 ifTrue:[
                    prevLine := (lines at:idx-1).
                    context := prevLine , ' ' , context.
                ].
                idx < lines size ifTrue:[
                    nextLine := (lines at:idx+1).
                    context := context , ' ' , nextLine.
                ].
                lIdx := context asLowercase findString:lcKey.
                left := (context copyTo:lIdx - 1) withoutSeparators.
                right := (context copyFrom:lIdx + lcKey size) withoutSeparators.
                word := (context copyFrom:lIdx to:lIdx + lcKey size - 1) withoutSeparators.
                aBlock value:word value:left value:right value:ref.
            ].
        ]
    ]
! !

!KeywordInContextIndexBuilder methodsFor:'initialization'!

initialize
    keywordToLinesMapping := Dictionary new.
    excluded := Set new.
    separatorAlgorithm := [:line | line asCollectionOfSubstringsSeparatedByAny:' .:,;-'].
! !

!KeywordInContextIndexBuilder class methodsFor:'documentation'!

version
    ^ '$Header: /cvs/stx/stx/libbasic2/KeywordInContextIndexBuilder.st,v 1.2 2011-02-14 17:39:30 cg Exp $'
!

version_CVS
    ^ '$Header: /cvs/stx/stx/libbasic2/KeywordInContextIndexBuilder.st,v 1.2 2011-02-14 17:39:30 cg Exp $'
! !