KeywordInContextIndexBuilder.st
author Claus Gittinger <cg@exept.de>
Thu, 13 Oct 2016 14:46:18 +0200
changeset 4128 4cc1535fa7dc
parent 4127 0f3c785bb689
child 4129 04b54f7b1a82
permissions -rw-r--r--
#DOCUMENTATION by cg class: KeywordInContextIndexBuilder comment/format in: #examples #excluded: #separatorAlgorithm: changed: #addLine:reference: #addLine:reference:ignoreCase: #entriesDo:
"
 COPYRIGHT (c) 2003 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
"{ Package: 'stx:libbasic2' }"

"{ NameSpace: Smalltalk }"

Object subclass:#KeywordInContextIndexBuilder
	instanceVariableNames:'keywordToLinesMapping excluded separatorAlgorithm'
	classVariableNames:''
	poolDictionaries:''
	category:'Collections-Support'
!

!KeywordInContextIndexBuilder class methodsFor:'documentation'!

copyright
"
 COPYRIGHT (c) 2003 by eXept Software AG
              All Rights Reserved

 This software is furnished under a license and may be used
 only in accordance with the terms of that license and with the
 inclusion of the above copyright notice.   This software may not
 be provided or otherwise made available to, or used by, any
 other person.  No title to or ownership of the software is
 hereby transferred.
"
!

documentation
"
    A support class for building KWIC (Keyword in Context) or KWOC (Keyword out of Context) indexes
    (for example, to build such indexes on HTML pages or class documentation).

    To generate a KWIC, add each line together with a reference (a page number, a method, or any other object),
    using addLine:reference:.
    When finished, enumerate the entries with entriesDo: and print them as KWIC or KWOC.
    To ignore fill words (such as 'and', 'the', 'in', etc.), define them with excluded:.
    (The forMethodComments instance creation method predefines such a list.)

    [author:]
        Claus Gittinger (cg@alan)

    [examples:]
        see examples method

    [see also:]
        https://en.wikipedia.org/wiki/Key_Word_in_Context (English)
        https://de.wikipedia.org/wiki/Permutiertes_Register (German)

"
!

examples
"
    building a kwic; print as kwic and kwoc
                                                                [exBegin]
    |kwic|

    kwic := KeywordInContextIndexBuilder new.
    kwic excluded:#('the' 'and' 'a' 'an' 'in').

    kwic addLine:'bla bla bla' reference:1.
    kwic addLine:'foo, bar. baz' reference:2.
    kwic addLine:'one two three' reference:3.
    kwic addLine:'a cat and a dog' reference:4.
    kwic addLine:'the man in the middle' reference:5.
    kwic addLine:'the man with the dog' reference:6.

    Transcript showCR:'Printed as KWIC:'.
    kwic
        entriesDo:[:word :left :right :ref |
            Transcript
                show:((left contractTo:20) leftPaddedTo:20);
                space;
                show:((word contractTo:10) leftPaddedTo:10) allBold;
                space;
                show:((right contractTo:20) leftPaddedTo:20);
                space;
                show:'['; show:ref; show:']';
                cr
        ].

    Transcript cr.
    Transcript showCR:'Printed as KWOC:'.
    kwic
        entriesDo:[:word :left :right :ref :fullText :context |
            Transcript
                show:((word contractTo:10) paddedTo:10) allBold;
                space;
                show:((context contractTo:60) paddedTo:60);
                space;
                show:'['; show:ref; show:']';
                cr
        ].
                                                                [exEnd]
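
  counting how often each keyword was indexed
  (a sketch only: it assumes that entriesDo: evaluates its block once per indexed
   keyword occurrence; the Bag is a helper for this example and not part of this class):
                                                                [exBegin]
    |kwic counts|

    kwic := KeywordInContextIndexBuilder new.
    kwic excluded:#('the' 'and' 'a' 'an' 'in' 'with').
    kwic addLine:'the man in the middle' reference:5.
    kwic addLine:'the man with the dog' reference:6.

    counts := Bag new.
    kwic entriesDo:[:word :left :right :ref | counts add:word].
    counts asSet asSortedCollection do:[:word |
        Transcript showCR:(word , ': ' , (counts occurrencesOf:word) printString)
    ].
                                                                [exEnd]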


  KWIC index over method selector components; build a little browser window:
                                                                [exBegin]
    |kwic v s c list refs|

    kwic := KeywordInContextIndexBuilder new.
    Smalltalk allClassesDo:[:eachClass |
        eachClass instAndClassSelectorsAndMethodsDo:[:sel :mthd |
            kwic addLine:sel reference:mthd.
        ]
    ].

    v := StandardSystemView new.
    v addComponent:(s := HVScrollableView for:SelectionInListView).
    s origin:0.0@0.0 corner:1.0@0.5.
    v addComponent:(c := HVScrollableView for:CodeView).
    c origin:0.0@0.5 corner:1.0@1.0.

    refs := OrderedCollection new.
    list := OrderedCollection new.
    kwic
        entriesDo:[:word :left :right :ref |
            list add:(word,' ',left,' ',word allBold,' ',right,' (',ref mclass name,')').
            refs add:ref].
    s list:list.
    s action:[:lNr | c contents:(refs at:lNr) source].
    v open.
                                                                [exEnd]

  KWIC index over method selector components, with word separation:
                                                                [exBegin]
    |kwic|

    kwic := KeywordInContextIndexBuilder forMethodSelectorIndex.

    Smalltalk allClassesDo:[:eachClass |
        eachClass instAndClassSelectorsAndMethodsDo:[:sel :mthd |
            kwic addLine:sel reference:mthd.
        ]
    ].
    kwic
                                                                [exEnd]

  KWIC index over method comments:
                                                                [exBegin]
    |kwic v s c refs list|

    kwic := KeywordInContextIndexBuilder forMethodComments.

    Smalltalk allClassesDo:[:eachClass |
        eachClass instAndClassSelectorsAndMethodsDo:[:sel :mthd |
            |comment|

            (sel == #documentation) ifTrue:[
                comment := mthd comment.
                comment notNil ifTrue:[
                    kwic addLine:comment reference:mthd mclass ignoreCase:true.
                ]
            ] ifFalse:[
                (sel ~~ #examples
                and:[ sel ~~ #copyright
                and:[ sel ~~ #version]]) ifTrue:[
                    comment := mthd comment.
                    comment notNil ifTrue:[
                        kwic addLine:comment reference:mthd ignoreCase:true.
                    ]
                ]
            ]
        ]
    ].
    kwic.
                                                                [exEnd]

  KWIC index over class comments:
                                                                [exBegin]
    |kwic|

    kwic := KeywordInContextIndexBuilder forMethodComments.

    Smalltalk allClassesDo:[:eachClass |
        |mthd comment|

        mthd := eachClass theMetaclass compiledMethodAt:#documentation.
        mthd notNil ifTrue:[
            comment := mthd comment.
            comment notNil ifTrue:[
                kwic addLine:comment reference:eachClass theNonMetaclass ignoreCase:true.
            ]
        ]
    ].
    kwic
                                                                [exEnd]
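
  custom word separation via separatorAlgorithm:
  (a sketch only: it assumes that the block receives one added line and answers the
   collection of its words, as the block installed by forMethodComments does;
   the separator characters and the sample lines are made up for illustration):
                                                                [exBegin]
    |kwic|

    kwic := KeywordInContextIndexBuilder new.
    kwic separatorAlgorithm:[:line | line asCollectionOfSubstringsSeparatedByAny:'-/ '].
    kwic excluded:#('of' 'the').
    kwic addLine:'state-of-the-art input/output' reference:1.
    kwic addLine:'input validation' reference:2.

    kwic entriesDo:[:word :left :right :ref |
        Transcript showCR:(word , ' [' , ref printString , ']')
    ].
                                                                [exEnd]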
"
! !

!KeywordInContextIndexBuilder class methodsFor:'instance creation'!

forMethodComments
    "return an indexer for method comments"

    |sepChars sep kwic|

    sepChars := '.,;:_ !![]()''"#?<>|' , Character return, Character lf, Character tab.

    sep := [:lines | lines asString asCollectionOfSubstringsSeparatedByAny:sepChars].

    kwic := self new.
    kwic separatorAlgorithm:sep.
    kwic excluded:#('the' 'and' 'a' 'an' 'for' 'with' 'no').
    ^ kwic
!

forMethodSelectorIndex
    "return an indexer for method selector components, with word separation at case boundaries"

    |sep kwic sepUCWords|

    sepUCWords := [:word :keyWords|
                    |s w c lastC last2C frag|

                    word asLowercase = word ifTrue:[
                        keyWords add:word.
                    ] ifFalse:[
                        s := word readStream.
                        w := '' writeStream.
                        [s atEnd] whileFalse:[
                            c := s next.
                            (c isUppercase) ifTrue:[
                                (lastC notNil and:[lastC isUppercase not]) ifTrue:[
                                    keyWords add:w contents.
                                    w := '' writeStream.
                                ].
                            ] ifFalse:[
                                (last2C notNil and:[last2C isUppercase and:[lastC isUppercase]]) ifTrue:[
                                    c isLetter ifTrue:[
                                        frag := w contents.
                                        w := '' writeStream.
                                        w nextPut:(frag last).
                                        keyWords add:(frag allButLast).
                                    ] ifFalse:[
                                       "/ frag := w contents.
                                       "/ w := '' writeStream.
                                       "/ keyWords add:frag.
                                    ].
                                ].
                            ].
                            w nextPut:c.
                            last2C := lastC.
                            lastC := c.
                        ].
                    ].
                  ].

    sep := [:line |
                |words keyWords|

                words := line asCollectionOfSubstringsSeparatedByAny:'.,;:_ '.
                keyWords := OrderedCollection new.
                words do:[:eachWord | sepUCWords value:eachWord value:keyWords].
parents:
diff changeset
   266
                keyWords
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   267
            ].
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   268
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   269
    kwic := self new.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   270
    kwic separatorAlgorithm:sep.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   271
    ^ kwic
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   272
!
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   273
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   274
new
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   275
    ^ self basicNew initialize
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   276
! !
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   277
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   278
!KeywordInContextIndexBuilder methodsFor:'accessing'!
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   279
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   280
excluded:aListOfExcludedWords
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   281
    "define words which are to be ignored.
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   282
     Typically, this is a list of filler words (stop words), such as 'and', 'the', 'in', etc."
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   283
     
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   284
    excluded := aListOfExcludedWords asSet.
1375
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   285
!
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   286
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   287
separatorAlgorithm:aBlock
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   288
    "define the algorithm to split a given string into words.
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   289
     The default is to split at punctuation and whitespace
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   290
     (see #initialize)"
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   291
     
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   292
    separatorAlgorithm := aBlock.
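
    "
     A minimal sketch of a custom separator block. Splitting at blanks
     only (so that hyphenated words stay together) is an assumption made
     for illustration, not the class default; the sample line and the
     reference are made up as well:

        |kwic|

        kwic := KeywordInContextIndexBuilder new.
        kwic separatorAlgorithm:[:line | line asCollectionOfSubstringsSeparatedByAny:' '].
        kwic addLine:'read-only file system' reference:1.
    "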
1375
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   293
! !
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   294
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   295
!KeywordInContextIndexBuilder methodsFor:'building'!
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   296
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   297
addLine:aLine reference:opaqueReference
3184
27271594c7d8 comments
Claus Gittinger <cg@exept.de>
parents: 2536
diff changeset
   298
    "add a text line; the line is split into words and entered into the kwic.
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   299
     The reference argument is stored as 'value' of the generated entries.
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   300
     It can be anything"
3184
27271594c7d8 comments
Claus Gittinger <cg@exept.de>
parents: 2536
diff changeset
   301
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   302
    self addLine:aLine reference:opaqueReference ignoreCase:true
1375
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   303
!
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   304
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   305
addLine:aLine reference:opaqueReference ignoreCase:ignoreCase
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   306
    "add a line to the kwic.
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   307
     The line is split up into words, and a reference to opaqueReference
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   308
     is added for each word.
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   309
     The reference argument is stored as 'value' of the generated entries.
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   310
     It can be anything"
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   311
     
1375
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   312
    (separatorAlgorithm value:aLine) do:[:eachWord |
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   313
        |set word|
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   314
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   315
        ignoreCase ifTrue:[
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   316
            word := eachWord asLowercase.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   317
        ] ifFalse:[
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   318
            word := eachWord.
1375
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   319
        ].
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   320
        (excluded includes:word) ifFalse:[
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   321
            set := keywordToLinesMapping at:word ifAbsentPut:[Set new].
1375
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   322
            set add:(aLine -> opaqueReference).
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   323
        ]
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   324
    ].
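
    "
     Usage sketch (the sample lines and the integer references are
     made up for illustration). Note that, when ignoreCase is true,
     the excluded set is checked against the lowercased word, so
     excluded words should be given in lowercase:

        |kwic|

        kwic := KeywordInContextIndexBuilder new.
        kwic excluded:#('the' 'and' 'in').
        kwic addLine:'The cat and the dog' reference:1.
        kwic addLine:'Dog in the fog' reference:2 ignoreCase:true.
        kwic entriesDo:[:word :left :right :ref |
            Transcript showCR:(word , ' -> ' , ref printString)
        ].
    "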
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   325
! !
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   326
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   327
!KeywordInContextIndexBuilder methodsFor:'enumerating'!
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   328
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   329
entriesDo:aFourToSixArgBlock
4126
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   330
    "evaluate the argument, for each entry.
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   331
     If it is a 4-arg block, it is called with:
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   332
        kwic-word, 
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   333
        left-text, 
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   334
        right-text
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   335
        and reference
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   336
     If it is a 5-arg block, the original text is passed as an additional argument.
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   337
     If it is a 6-arg block, the original text and the context are passed as additional arguments.
4126
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   338
     (stupid, but done for backward compatibility)"
3184
27271594c7d8 comments
Claus Gittinger <cg@exept.de>
parents: 2536
diff changeset
   339
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   340
    |fourArgBlock|
4126
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   341
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   342
    aFourToSixArgBlock numArgs == 4 ifTrue:[
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   343
        fourArgBlock := aFourToSixArgBlock 
4126
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   344
    ].    
1375
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   345
    keywordToLinesMapping keys asSortedCollection do:[:eachKey |
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   346
        |setOfMatches lcKey|
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   347
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   348
        setOfMatches := keywordToLinesMapping at:eachKey.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   349
        lcKey := eachKey asLowercase.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   350
        setOfMatches do:[:eachAssoc |
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   351
            |text ref lines idx lIdx context left right word prevLine nextLine|
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   352
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   353
            text := eachAssoc key.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   354
            ref := eachAssoc value.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   355
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   356
            lines := text asCollectionOfLines.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   357
            idx := lines findFirst:[:line | line asLowercase includesString:lcKey].
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   358
            idx ~~ 0 ifTrue:[
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   359
                context := lines at:idx.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   360
                idx > 1 ifTrue:[
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   361
                    prevLine := (lines at:idx-1).
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   362
                    context := prevLine , ' ' , context.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   363
                ].
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   364
                idx < lines size ifTrue:[
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   365
                    nextLine := (lines at:idx+1).
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   366
                    context :=  context , ' ' , nextLine.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   367
                ].
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   368
                lIdx := context asLowercase findString:lcKey.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   369
                left := (context copyTo:lIdx - 1) withoutSeparators.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   370
                right := (context copyFrom:lIdx + lcKey size) withoutSeparators.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   371
                word := (context copyFrom:lIdx to:lIdx + lcKey size - 1) withoutSeparators.
4126
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   372
                fourArgBlock notNil ifTrue:[
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   373
                    fourArgBlock value:word value:left value:right value:ref.
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   374
                ] ifFalse:[
4128
4cc1535fa7dc #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4127
diff changeset
   375
                    aFourToSixArgBlock value:word optionalArgument:left and:right and:ref and:text and:context
4126
4d3ec803fddf #DOCUMENTATION by cg
Claus Gittinger <cg@exept.de>
parents: 4125
diff changeset
   376
                ].    
1375
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   377
            ].
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   378
        ]
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   379
    ]
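
    "
     Sketch of the 4-arg and 6-arg block shapes described above
     (the sample text and the symbol used as reference are assumptions
     for illustration; output goes to the Transcript):

        |kwic|

        kwic := KeywordInContextIndexBuilder new.
        kwic addLine:'keyword in context' reference:#sample.

        kwic entriesDo:[:word :left :right :ref |
            Transcript showCR:(left , ' [' , word , '] ' , right)
        ].

        kwic entriesDo:[:word :left :right :ref :text :context |
            Transcript showCR:(word , ' from: ' , text)
        ].
    "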
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   380
! !
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   381
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   382
!KeywordInContextIndexBuilder methodsFor:'initialization'!
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   383
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   384
initialize
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   385
    keywordToLinesMapping := Dictionary new.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   386
    excluded := Set new.
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   387
    separatorAlgorithm := [:line | line asCollectionOfSubstringsSeparatedByAny:' .:,;-'].
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   388
! !
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   389
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   390
!KeywordInContextIndexBuilder class methodsFor:'documentation'!
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   391
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   392
version
4108
667d0bdaf609 #OTHER by cg
Claus Gittinger <cg@exept.de>
parents: 3184
diff changeset
   393
    ^ '$Header$'
2536
8907a20de2dc changed: #examples
Claus Gittinger <cg@exept.de>
parents: 1375
diff changeset
   394
!
8907a20de2dc changed: #examples
Claus Gittinger <cg@exept.de>
parents: 1375
diff changeset
   395
8907a20de2dc changed: #examples
Claus Gittinger <cg@exept.de>
parents: 1375
diff changeset
   396
version_CVS
4108
667d0bdaf609 #OTHER by cg
Claus Gittinger <cg@exept.de>
parents: 3184
diff changeset
   397
    ^ '$Header$'
1375
e034d3e027f2 initial checkin
Claus Gittinger <cg@exept.de>
parents:
diff changeset
   398
! !
3184
27271594c7d8 comments
Claus Gittinger <cg@exept.de>
parents: 2536
diff changeset
   399