Working with Processes

Introduction
Creating & Starting New Processes
Shared Access to Objects, Locking & Synchronization
Process Priorities & Scheduling
Round-Robin Scheduling (Timeslicing)
Dynamic Process Priorities
Process Stacks
Processvariables - Thread Local Storage
Views and Processes
Views and Priorities
Background Processes
Suggested Priorities & Hints
Blocking Interrupts
Interrupting a Process
Timeouts
Terminating a Process
Interrupting a Runaway Process
Processes and SnapShotImage Restart
How the Scheduler Gets Control

Introduction

Smalltalk/X provides facilities for lightweight processes (also called threads). These are implemented within Smalltalk itself; i.e., the implementation does not depend on any particular operating system support.
This has some advantages:

It runs on every system.
All process management and scheduling is fully under your control, which means that the scheduling algorithms can be changed for special needs.
You can look at the implementation and learn more about process handling and scheduling.

If the system is running on UNIX and you are already familiar with UNIX processes, you should keep in mind that:

Lightweight processes should not be confused with Unix's processes. Unix processes have separate address-spaces, meaning that once created, such a process can no longer access and/or modify objects from the parent process. With Unix processes, communication must be done via explicit interprocess mechanisms, such as shared memory, sockets, pipes or files. Also, once forked, files or resources opened by a Unix subprocess are not automatically visible and/or accessible from the parent process. (This is also true for other child processes).
In contrast, Smalltalk processes all run in one address space, therefore communication is possible via objects. They share open files, windows and other operating system resources.

On some systems which support native threads, Smalltalk processes are mapped to native threads. However, ST/X currently ensures that only one of them executes at any time.
This limitation avoids expensive synchronization in the memory management code and spares single-CPU systems the performance penalty of synchronization code which is not needed there.
This may change when multi-core systems become more common than single-CPU systems - however, time will show how much overhead this introduces.
Notice that, using inline C-code, it is even today possible to execute some piece of C-code concurrently in another thread (which may make sense for heavy mathematical, 3D computational or asynchronous communication code). However, no object memory activity may take place in this concurrent code.

Creating & Starting new Processes

Processes are created by sending the message #newProcess to a block. This returns a new process object, which is NOT yet scheduled for execution (i.e. it does not start execution). To schedule the new process for execution, send it a #resume message.

The separation of creation from starting allows the new process to be manipulated (if required) before it has a chance to run. For example, its stack size limit, priority or name can be changed. If that is not required (which is the case with most processes), a combined message named #fork can be used, which creates the process and schedules it for execution.

Each process has a process priority associated with it, which controls execution in case more than one process is ready to run at the same time. By default, the new process has the same priority as its creating process (the parent process). To fork a process with a different priority, use the #forkAt: message, which expects the new process's priority as argument. Read the section below on priorities and scheduling behavior.

Examples (You may want to start a ProcessMonitor to see the processes in the system, before executing the examples below):

Forking a process:

    |newProcess|

    newProcess := [
        Delay waitForSeconds:3.
        Transcript showCR:'hello there.'
    ] fork.

Forking a process at low priority:

    |newProcess|

    newProcess := [
        Delay waitForSeconds:3.
        Transcript showCR:'hello there.'
    ] forkAt:1.

Creating, setting its name and resuming a process:

    |newProcess|

    newProcess := [
        Delay waitForSeconds:3.
        Transcript showCR:'hello there.'
    ] newProcess.
    newProcess name:'my process'.
    newProcess resume

The #name: message in the last example sets a process's name - this has no semantic meaning, but helps in finding your processes in the ProcessMonitor.

Shared Access to Objects, Locking & Synchronization

When multiple processes are executing, special attention has to be paid when objects are accessed by more than one of them.
As usual, some danger is involved in modifying shared state; this is especially true since, in general, no interlocks have been built into the normal classes.
For example, concurrent adds and removes on an instance of OrderedCollection can lead to invalid contents (since its internal information may become corrupted).

For the curious: Instances of OrderedCollection hold some internal indices to keep track of the first and last elements inside their container array. If a process switch occurs during an update of these values, the stored values could become invalid.

However, to support multiple process synchronization, there are some classes available, which do handle concurrent access. Of special interest are:

SharedQueue - provides a safe implementation of a queue
SharedCollection - provides a safety wrapper for any other collection
Semaphore - for synchronization and mutual exclusion
Delay - for timing
RecursionLock - mutual exclusion between processes
Monitor - for non-block critical regions

Semaphores: The Low Level Mechanism

To fix the above problem, some synchronization method is required. The low level mechanism for this is the Semaphore - especially the semaphore for mutually exclusive access. A semaphore for mutual exclusion can be acquired and held while some piece of code - the so-called "critical region" - is executed. Only one process can hold a given exclusive semaphore at any time. If a process wants to acquire a semaphore which is held by another process at that time, the acquiring process is put to sleep until the owning process releases the semaphore.

The following code-fragment creates such a semaphore, and wraps the read/write access code into so called critical regions:

Setup:

    |sema sharedCollection|

    ...
    sema := Semaphore forMutualExclusion.
    sharedCollection := OrderedCollection new.
    ...

Writer:

    ...
    sema critical:[
        sharedCollection addLast:something
    ].
    ...

Reader:

    ...
    sema critical:[
        something := sharedCollection removeFirst
    ].
    ...

The "Semaphore forMutualExclusion" expression creates a semaphore which provides safe execution of critical regions.
Critical regions are executed by the #critical: message, which executes the argument block while ensuring that only one process is ever within a region controlled by the semaphore.
If any process tries to enter a critical region which has already been entered, it is suspended until the region is left by the other process.
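
Besides mutual exclusion, a semaphore can also be used for plain signalling between two processes. The following is only a minimal sketch, assuming the usual Semaphore protocol (#new, #wait and #signal); the variable names are illustrative. Note that the evaluating process (for example, a workspace) is blocked in "done wait" until the worker has signalled:

    |done worker|

    done := Semaphore new.              "no signal pending yet"

    worker := [
        Delay waitForSeconds:1.
        Transcript showCR:'worker finished.'.
        done signal                     "wake up the process waiting on done"
    ] fork.

    done wait.                          "suspends until the worker signals"
    Transcript showCR:'continuing after the worker.'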

RecursionLock: Recursive Entrance into Critical Regions

Often, a process may enter another critical region while already being within a critical region. If the new region is controlled by the same semaphore as the previously entered one, this results in a deadlock. Such a situation arises quite often when a class reuses code by sending self messages to other methods which themselves protect the state with the semaphore. For example,

    doSomeComplexOperation
	accessLock critical:[
	    ...
	    update some state
	    ...
	    self someOtherOperation
	    ...
	    update more state
	    ...
	].

and:

    someOtherOperation
	accessLock critical:[
	    ...
	    update some state
	    ...
	].

With a "normal" semaphore, execution of doSomeComplexOperation leads to a deadlock when the process tries to acquire the semaphore which it already holds.
To prevent this, a special semaphore called RecursionLock should be used; this behaves like a regular semaphore, except that it allows the owning process (but only the owner) to reenter a critical region. To use a recursion lock, change the above code from Semaphore forMutualExclusion into RecursionLock new or RecursionLock forMutualExclusion.

If a process terminates while being within a critical region, that semaphore/recursionLock is automatically released - there is no need for the programmer to care for this (the #critical: method ensures this).
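
The reentrant behavior can be tried in a workspace with a sketch like the following, which uses only RecursionLock new and #critical: as described above. The same nesting with a lock created by Semaphore forMutualExclusion would block in the inner #critical:, so do not try that variant in a process you cannot interrupt:

    |lock|

    lock := RecursionLock new.

    lock critical:[
        Transcript showCR:'outer region entered'.
        lock critical:[
            "the owning process may enter again without blocking"
            Transcript showCR:'inner region entered'
        ].
        Transcript showCR:'back in outer region'
    ]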

Monitors: Non Block-nested Critical Regions

The above #critical: message expects a block to be passed; if your critical region cannot be placed into a block, but is to be left in a different method/class from where it was entered, you can alternatively use a Monitor object.
Monitors provide separate #enter and #exit methods - so the critical region can be left somewhere else than where it was entered.
However, be very careful to always correctly leave those monitors, since they do not provide any automatic cleanup in case of non-normal termination.

Using a monitor, the above example is written as:

    |monitor|

    ...
    monitor := Monitor new.
    ...
    monitor enter.
    ...
    critical code, protected by monitor
    ...
    monitor exit.
    ...
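
The case a monitor is meant for, where the region is entered in one method and left in another, might look like the following sketch. The selectors beginTransaction and endTransaction, the instance variable accessMonitor (holding a Monitor) and the receiver anObject are hypothetical names chosen for illustration:

    beginTransaction
        "enter the critical region; it stays entered after this method returns"
        accessMonitor enter

    endTransaction
        "leave the region which was entered by beginTransaction"
        accessMonitor exit

The caller is then responsible for pairing the two sends. One possible way to make sure the region is left even on a non-normal return is an unwind block, assuming that Block » ensure: is available in your release:

    anObject beginTransaction.
    [
        anObject doSomething            "hypothetical work inside the region"
    ] ensure:[
        anObject endTransaction
    ]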

Synchronizing Dataflow between Concurrent Processes: SharedQueue

Simple reader/writer/filter applications are best implemented using SharedQueues (which are basically implemented much like the above code).
Thus, using a shared queue, the above becomes:

    |queue|

    ...
    queue := SharedQueue new.
    ...

writer:

    ...
    queue nextPut:something.
    ...

reader:

    ...
    something := queue next
    ...
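
Put together, a complete workspace example could look like the sketch below. It uses only the #nextPut: and #next messages shown above; the item count and the delay are arbitrary. The reader simply blocks in #next whenever the queue is empty, so no further synchronization is written out:

    |queue writer reader|

    queue := SharedQueue new.

    writer := [
        1 to:5 do:[:i |
            queue nextPut:i.
            Delay waitForSeconds:0.2
        ]
    ] fork.

    reader := [
        5 timesRepeat:[
            "waits here until the writer has put the next item"
            Transcript showCR:'got ' , queue next printString
        ]
    ] fork.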

Synchronized: Syntactic Sugar for Critical Regions

The Java language provides a special language element for synchronization: the "synchronized" keyword.
In ST/X, a corresponding method is found in the Object class, which allows for a critical region semaphore to be automatically associated with any object.
This can be considered syntactic sugar, as it does not add any new functionality.

Using the synchronized: message, the above shared access code becomes:

writer:

    ...
    sharedCollection synchronized:[
        sharedCollection addLast:something
    ].
    ...

reader:

    ...
    sharedCollection synchronized:[
        something := sharedCollection removeFirst
    ].
    ...

As any object understands the synchronized: message, regions can be made critical w.r.t. the instance (as in the above example), or the class (by coding: "Collection synchronized:"), which corresponds to the synchronized-class attribute of Java.
(Even method-level synchronization is possible, by using the method (class compiledMethodAt:#selector) as receiver. However, method-level locks are in most cases not useful, as they do not provide enough protection. Also, they are dangerous, as they are ineffective if more critical methods are added and/or a subclass adds methods.)

Why are not all Smalltalk container classes always protected by critical regions?

The Smalltalk classes could be rewritten to add interlocks at every possible access in those containers (actually, many other classes, such as the complete View hierarchy, would have to be rewritten too). This has not been done, mainly for performance reasons. The typical case is that most objects are not used concurrently - thus the overhead involved in locking would hurt the normal case and only simplify some special cases.

Examples

Two processes, NOT synchronized (results in corrupted output):

    |p1 p2|

    p1 := [
	    10 timesRepeat:[
		Transcript show:'here'.
		Delay waitForSeconds:0.1.
		Transcript show:' is'.
		Delay waitForSeconds:0.1.
		Transcript showCR:' process1'.
	    ]
	  ] fork.

    p2 := [
	    10 timesRepeat:[
		Transcript show:'here'.
		Delay waitForSeconds:0.1.
		Transcript show:' is'.
		Delay waitForSeconds:0.1.
		Transcript showCR:' process2'.
	    ]
	  ] fork.

Synchronized by a critical region:

    |p1 p2 sema|

    sema := Semaphore forMutualExclusion.

    p1 := [
            10 timesRepeat:[
                sema critical:[
                    Transcript show:'here'.
                    Delay waitForSeconds:0.1.
                    Transcript show:' is'.
                    Transcript showCR:' process1'.
                ].
                Delay waitForSeconds:0.1.
            ]
          ] fork.

    p2 := [
            10 timesRepeat:[
                sema critical:[
                    Transcript show:'here'.
                    Delay waitForSeconds:0.1.
                    Transcript show:' is'.
                    Transcript showCR:' process2'.
                ].
                Delay waitForSeconds:0.1.
            ]
          ] fork.

Synchronized using the synchronized: (syntactic sugar) method:

    |p1 p2|

    p1 := [
            10 timesRepeat:[
                Transcript synchronized:[
                    Transcript show:'here'.
                    Delay waitForSeconds:0.1.
                    Transcript show:' is'.
                    Transcript showCR:' process1'.
                ].
                Delay waitForSeconds:0.1.
            ]
          ] fork.

    p2 := [
            10 timesRepeat:[
                Transcript synchronized:[
                    Transcript show:'here'.
                    Delay waitForSeconds:0.1.
                    Transcript show:' is'.
                    Transcript showCR:' process2'.
                ].
                Delay waitForSeconds:0.1.
            ]
          ] fork.

Synchronized by a Monitor:

    |p1 p2 mon|

    mon := Monitor new.

    p1 := [
            10 timesRepeat:[
                mon enter.
                Transcript show:'here'.
                Delay waitForSeconds:0.1.
                Transcript show:' is'.
                Transcript showCR:' process1'.
                mon exit.
                Delay waitForSeconds:0.1.
            ]
          ] fork.

    p2 := [
            10 timesRepeat:[
                mon enter.
                Transcript show:'here'.
                Delay waitForSeconds:0.1.
                Transcript show:' is'.
                Transcript showCR:' process2'.
                mon exit.
                Delay waitForSeconds:0.1.
            ]
          ] fork.

Process Priorities & Scheduling

Each process has associated with it a priority (usually some small number between 1 and 30). Whenever more than one process is runnable, the scheduler selects the highest priority process for execution and passes control to it (this process is called the "active process").
The active process continues execution until either:

it terminates
termination can be explicit, by sending #terminate to the process, or implicit, when the process's code block (the block which was the receiver of #newProcess or #fork) returns (i.e. falls through the last statement).
it suspends itself
this can be done by an explicit #suspend, or indirectly by waiting for some semaphore to be signalled or by going into a sleep for some time (see Delay class). The process remains suspended, until it receives a #resume message.
If it waits for a timer or a semaphore, the expiring timer or the triggering semaphore will send this #resume message to the waiting process(es). If it waits for I/O, the runtime system's I/O handling code will trigger the semaphore signalling.
it gives up control temporarily
if the active process receives a #yield message, it gives up control and passes it to the next runnable process with the same priority, if there is any.
If there is no other runnable process with this priority, the #yield message does nothing.
A #yield message can be sent by a process itself, or by another (higher priority) process.
a higher priority process becomes runnable
in this case, the scheduler suspends the active process and transfers control to the higher priority process.
A higher priority process could also become runnable by some external event (timer or I/O arrival), which triggers the semaphore on which it was waiting.

If two or more processes with the same priority are runnable, NO automatic rescheduling is done by the runtime system (but read on). To pass control to other processes within this group, the running process must give up control explicitly; either by yielding, or by suspending itself when waiting on some semaphore.

This is called "preemptive scheduling WITHOUT round robin".
Notice that ST/X itself is mostly thread-safe: the windowing system and shared collections such as dependencies and process lists are protected by critical regions. However, applications and your own programs may not be.

Therefore, automatic round robin is disabled when the system starts up and is only enabled later, under control of a settings option. The startup file as delivered now has this setting enabled. If your application is not prepared for it, it should be disabled there.
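
Explicit yielding between two processes of the same priority can be sketched as follows. The sketch assumes that the yield is requested by evaluating "Processor yield" in the running process; the priority 4 and the step count are arbitrary. Try it with time slicing disabled and remove the yields to see the difference:

    |p1 p2|

    p1 := [
        5 timesRepeat:[
            Transcript showCR:'process1 step'.
            Processor yield             "give p2 a chance to run"
        ]
    ] forkAt:4.

    p2 := [
        5 timesRepeat:[
            Transcript showCR:'process2 step'.
            Processor yield             "give p1 a chance to run"
        ]
    ] forkAt:4.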

Round-Robin Scheduling (Timeslicing)

For round-robin scheduling, the ST/X processor scheduler class contains the required mechanisms to timeslice processes.
This implementation is relatively simple and straightforward:

it creates a new high-priority process, whose execution is controlled by the timer. Since it is running at very high priority, this process will always get the CPU for execution whenever its timer tick expires.
When returning from the timer-wait, that process forces a yield of the running process, to allow other processes within that priority group to execute (for the next timeslice period).

The launcher's settings-misc menu includes an item called "Preemtive Scheduling" which enables/disables the round-robin mechanism.
Timeslicing can also be started by evaluating:

    Processor startTimeSlicing

and stopped via:

    Processor stopTimeSlicing

Notice that there is NO warranty concerning the actual usability of this feature; for the above (synchronization) reasons, some applications could behave strangely when the timeslicer is running. Especially those ported from other Smalltalk dialects which do not offer timeslicing may not include appropriate locking mechanisms, as they might never have been designed to be thread-safe. For example, many assume that a window redraw operation is executed uninterrupted, which is no longer true when time slicing is enabled.

To see the effect of timeslicing, open a ProcessMonitor from the Launcher's utility menu and start some processes which perform some long computation. For example, you can evaluate (in a workspace):

    [ 100 timesRepeat:[3000 factorial] ] forkAt:4

some 3 or 4 times and watch the processes in the monitor.
Without time slicing, only one of these processes makes progress at a time; the others wait, and all of them are executed in sequence.
With time slicing on, you will notice that all of them seem to run (although only one of them is actually active at any particular time).

Dynamic Process Priorities

Once preemptive scheduling (timeslicing) is enabled, the processor scheduler may be further configured to do dynamic priority adjustments.
With this scheme, a process which has used the CPU for some time gets its priority automatically lowered, while other runnable processes which did not get a chance to run get their dynamic priority raised.

Dynamic priority handling requires the timeSlicing mechanism to be active and the dynamic flag to be set;
i.e., to start this mechanism, evaluate:

    Processor startTimeSlicing.
    Processor supportDynamicPriorities:true.

There is also an entry in the launcher's misc-settings dialog to enable/disable dynamic process priorities.

For security (and application compatibility), this is not done for all processes. Instead, only processes which return an interval from the #priorityRange message are affected by this.
The returned interval is supposed to specify the minimum and maximum dynamic priority of the process.
By default, all created processes have a nil priority range and are therefore not subject to dynamic scheduling.

Dynamic priorities are most useful for background actions, where you want to make sure that they do not normally interfere with the user interface too much, but are guaranteed to make progress even when other processes do heavy CPU-centric processing. Typically, background communication protocols, print jobs or other house-keeping tasks benefit from a dynamic process priority.

The following example creates three very busy computing processes. Two of them start with very low priority, but with dynamic priorities. The third process runs at a fixed priority.

    |p1 p2 p3|

    p1 := [
        30 timesRepeat:[
            5 timesRepeat:[
                3000 factorial
            ].
            Transcript showCR:'p1 made some progress'.
        ].
        Transcript showCR:'p1 finished'.
    ] forkAt:1.
    p1 priorityRange:(1 to:7).

    p2 := [
        30 timesRepeat:[
            5 timesRepeat:[
                3000 factorial
            ].
            Transcript showCR:'p2 made some progress'.
        ].
        Transcript showCR:'p2 finished'.
    ] forkAt:4.
    p2 priorityRange:(4 to:9).

    p3 := [
        30 timesRepeat:[
            5 timesRepeat:[
                3000 factorial
            ].
            Transcript showCR:'p3 made some progress'.
        ].
        Transcript showCR:'p3 finished'.
    ] forkAt:6.

The first process will never get a priority above that of any GUI process, therefore it will never disturb any views.
The second will eventually suspend even GUI processes, in case it does not get a chance to run for a while.
The third process will run at a fixed priority of 6.
Both the first and the second process will eventually suspend the third process, since their dynamic priority will rise above 6.
(Try the example while evaluating some time-consuming operation in a workspace at fixed priority 8. In that case, only p2 will make any progress.)

Process Stacks

Each Smalltalk process has its own automatically growing stack. There is no need for the Smalltalk programmer to preallocate or otherwise predefine the stack size (1). For security (and to make your life easier in case of a runaway process), the stack size is limited by a soft limit, which can be set for any process at any time by evaluating:

    aProcess setMaximumStackSize:limit

When the stack size hits this limit, a Smalltalk signal is raised, which can be caught and handled by the Smalltalk exception mechanism. It is even possible to change the limit within the exception handler and continue execution of the process with more stack (see examples in "doc/coding").

The default stack limit is set to a reasonable number (typically 1Mb). If your application contains highly recursive code, this default can be changed with:

    Process defaultMaximumStackSize:limit

Warning: Runaway processes are caught later (i.e. stack overruns are detected later) if this limit is set to a higher value. Therefore, we do not recommend changing the default - if there is a need, better change an individual process's limit.
(1): This is only true for the Unix/Linux variant of ST/X. Because native threads are used under Windows AND Windows validates the stack for correctness in every API call (against whatever win32 "thinks to be correct"), no API calls are possible with a non-contiguous stack. The spaghetti-stack scheme, which implements the dynamically growing stack, makes the stack non-contiguous. Therefore, under win32, the spaghetti-stack scheme is not used, and all stacks are preallocated to be contiguous.
Under win32, the maximumStackSize setting is also the initial and final stack size; in other words, all threads start with that size of stack, which cannot be expanded later. The maximum-setting must have been changed BEFORE a thread is created. Thus, to get a process with more stack, you have to change the limit, fork the new thread and then change it back.
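
The "change the limit, fork, change it back" sequence might be sketched as follows. Only the setter is shown in this document; the sketch assumes that the class also answers the current default through a getter named #defaultMaximumStackSize and that Block » ensure: is available. The factor of 4 and the contents of the forked block are arbitrary:

    |oldLimit deepProcess|

    oldLimit := Process defaultMaximumStackSize.        "assumed getter"
    Process defaultMaximumStackSize:(oldLimit * 4).     "raise the limit"
    [
        deepProcess := [
            "... highly recursive code goes here ..."
        ] fork
    ] ensure:[
        Process defaultMaximumStackSize:oldLimit        "change it back"
    ]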

Processvariables - Thread Local Storage

The current process object contains a mechanism to define thread-local variables. These are created by:

    Processor currentProcess environmentAt:#name put:someValue

and the value fetched via:

    Processor currentProcess environmentAt:#name

The "environmentAt:" will raise an error if the variable was undefined. As an alternative, use "threadVariableValueOf:", which returns nil for undefined variables.

If you need a variable to be only valid during the execution of some code, use

    Processor currentProcess
	withThreadVariable:#name boundTo:someValue
	do:[
	    ...
	    (Processor currentProcess threadVariableValueOf:#name)
	    ...
	]

The last form is recommended, as it makes sure that the variable gets removed (released) when no longer needed.

Views and Processes

In Smalltalk/X, typically one process is created for each topview. This topview and all of its subviews are grouped together in a so-called WindowGroup.
To get a picture of it, first start a ProcessMonitor, then open a new workspace.
-> You will see some info about the workspace's process in the process monitor's list.

Now, in the workspace evaluate:

    Processor activeProcess suspend

The workspace will no longer respond to keyboard or any other events. (Select the corresponding entry in the ProcessMonitor, and use the debug function from its popup-menu to see what the workspace is currently doing.)

You can continue execution of the workspace's process with the ProcessMonitor's resume function (or in the debugger, by pressing the continue button).

All events (keyboard, mouse etc.) are read by a separate process (called the event dispatcher), which reads the event from the operating system, puts it into a per-windowgroup event-queue, and notifies the view process about the arrival of the event (which is sitting on a semaphore, waiting for this arrival).
For multiScreen operation, one event dispatcher process is executing for each display connection.

Modal boxes create a new windowgroup, and enter a new dispatch loop on this. Thus, the original view's eventqueue (although still being filled with arriving events) is not handled while the modalbox is active (*).

The following pictures should make this interaction clear:

event dispatcher:

       +->-+
       ^   |
       |   V
       |   waits for any input (from keyboard & mouse)
       |   from device
       |   |
       |   V
       |   ask the view for its windowGroup
       |   and put event into windowgroup's queue
       |   (actually: the group's windowSensor's queue)
       |   |
       |   V
       |   wakeup windowgroups semaphore            >*****
       |   |                                             *
       +-<-+                                             *
							 * Wakeup !
							 *
each window-group process:                               *
							 *
       +->-+                                             *
       ^   |                                             *
       |   V                                             *
       |   wait for event arrival (on my semaphore)  <****
       |   |
       |   V
       |   send the event to the corresponding controller or view
       |   |               |
       +-<-+               |
			Controller/View » keyPress:...
		    or: Controller/View » expose...
		    etc.

modal boxes (and popup-menus) start an extra event loop:


       +->-+
       ^   |
       |   V
       |   wait for event arrival (on my semaphore)
       |   |
       |   V
       |   send the event to the corresponding view
       |   |   ^       |
       +-<-+   |       |
	       |       V
	       |    ModalBox open
	       |    create a new windowgroup (for box)
	       |       |
	       |       V
	       |       +->-+
	       |       ^   |
	       |       |   V
	       |       |   wait for event arrival (boxes group)
	       |       |   |
	       |       |   V
	       |       |   send the event to the corresponding handler
	       |       |   |               |
	       +--- done ? |               |
		       +-<-+               |
					keyPress:...

Views and Priorities

Initially, all view-processes are created at the same priority (called UserSchedulingPriority, which is typically 8). This means that a running user process will block all other view processes (unless it does a yield from time to time, or timeSlicing is enabled).

Try evaluating (in a workspace, with timeSlicing disabled) ...

    [true] whileTrue:[1000 factorial]

... and the system seems dead (read the next paragraphs before doing this).

Only processes with a higher priority will get control; since the event dispatcher is running at UserInterruptPriority (which is typically 24), it will still read events and put them into the view's event queues. However, all view processes run at 8 which is why they never get a chance to actually process the event.

There are two events which are handled by the event dispatcher itself: a keypress of "CTRL-C" (or "CTRL-.", depending on your setup) in a view is recognized by the dispatcher and starts a debugger on the corresponding view-process; a keypress of "CTRL-Y" in a view also stops its processing, but does not start a debugger (this is called aborting).
Notice: in the current ST/X release, the keyboard mapping was changed to map CTRL-Y to "redo". Now, you have to press CTRL-. and then click on the "abort" button.

Actually, in both cases, a signal is raised, which could in theory be caught by the view process.

Thus, to break out of the above execution, press "CTRL-C"/"CTRL-." in the workspace, and get a debugger for its process. In the debugger, press either abort (to abort the doIt-evaluation), or terminate to kill the process and shut down the workspace completely (closes the workspace).

If you have long computations to be done AND you don't like the above behavior, you can of course perform the computation at a lower priority. Try evaluating (in the above workspace):

    Processor activeProcess priority:4.
    [true] whileTrue:[1000 factorial]

Notice that nowadays, 1000 factorial is computed so fast that it hardly makes sense to run it in the background!

Now, the system still responds to your input in other views, since those run at a higher priority (8) and therefore suspend the workspace-process whenever they want to run. You can also think of the low-prio processing as being performed in the background - only running when no higher-prio process is runnable (which is the case whenever all other views are inactively waiting for some input).

Some views do exactly the same, when performing long operations. For example, the fileBrowser lowers its priority while reading directories (which can take a long time - especially when directories are NFS-mounted). Therefore, you can still work with other views (even other filebrowsers) while reading directories. Try it with a large directory (such as "/usr/bin").

It is a good idea, to do the same in your programs, if operations take longer than a few seconds - the user will be happy about it. Use the FileBrowser's code as a guide.

For your convenience, there is a short-cut method provided by Process, which evaluates a block at a lower priority (and changes the priority back to the old value when done with the evaluation).
Thus, long evaluations should be done using a construct such as:

    Processor activeProcess
        withPriority:4
        do:[
            10 timesRepeat:[2000 factorial]
        ]

You should avoid hardcoding priority numbers into your code, since these may change (PPS users noticed that ParcPlace's new release 2 uses priorities between 1 and 100).
To avoid breaking your code in case this is changed in ST/X, the above is better written as:

    Processor activeProcess
        withPriority:(Processor userBackgroundPriority)
        do:[
            10 timesRepeat:[2000 factorial]
        ]

Background Processes

The above example did its computation in the workspace process, thus the workspace no longer responded to update or any other events. To get around this behavior, you can also start a new process to do this computation. Try to evaluate:

    [
	10 timesRepeat:[
	    2000 factorial
	]
    ] forkAt:4

in a workspace, and watch the process monitor. You will notice, that the workspace is not blocked, but a separate process has been created. Since it runs at a lower priority, all other views continue to react as usual.

There is one possible problem with the above background process:

"CTRL-C" or "CTRL-Y" pressed in the workspace will no longer affect the computation (because the computation is no longer under the control of the workspace).

To stop/debug a runaway background process, you have to open a ProcessMonitor, select the background process and apply the ProcessMonitor's terminate, abort or debug menu functions.
If you start background processes programmatically, you should keep references to the subprocesses in some variable, and allow termination via some menu or button function:

    |myProcess|

    ...
    myProcess := [
		    ... whatever ...
		 ] forkAt:4.

    ...
    myProcess terminate

Do not forget to terminate all of your subprocesses, when your application is finished (this is typically done in the applicationModel's #closeRequest or a view's #destroy method).

To allow easier identification of your subprocesses in the process monitor, you can assign a name to a process:

    |myProcess|

    ...
    myProcess := [
                    ... whatever ...
                 ] forkAt:4.
    myProcess name:'my background process'.

    ...
    myProcess terminate

This process name has no semantic meaning - its sole purpose is to identify the process in the process monitor.

Suggested Priorities & Hints

To keep the system responsive, use the following priorities in your programs:

for normal views, do not change the default priority, UserSchedulingPriority (8)
think twice about raising the priority to or above UserInterruptPriority (24).
Since this is the event dispatcher's priority, no event handling (especially: no "CTRL-C" processing) takes place while running at a priority >= 24. If your process has any bug (e.g. an endless loop), it may be very hard to stop it (see below on how to do this).
In general, there is seldom any need to raise the priority above the default - except, for example, when handling input (requests) from a Socket which have to be served immediately, even if some user interaction is going on in the meantime (a Database server with a debugging window ?).
use UserBackgroundPriority (6) when doing long computations, which are presented to the user after a while.
Typically, use this priority when reading directories, long files, databases, interprocess-communication sockets etc.
For example, the file browser lowers its priority to that value, while reading directories or files.
use SystemBackgroundPriority (4) when doing computations, which are not visualized after a while; i.e. for things which can run totally in the background.
Also the animation demos run at this priority, to give foreground views a chance to complete their operations, even if they lower their priority (such as the fileBrowser).
if time slicing is disabled: no matter what priority you run on, do a yield from time to time when doing long computations.
This makes sense, even when running at lower priority, to give other applications a chance to make some progress (since they too may have lowered their priority, this yield is needed to let them run too).
If you don't want to manually add yields all over your code, and are not satisfied with the behavior of your background processes, you should enable timeslicing as described above. However, you have to care for the integrity of any shared objects manually, and fix your code to deal with concurrent accesses (i.e. add critical regions, where required).
BTW: the Transcript in ST/X is threadsafe; you can use it from any processes, at any priority.
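
As a minimal sketch of the yield advice above (the loop count and the process name are made up for illustration), a long computation can be forked at the background priority and give up the processor after each step:

    |p|

    p := [
        1 to:100 do:[:i |
            2000 factorial.
            "let other processes of the same priority make progress"
            Processor yield
        ]
    ] forkAt:(Processor userBackgroundPriority).
    " to find it easier in the process monitor "
    p name:'yielding background computation'.

With timeslicing enabled, the explicit yield should not be required, but it does no harm.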

Blocking Interrupts

If you ever want to change things in Delay, Semaphore or ProcessorScheduler, never forget the possibility of external-world interrupts (especially: timer interrupts). These can occur at any time, bringing the system into the scheduler, which could switch to another process as a consequence of the interrupt. This may even happen while running at high priority.
Whenever you are modifying data which is related to process scheduling itself (i.e. counters in semaphores, process lists in the scheduler etc), you should therefore block these interrupts for a while.
This is done by:

    OperatingSystem blockInterrupts
    ...
    modify the critical data
    ...
    OperatingSystem unblockInterrupts

Since the #blockInterrupts and #unblockInterrupts implementation does not handle nested calls, you should only unblock interrupts, if they have NOT been blocked in the first place. To do so, #blockInterrupts returns the previous blocking state - i.e. true, if they had been already blocked before.
Thus, to be certain, always use:

    |wasBlocked|

    ...
    wasBlocked := OperatingSystem blockInterrupts
    ...
    modify the critical data
    ...
    wasBlocked ifFalse:[OperatingSystem unblockInterrupts]

If there is any chance that the code between the block/unblock returns early (i.e. via a block return or exception handling), you also have to add proper unwind actions:

    |wasBlocked|

    ...
    wasBlocked := OperatingSystem blockInterrupts
    [
       ...
       modify the critical data
       ...
    ] valueNowOrOnUnwindDo:[
	wasBlocked ifFalse:[OperatingSystem unblockInterrupts]
    ]

That's a lot to remember; therefore, for your convenience, Block offers an "easy-to-use" interface (syntactic sugar) for the above operation:

    [
       ...
       modify the critical data
       ...
    ] valueUninterruptably.

See the code in Semaphore, Delay and ProcessorScheduler for more examples.

Notice that no event processing, timer handling or process switching is done while interrupts are blocked. Thus you should be very careful in coding these critical regions. For example, an endless loop in such a region will certainly lock up the Smalltalk system. Also, do not spend too much time in such a region: any processing which takes longer than (say) 50 milliseconds will have a noticeable impact on the user.
It is almost always an indication of a bad design if you have to block interrupts for such a long time. In most situations, a critical region or even a simple Semaphore should be sufficient.
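
As an illustration of that last point, the following sketch protects a shared counter with a mutual-exclusion semaphore instead of blocking interrupts. The variable names and loop counts are illustrative only, and the sketch assumes Semaphore's forMutualExclusion and critical: protocol:

    |lock counter|

    lock := Semaphore forMutualExclusion.
    counter := 0.

    "start two background processes which update the same variable"
    1 to:2 do:[:n |
        [
            100 timesRepeat:[
                "only one process at a time executes this block"
                lock critical:[ counter := counter + 1 ]
            ]
        ] forkAt:(Processor userBackgroundPriority)
    ].

Unlike a blocked-interrupt section, event processing and process switching continue while a process is inside such a region; only processes competing for the same semaphore have to wait.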

While interrupts are blocked, incoming interrupts will be registered by the runtime system and processed (i.e. delivered) at unblock-time. Therefore, be prepared to get the interrupt(s) right after (or even within) the unblock call.

Also, process switches will restore the blocking state back to how it was when the process was last suspended. Thus, a yield within a blocked interrupt section will usually reenable interrupts in the switched-to process.

It is also possible to enable/disable individual interrupts. See OperatingSystem's disableXXX and enableXXX methods.

Interrupting a Process

Beside the above external interrupts, you can also manually force a process to be interrupted and evaluate something programmatically. To do so, use:

    ...
    anotherProcess interruptWith:[ some action to be evaluated ]
    ...

This forces anotherProcess to evaluate the block passed to interruptWith:. If the process is suspended, it will be resumed for the evaluation. The evaluation will be performed by the interrupted process, on top of the running or suspended context. Thus a signal-raise, long return, restart or context walkback is possible in the interrupt action block and will be executed on the interrupted process's stack - not the caller's stack. Therefore, this is a method of injecting a piece of code into any other process at any time.

BTW: the event dispatcher's "CTRL-C" processing is implemented using exactly this mechanism.

Try:

    |p|

    p := [
        [true] whileTrue:[
            1000 factorial
        ]
    ] forkAt:4.
    " to find it easier in the process monitor "
    p name:'my factorial process'.
    " make it globally known "
    Smalltalk at:#myProcess put:p.

then:

    myProcess interruptWith:[Transcript showCR:'hello'].

or (see the output on the xterm-window, where ST/X has been started):

    myProcess interruptWith:[thisContext fullPrintAll].

or:

    "this brings the process into the debugger"
    myProcess interruptWith:[Object errorSignal raise]

finally cleanup (terminate the process) with:

    myProcess terminate.
    Smalltalk removeKey:#myProcess.

As another example, we can catch some signal in the process, as in:

    |p|

    p := [
        Object errorSignal catch:[
            [true] whileTrue:[
                1000 factorial
            ]
        ].
        Transcript showCR:'process finished gracefully'.
        Smalltalk removeKey:#myProcess.
    ] forkAt:4.
    " to find it easier in the process monitor "
    p name:'my factorial process'.
    " make it globally known "
    Smalltalk at:#myProcess put:p.

then send it the signal with:

    myProcess interruptWith:[Object errorSignal raise]

The above was shown for demonstration purposes; since process termination is actually also done by raising an exception (Process terminateSignal), graceful termination is better done by:

    |p|

    p := [
        Process terminateSignal catch:[
            [true] whileTrue:[
                1000 factorial
            ]
        ].
        Transcript showCR:'process finished gracefully'.
        Smalltalk removeKey:#myProcess.
    ] forkAt:4.
    " to find it easier in the process monitor "
    p name:'my factorial process'.
    " make it globally known "
    Smalltalk at:#myProcess put:p.

then terminate it with:

    myProcess terminate

Timeouts

Based on the above interrupt scheme, ProcessorScheduler offers methods to schedule timeout-actions. These will interrupt the execution of a process and force evaluation of a block after some time.

Timed blocks of this kind are installed (for the current process) with:

    Processor addTimeBlock:aBlock afterSeconds:someTime

To interrupt another process after some time, use:

    Processor addTimeBlock:aBlock for:aProcess afterSeconds:someTime

There are alternative methods which expect millisecond arguments for short time delays.

For example, the autorepeat feature of buttons is done using this mechanism. Here a timed block is installed with:

    Processor addTimeBlock:[self repeat] afterSeconds:0.1

Also, animations can be implemented with this feature (by scheduling a block to draw the next picture in the view after some time delay).
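
The animation idea can be sketched as a timed block which reschedules itself. This is only an illustration evaluated in a workspace: the variable names, the frame count and the Transcript output (standing in for real drawing code) are assumptions; only addTimeBlock:afterSeconds: is taken from the text above.

    |count step|

    count := 0.
    step := [
        count := count + 1.
        "a real animation would draw the next picture here"
        Transcript showCR:'frame ', count printString.
        "schedule the next frame, until 5 frames have been shown"
        count < 5 ifTrue:[
            Processor addTimeBlock:step afterSeconds:1
        ]
    ].
    Processor addTimeBlock:step afterSeconds:1.

Since the timed block is installed for the current process, it is the workspace's process which gets interrupted to evaluate each step.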

See ``working with timers & delays'' for more information.

Terminating a Process

Processes are terminated with the #terminate message. Technically, this does not really terminate the process, but instead raises a TerminateProcessRequest exception. Of course, this signal can be caught or otherwise handled by the process, in particular to allow for the execution of cleanup actions, as shown in the above example (see the Process » startup method in a browser).

This process termination is called ``soft termination'', because the affected process still has a chance to gracefully perform any cleanup in unwind blocks or by providing a handler for the exception.

A ``hard termination'' (i.e. immediate death of the process without any cleanup) can be enforced by sending it #terminateNoSignal. Except for emergency situations (a buggy or looping termination handler), there should never be a need for this, because it may leave semaphores or locks in a state which prevents further access to shared state. Often, you will have to take care of those leftover locks in a ProcessMonitor or SemaphoreMonitor afterwards.

So the most distinguishing aspect of soft termination is that all unwind blocks (see Block » valueOnUnwindDo:) are executed - in contrast to a hard terminate, which immediately kills the process. (read: ``context unwinding'')
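
A minimal sketch of soft termination with an unwind block follows. The process name, the global name and the Transcript message are illustrative; valueOnUnwindDo: is the Block method referred to above.

    |p|

    p := [
        [
            [true] whileTrue:[ 1000 factorial ]
        ] valueOnUnwindDo:[
            "cleanup code goes here, e.g. releasing locks or closing files"
            Transcript showCR:'cleanup done'
        ]
    ] forkAt:4.
    " to find it easier in the process monitor "
    p name:'my unwinding process'.
    " make it globally known "
    Smalltalk at:#myProcess put:p.

then terminate it softly with:

    myProcess terminate.
    Smalltalk removeKey:#myProcess.

According to the description above, the unwind block should be evaluated in this case, whereas #terminateNoSignal would skip it.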

Process Groups

A variant of the #terminate message is #terminateGroup. This terminates a process along with all of the subprocesses it created, unless the subprocess detached itself from its creator by becoming a process group leader. A process is made a group leader by sending it the #beGroupLeader message.
In ST/X, all GUI processes are process group leaders by default.

Interrupting a Runaway Process

In case of emergency (for example, when a process with a priority higher than UserInterruptPriority loops endlessly), you can press "CTRL-C" in the xterm (or Windows console) window where Smalltalk/X was started.

(Notice: this may be impossible to do on the Windows operating system, which does not attach a console to ".exe" programs. Therefore, for Windows, ST/X is also deployed as a ".com" application, which always opens a controlling console. During development, it is therefore preferable to use the "stx.com" executable.)

The interrupt will stop ST/X in whatever it is doing (even in the event dispatcher) and enter a debugger.

If the scheduler was hit with this interrupt, all other process activities are stopped, which implies that other existing or new views will not be handled while in this debugger (i.e. the debugger's inspect functions will not work, since they open new inspector views).
If your runaway process was hit, the debugger behaves as if the "CTRL-C" was pressed in a view. However, it will run at the current priority, so you may want to lower it by evaluating:

    Processor activeProcess priority:8

In this debugger, either terminate the current process (if you were lucky, and the interrupt occurred while running in the runaway process) or try to terminate the bad process by evaluating some expression like:

    Process allInstances do:[:p |
        p priority > 24 ifTrue:[
            p id > 1 ifTrue:[
                "/ do not kill the scheduler
                p terminate
            ]
        ]
    ]

Your runaway process is of course easier to locate, if you gave it a distinct name before; in this case, use:

    Process allInstances do:[:p |
        p name = 'nameOfBadProcess' ifTrue:[
            p terminate
        ]
    ]

A somewhat less drastic fix is to send it an abortSignal:

    Process allInstances do:[:p |
        p name = 'nameOfBadProcess' ifTrue:[
            p interruptWith:[Object abortSignal raise]
        ]
    ]

Most processes provide a handler for this signal at some safe place, where they are prepared to continue execution. Those without a handler will terminate. Therefore, a workspace or browser will return to its event loop, while other processes may terminate upon receipt of this signal.

In some situations, the system may bring you into a non-graphical MiniDebugger (instead of the graphical DebugView). This happens if the active process at interrupt time was a DebugView, or if any unexpected error occurs within the debugger's startup sequence.
The MiniDebugger also supports expression evaluation, abort and terminate functions; however, these have to be entered via the keyboard in the xterm window (where you pressed the "CTRL-C" before). Type ? (question mark) at the MiniDebugger's prompt to get a list of available commands.

On some keyboards, the interrupt key is labeled differently from "CTRL-C".
The exact label depends on the xmodmap and stty-settings. Try "DEL" or "INTR" or have a look at the output of the stty unix command.

Notes: (*) this is not fully correct: the modalbox's event loop peeks into the other windowgroup's eventQueue and handles redraw requests. Thus, the original group's views will still redraw themselves when exposed.
However, no input events (keyboard and/or mouse) are handled while a modalBox is active.

Processes and SnapShotImage Restart

When Smalltalk/X is restarted from a snapShotImage, by default, processes are NOT resumed or restarted. The reason is that due to the generated C-code (stc compilation), process stacks are not portable or restartable. Thus, all processes which were running at image save time are dead when the image is restarted.
However, by specially marking a process as restartable with:

    p :=  [...] newProcess.
    p restartable:true.
    p resume

at process creation time, the process will be restarted (from its beginning) when ST/X is started from an image.

How the Scheduler Gets Control

The scheduling of all Smalltalk processes is done by another process, the so-called scheduler process. This process executes at the highest priority (in current systems: 31).
Whenever the scheduler passes control to another (user or background) process, it makes certain that it will regain control (i.e. execute) whenever a scheduled timeout is due, some I/O data arrives at some stream (to signal the corresponding semaphores) or a window event arrives (mouse or keyboard actions).
Of particular interest is the display connection, which must be served whenever input arrives, to ensure that CTRL-C processing is performed (to stop runaway user processes).
Depending on the operating system, one of two mechanisms is used:

enabling an I/O interrupt, or
polling in 20ms intervals

The first technique leads to less idle CPU overhead, but happens to not work reliably on all systems.
The second can be used on all systems (especially, some older Unix versions had trouble in their X-display connection, when I/O interrupts were used).

To decide, which technique to use, the scheduler process asks the OperatingSystem, if it supports I/O interrupts (via OperatingSystem » supportsIOInterrupts), and uses one of the above according to the returned value.
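
To see which of the two mechanisms applies on your system, the query can be evaluated in a workspace. This is a small sketch which simply prints the value returned by the method named above:

    Transcript showCR:(OperatingSystem supportsIOInterrupts printString)

A result of false would mean that the scheduler falls back to polling.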

To date, (being conservative) this method is coded to return false for most systems (even for some, where I/O interrupts do work). You may want to give it a try and enable this feature by changing the code to return true.
If you are still able to CTRL-C-interrupt an endless loop in a workspace, and timeslicing and window redrawing still work properly, you can keep the changed code and send a note to cg@exept.de or info@exept.de; please include the system type and OS release. We will then add #ifdef'd code to the supportsIOInterrupts method.

<cg@exept.de>

Doc $Revision: 1.36 $ $Date: 2016/11/05 17:38:36 $
