New algorithms for asynchronous communication

H.R.Simpson

Indexing terms: Asynchronous communication, Algorithms, Four-slot mechanisms

Abstract: Concurrent processes are said to communicate asynchronously when there is no mutual timing interference resulting from their communication operations. This property can be achieved by mechanisms which use multiple shared memory locations (slots) to transfer data, and where access to these slots is co-ordinated by small shared control variables. Algorithms are known which allow a writing process to communicate asynchronously with a reading process through a four-slot mechanism with no mutual timing constraints. The paper gives new algorithms for a four-slot mechanism, and shows how these may be applied in a design. The new algorithms have an access-control strategy which is complementary to that used by the previously known algorithms.

1 Introduction

This paper builds on and extends previously known techniques [1-3] based on the use of multiple shared memory locations (slots) as a means of achieving asynchronous communication between a single writing process and a single reading process operating concurrently. Fig. 1 illustrates the relationship between the two processes. The control variables (bits or small integers) are available to both processes and co-ordinate access to the data in the slots; no use is made of the more conventional synchronous methods for access control such as semaphores [4], monitors [5], rendezvous [6] or critical sections [7], which inevitably introduce temporal interference as the processes synchronise their communication operations, together with a failure mode associated with the implicit waiting and restart controls. The most exacting form of operating environment for the control algorithms set out here is one in which there is no global clock, so that each process executes in its own independent time frame. However, such techniques can also be used to advantage in less stringent operating conditions, as will be briefly discussed in the context of a software design.

The intermediate structure between the two processes is known as a communication mechanism; in addition to the control variables and data slots, it contains the writing and reading access controls which implement the algorithms. These access controls are invoked by the connected processes whenever they need to transfer data. The data in the mechanism has the function of reference data, i.e. it is conceptually a single record which can be updated at any time by the writer and can be read at any time by the reader.

© IEE, 1997. IEE Proceedings online no. 19971 21 X. Paper first received 5th July 1996 and in revised form 4th March 1997. The author is with Matra BAe Dynamics, Executive Technologist, Digital Information Processing, PO Box 19, Six Hills Way, Stevenage, Herts SG1 2DA, UK.

Fig. 1 Shared-memory communication

Two properties are particularly important: coherence and freshness. Coherence is guaranteed when at no time is it possible for a read access to obtain data which has been (partially) written by more than one write access. Freshness is guaranteed when at no time is it possible for a read access to obtain stale data, i.e. data which is not the latest supplied by a write access. Note here that the term 'access' includes both control operations and data transfer operations.

A little thought shows that, if writer and reader are to be able to access a communication mechanism concurrently, then there must be more than one place for data in the mechanism. The search for suitable access control algorithms starts with the simplest forms [3], examines their failure modes, and then extends the algorithms in ways which show some prospect of removing these failure modes. A two-slot mechanism can fail if the duration of a data read operation is longer than the interval between successive data write operations. A three-slot mechanism can fail if the interval between the write operation which indicates (to the reader) the last slot written and the write operation which chooses the next slot for writing is shorter than the interval between the read operation which chooses a slot for reading and the read operation which indicates (to the writer) the slot being read. The relevant failure mode is implicit in any two- or three-slot mechanism, and any particular solution can be analysed to show that it exists [3].

IEE Proc.-Comput. Digit. Tech., Vol. 144, No. 4, July 1997

The two- and three-slot access control strategies treat the slots as a uniform linear set. The addition of a fourth slot to such a set still exhibits a failure mode [3]. However, successful algorithms can be formulated when the four slots are arranged as two pairs, and when an orthogonal control strategy is introduced in which the writing and reading processes avoid one another by each being given control of a different dimension in the two-by-two array so created. The previously known algorithms for such a mechanism are as follows:

writing: $d[w, \overline{s[w]}] := \mathit{input};\ s[w] := \overline{s[w]};\ l := w \parallel w := \overline{r}$

reading: $r := l;\ v := s;\ \mathit{output} := d[r, v[r]]$

In this mechanism, data is passed through a four-slot array d, arranged as two pairs of slots. The control variables l, w and r are bits indicating, respectively, which of the pairs holds the latest data, which is being used to assemble new data, and which has been selected for reading. The variables s and v are two-bit vectors indicating, respectively, the slot within each pair which holds the latest data, and which slot has been selected for reading. Operations separated by the ∥ symbol can be performed in either order or with any degree of overlap, provided that the logical intent is preserved: new l becomes old w and new w becomes the inverse of r. Whereas the two- and three-slot mechanisms discussed above are only satisfactory if operated in such a way that the failure mode cannot occur, a four-slot mechanism based on these algorithms is free from mutual timing constraints.
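The previously known algorithms can be sketched sequentially as follows. This is a minimal illustration only: the class name `FourSlotPrevious`, the method names and the choice of zero as the initial value of every control variable are assumptions made for the example, and the sketch runs each control sequence to completion, so it does not model overlap or interleaving between writer and reader.

```python
class FourSlotPrevious:
    """Sequential sketch of the previously known four-slot algorithms.

    Names and initial control values are illustrative assumptions.
    """

    def __init__(self, init=None):
        self.d = [[init, init], [init, init]]  # data slots d[pair][slot]
        self.l = 0        # pair holding the latest data
        self.w = 0        # pair used to assemble new data
        self.r = 0        # pair selected for reading
        self.s = [0, 0]   # per pair: slot holding the latest data
        self.v = [0, 0]   # per pair: slot selected for reading

    def write(self, item):
        self.d[self.w][1 - self.s[self.w]] = item  # d[w, not s[w]] := input
        self.s[self.w] = 1 - self.s[self.w]        # s[w] := not s[w]
        # l := w || w := not r (right-hand sides use the old values)
        self.l, self.w = self.w, 1 - self.r

    def read(self):
        self.r = self.l                            # r := l
        self.v = list(self.s)                      # v := s
        return self.d[self.r][self.v[self.r]]      # output := d[r, v[r]]
```

In this sketch the tuple assignment plays the role of the ∥ operator: both right-hand sides are evaluated before either variable changes, so new l is old w and new w is the inverse of r.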

Although the writing and reading algorithms are each stated as a series of discrete operations, this does not imply the need for any form of atomicity [8]. Individual operations on one side may overlap or interleave those from the other to any degree. In particular, a vector assignment may be executed on an element-by-element basis, although the technique for analysing mechanism operation [3] assumes that all element assignments must have started before any has finished, and that all have finished before the next operation is started.

Other researchers [8, 9] have also become aware of the advantages offered by a two-by-two arrangement of slots. These solutions are not expressed in the compact algebraic style given above, being primarily directed towards software implementation. The principal difference is in the technique used for selecting a slot when the writer is switching pairs; it turns out that, in these circumstances, either slot in the new pair will do.

At a more detailed level in the analysis of communication mechanism properties, it is necessary to examine the way in which the control-variable values are passed from one process to the other. An important issue here is metastability [10]. This is an inescapable physical phenomenon which occurs when a variable set by a process operating in one time frame is observed by another process operating in a different time frame, at a time when the variable’s value is changing (it is assumed that bit variables are implemented so that, when an existing value is rewritten, the variable is undisturbed and there is no possibility that a concurrent read obtains an incorrect value). Metastability causes a potential increase in the time taken for a read operation on a shared control variable to make a new stable value available, resulting in temporary uncertainty in shared variable values whenever they are read. The impact of metastability is minimal in the previously known four-slot mechanism [3], amounting only to the need to ensure a reasonable interval between the last operation in a control sequence and the subsequent data access operation; this aspect will be discussed in relation to the new algorithms.


2 New solution

The previously known algorithms effectively give the writer control of the slots within a pair, whereas the reader has control of the pairs. It would appear that success is entirely due to this orthogonality of access control and it therefore seems likely that complementary algorithms exist in which the writer controls the pairs and the reader controls the slots. New algorithms based on this alternative principle of operation are given, and the way they work is discussed informally using a data-flow model. The effects of metastability are considered by identifying critical interaction points within the algorithms. Finally, a design based on the new algorithms is presented.

2.1 New algorithms

There are two important features of the previous algorithms which one would expect to carry over into any new solution. First, it is observed that the writing algorithm writes the data and then uses a ‘write post sequence’ to indicate the location of this data and to choose the destination for the next write, and the reading algorithm uses a ‘read pre sequence’ to find the latest data and indicate its interest in this location before reading the data. Secondly, it is seen that each sequence of control operations (write post or read pre) first manipulates the variable corresponding to the array dimension it controls, before moving on to take account of what the other process is doing in the other dimension. Apart from these general points, very little guidance can be offered as to how to go about formulating new algorithms. This is a process of intuitive design and iterative improvement. The result must exhibit properties which both withstand informal scrutiny and are then shown to be sound when subjected to rigorous analysis. The proposed new algorithms are as follows:

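writing: $d[\mathit{ip}, w[\mathit{ip}]] := \mathit{input};\ \mathit{ip} := \overline{\mathit{ip}};\ w[\mathit{ip}] := \overline{r[\mathit{ip}]}$

reading: $r := w;\ \mathit{op} := \overline{\mathit{ip}};\ \mathit{output} := d[\mathit{op}, r[\mathit{op}]]$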

As before, data is passed through a four-slot array d, arranged as two pairs of slots. The control variable ip is a bit indicating the next pair to be used for writing. It is seen that ip steers input data to alternate pairs, so avoiding the data last written. The variable op is a bit used by the reader to select the pair not currently being used for writing, and therefore containing the latest completely written data.

The variables r and w are two-bit vectors; elements of r indicate the slot in each pair which has been committed for reading, and elements of w indicate the slot in a pair which is being written or the slot in a pair which contains the latest data for that pair. At the start of the read pre sequence, the reader indicates its interest in the current values of w. One of the slots (in one of the pairs), indicated by the w vector, is designated for writing and the ip/op logic at the pair level will prevent the reader from gaining access to it; the other slot (in the other pair), indicated by the w vector, must contain the latest data.
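The new algorithms can be sketched in the same sequential style. The class name `FourSlotNew`, the method names and the zero initial control values are assumptions made for illustration; the sketch executes each algorithm as an uninterrupted sequence and therefore shows only the logical intent of the control operations, not their behaviour under concurrency.

```python
class FourSlotNew:
    """Sequential sketch of the new four-slot algorithms.

    Names and initial control values are illustrative assumptions.
    """

    def __init__(self, init=None):
        self.d = [[init, init], [init, init]]  # data slots d[pair][slot]
        self.ip = 0       # next pair to be used for writing
        self.w = [0, 0]   # per pair: slot being written / holding latest data
        self.r = [0, 0]   # per pair: slot committed for reading

    def write(self, item):
        self.d[self.ip][self.w[self.ip]] = item   # d[ip, w[ip]] := input
        self.ip = 1 - self.ip                     # ip := not ip
        self.w[self.ip] = 1 - self.r[self.ip]     # w[ip] := not r[ip]

    def read(self):
        self.r = list(self.w)                     # r := w
        op = 1 - self.ip                          # op := not ip
        return self.d[op][self.r[op]]             # output := d[op, r[op]]
```

Note that `op` is local to `read`, reflecting the fact that only the reader uses it, whereas `ip`, `w` and `r` are shared between the two sides.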

2.2 Data-flow model

The data-flow model in Fig. 2 can be used as a graphical aid to reason about the way the algorithms work. There is a box for each algorithm variable. Heavy interconnection lines indicate the flow of data and light lines indicate the transfer of values from one control variable to another. The control-variable boxes, in addition to holding the control values, are shown as switches selecting the data-flow paths from input to data-variable box, and from data-variable box to output; the settings of these switches are determined by the control sequences as specified in the algorithms.

Fig. 2 Data-flow model

As a specific example of algorithm operation, consider a sequence of actions leading to the state of the switches shown in Fig. 2. Suppose that the reader has been inactive for some time, while the writer has been alternately overwriting to slots d[0, 0] and d[1, 1]. From this it can be inferred that r = (1, 0) and w = (0, 1). Now consider a complete read sequence occurring between two complete write sequences as follows:

$d[0,0] := \mathit{input};\ \mathit{ip} := 1;\ w[1] := 1;$

$r := (0, 1);\ \mathit{op} := 0;\ \mathit{output} := d[0,0];$

$d[1,1] := \mathit{input};\ \mathit{ip} := 0;\ w[0] := 1;$

The control-variable values now correspond to the switch settings in Fig. 2, where 0 is equivalent to a switch in the upper position. The above sequence shows that the reader acquires the freshest data, and that subsequent writes will avoid any extension of the read operation. Specific cases such as this give a useful feel for the way the algorithms interact but they cannot address, with any confidence of completeness, the full range of possible interleaving and overlap. A comprehensive correctness analysis is developed in a sequel paper.
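The worked example above can be replayed step by step. The following sketch starts from r = (1, 0), w = (0, 1) and ip = 0 (the value of ip is inferred from the first data write going to d[0, 0]); the function name `run_trace` and the data values "x1" and "x2" are hypothetical labels used only for this illustration.

```python
def run_trace():
    """Replay write, read, write from the state r = (1, 0), w = (0, 1)."""
    d = [[None, None], [None, None]]
    ip = 0                      # assumed: the next write targets pair 0
    r, w = [1, 0], [0, 1]

    # First write sequence: data write, then write post sequence.
    d[ip][w[ip]] = "x1"         # d[0,0] := input
    ip = 1 - ip                 # ip := 1
    w[ip] = 1 - r[ip]           # w[1] := 1

    # Read sequence: read pre sequence, then data read.
    r = list(w)                 # r := (0, 1)
    op = 1 - ip                 # op := 0
    output = d[op][r[op]]       # output := d[0,0]

    # Second write sequence.
    d[ip][w[ip]] = "x2"         # d[1,1] := input
    ip = 1 - ip                 # ip := 0
    w[ip] = 1 - r[ip]           # w[0] := 1

    return {"ip": ip, "op": op, "r": r, "w": w, "output": output, "d": d}
```

The final state has ip = 0 and w[0] = 1, so the next data write would go to d[0, 1] rather than d[0, 0], the slot the reader selected. This is the sense in which subsequent writes avoid any extension of the read operation.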

It is clear from the diagram that the writer, through the variable ip, has complete control over pair selection. It is not quite so easy to see how the reader dominates slot selection, as there is a two-way interaction at this level. The point here is that the first operation of the read pre sequence is a vector assignment which, notionally, puts the reader’s mark both on the slot which is declared to hold the latest data and on the slot which will next be declared to hold the latest data; following completion of any current write operation, the writer must not make any further use of these marked slots. The freshness property is preserved by the reader’s selection of the slot containing the latest data in the pair containing the latest data. The coherence property is preserved by the writer’s avoidance both of the pair containing the latest data and of the slots which the reader has selected.
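The avoidance argument can be explored informally with a randomised interleaving experiment. The sketch below is not the correctness analysis referred to in the text: it assumes that every control operation is an indivisible step (with r := w performed one element at a time), ignores metastability, and samples only a finite number of pseudo-random schedules. The function name `check_coherence` and its parameters are hypothetical.

```python
import random


def check_coherence(steps=20000, seed=1):
    """Randomly interleave writer and reader steps; count slot collisions."""
    rng = random.Random(seed)
    ctl = {"ip": 0, "w": [0, 0], "r": [0, 0]}   # shared control variables
    busy = {"writer": None, "reader": None}     # slot each side is accessing

    def writer():
        while True:
            ip = ctl["ip"]
            busy["writer"] = (ip, ctl["w"][ip])  # data write starts
            yield
            busy["writer"] = None                # data write ends
            yield
            ctl["ip"] = 1 - ctl["ip"]            # ip := not ip
            yield
            ip = ctl["ip"]
            ctl["w"][ip] = 1 - ctl["r"][ip]      # w[ip] := not r[ip]
            yield

    def reader():
        while True:
            ctl["r"][0] = ctl["w"][0]            # r := w, element by element
            yield
            ctl["r"][1] = ctl["w"][1]
            yield
            op = 1 - ctl["ip"]                   # op := not ip
            yield
            busy["reader"] = (op, ctl["r"][op])  # data read starts
            yield
            busy["reader"] = None                # data read ends
            yield

    procs = [writer(), reader()]
    collisions = overlaps = 0
    for _ in range(steps):
        next(rng.choice(procs))                  # advance one side by one step
        if busy["writer"] is not None and busy["reader"] is not None:
            overlaps += 1                        # both sides are accessing data
            collisions += busy["writer"] == busy["reader"]
    return collisions, overlaps
```

A collision is counted whenever a data write and a data read are in progress on the same slot at the same time. Under the stated assumptions the experiment should report overlapping accesses but no collisions, which is consistent with the informal coherence argument; it says nothing about freshness.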

2.3 Metastability

The discussion so far has been couched entirely in terms of the logical properties of the algorithm operations, and has implicitly assumed that the bit control variables behave in a straightforward manner. However, when the writer and reader are fully asynchronous it cannot be assumed that the shared control variables are stable when they are observed. It has been shown [3] that the ‘flicker’ effect (the arbitrary value exhibited by a bit variable over the bounded time while it is changing) can safely be ignored, as the value obtained by a read operation must be the old or new value, and either is acceptable to the mechanism logic. The ‘dither’ effect (the potentially unbounded time taken for a reader to come to a conclusion concerning the value which has been obtained) requires more careful consideration.

Dither arises from the metastability phenomenon [10], and occurs whenever a bit variable is observed close to the time it is being changed. In these circumstances, the value observed may, for a period of time, assume an invalid state (neither a 0 nor a 1), or the change in value may be deferred so that the delay through a device implementing the bit variable is longer than the normal switching time. Either way, the problem resolves itself after an interval which is dependent on the proximity of the reading and writing operations [11]. Quite modest additional delays between the reading and the use of shared values reduce the problem to vanishingly small proportions (i.e. well below other failure modes), but it is still necessary to examine the control sequences for the impact of this effect to determine whether additional delays between operations should be enforced. It is only those implementations where the control bits are addressed directly which can be affected, e.g. where they are located in specialised hardware, or take the form of software variables in asynchronous dual-port memory. Control bits in arbitrated shared memory (e.g. bus memory) will not be affected (the arbiter must already have resolved metastability effects).

Consideration of the effects of metastability is dependent on the recognition of the following interacting events in the operation of the new algorithms:

wip, rcp: writer indicates pair, reader chooses pair
ris, wcs: reader indicates slot, writer chooses slot
wis, rcs: writer indicates slot, reader chooses slot

‘Indication’ occurs at the end of an assignment operation and recognises the point at which the new value becomes visible (to the other process). ‘Choosing’ occurs at the beginning of an assignment and recognises the point at which the operation is committed to the subsequent indication of a particular value. Thus the control events are positioned as follows: wip occurs at the end of $\mathit{ip} := \overline{\mathit{ip}}$; rcp occurs at the beginning of $\mathit{op} := \overline{\mathit{ip}}$; ris and rcs occur at the end and beginning of $r := w$; wis and wcs occur at the end and beginning of $w[\mathit{ip}] := \overline{r[\mathit{ip}]}$. There are no other interactions; for example, wcp (writer chooses pair) is only of interest to the writer and does not interact with the reader in any way.

Metastability can only occur when control sequences overlap or immediately abut one another, and there are three event pairs which invoke the effect:
(i) wip-rcp: Metastability can occur when the read pre sequence finishes fractionally after the write post sequence starts. The effect is to cause some temporary uncertainty in op. There is no feedback of the effect within the control sequences.
(ii) wis-rcs: Metastability can occur when the read pre sequence starts fractionally after the write post sequence finishes. The effect is to cause some temporary uncertainty in r[ip]. There is no feedback of the effect within the control sequences.
(iii) ris-wcs: Metastability can occur when the middle event of the write post sequence occurs fractionally after the middle event of the read pre sequence. The effect is to cause some temporary uncertainty in w[ip]. By the time this happens, this variable will already have been interrogated by the read pre sequence, so again there is no feedback within the control sequences.

In all the above cases it is still necessary to allow control variables to settle before they are used to access the data slots. The conclusion is that a reasonable interval must be allowed between the final operation of a control sequence and the subsequent data-access operation, but that within each control sequence operations can proceed at a rate such that the result of one [under normal (no metastability) switching conditions] is established before the next is started.

2.4 Software design

Algorithms can be used as the basis for constructing asynchronous communication mechanisms in either hardware or software [2]. Here a design is described to illustrate some of the application issues which arise in software implementations. The design in Fig. 3 takes the form of a ‘route’ pool, where the term ‘route’ is used to denote a communication path between two processes, and the term ‘pool’ denotes that the form of communication is such that neither process affects the timing of the other [12]. The access control operations are shown as delivering pointers, to highlight the fact that access can safely take place at any time between the control sequences.

route pool:
  var data : array [0..1, 0..1] of data := ((init, init), (init, init));
      r, w : array [0..1] of 0..1;
      ip : 0..1;

  { write post sequence: returns a pointer to the next slot for writing }
  function post-write : ↑data;
  begin
    ip := not ip;
    w[ip] := not r[ip];
    post-write := ↑data[ip, w[ip]]
  end;

  { read pre sequence: returns a pointer to the slot holding the latest data }
  function pre-read : ↑data;
  var op : 0..1;
  begin
    r := w;
    op := not ip;
    pre-read := ↑data[op, r[op]]
  end;

end.

Fig. 3 Software design

The post-write and pre-read functions faithfully reflect the algorithm-control operations. The post-write function has the dual role of giving access to the previous data written as well as indicating where the next data are to be assembled; it should be called immediately after each data write is complete, so as to make this new data available to the reader. Provided that all four data slots are initialised, the control variables do not need to be preset. A pre-read function executing before the post-write function has been invoked will gain access to the initial value as determined by the initial (random) settings of the control variables. The first call of the post-write function merely delivers a pointer to the first slot to be used for writing while designating a new slot (also holding the initial value) as containing the latest data.
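The start-up behaviour described above can be illustrated with a small sketch in which the two access functions return slot indices in place of pointers. The class name `RoutePool`, the helper `first_exchange` and the use of (pair, slot) tuples as stand-ins for pointers are assumptions made for this example; the calls are executed one after another, so the sketch illustrates only the initialisation claim, not concurrent operation.

```python
from itertools import product


class RoutePool:
    """Route pool sketch; access functions return (pair, slot) indices."""

    def __init__(self, init, ip=0, w=(0, 0), r=(0, 0)):
        self.data = [[init, init], [init, init]]  # all four slots initialised
        self.ip, self.w, self.r = ip, list(w), list(r)

    def post_write(self):
        self.ip = 1 - self.ip                     # ip := not ip
        self.w[self.ip] = 1 - self.r[self.ip]     # w[ip] := not r[ip]
        return (self.ip, self.w[self.ip])         # next slot for writing

    def pre_read(self):
        self.r = list(self.w)                     # r := w
        op = 1 - self.ip                          # op := not ip
        return (op, self.r[op])                   # slot holding latest data


def first_exchange(ip, w, r):
    """Read before any write, then write once and read again."""
    pool = RoutePool("init", ip, w, r)
    p, s = pool.pre_read()
    before = pool.data[p][s]      # read before post-write is ever invoked
    p, s = pool.post_write()      # first call: obtain first slot for writing
    pool.data[p][s] = "new"       # data write
    pool.post_write()             # make the new data available to the reader
    p, s = pool.pre_read()
    return before, pool.data[p][s]


# Try every possible initial setting of the control variables.
results = {
    first_exchange(ip, w, r)
    for ip in (0, 1)
    for w in product((0, 1), repeat=2)
    for r in product((0, 1), repeat=2)
}
```

In this sketch `results` collapses to a single outcome: the early read sees the initial value and the later read sees the newly written value, whatever the initial control settings.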

This design can be used for communication through unarbitrated asynchronous dual port memory, where metastability is a theoretical possibility and the effects discussed in Section 2.3 must be borne in mind. Alternatively, the design can be used for communication through arbitrated shared memory, where metastability effects will have been resolved at the arbiter level, and there is no impact within the algorithms themselves. The need for a four-slot mechanism in this situation will depend on the granularity of the control over access to the shared memory [3]. A third possible scenario occurs when the writer and reader are tasks at different priority levels executing in the same processor; the four-slot algorithms permit safe communication with no form of critical section or pre-emption inhibition.

3 Conclusions

This paper has presented new algorithms for achieving asynchronous communication, based on the application of an orthogonal avoidance strategy in mechanisms which contain a two-by-two array of slots through which data can be passed. The new algorithms are complementary to a previously known solution. Although there can be no guarantee that all solutions have been found, this now seems more likely; the new and previous algorithms, taken together, appear to offer little prospect of further refinement.

The properties of the new algorithms have been discussed informally. A sequel paper will present an analysis of the new algorithms based on the dynamically changing slot roles during asynchronous operation. This confirms the results arrived at informally. A further paper will show how the algorithms can be extended and adapted to produce multiwriter and multireader solutions.

Multiple-slot mechanisms achieve their goal of asynchronous operation at the expense of additional memory space. This introduces a significant overhead where a large amount of intermediate data is involved. However, where there is a relatively small amount of data to be passed (e.g. access control data) they give attractive solutions. The absence of temporal interference, and the fact that there are no failure modes arising from supporting synchronisation, make them particularly suitable for high-integrity dependable systems.

There is some evidence [13] that awareness of the previous four-slot algorithms is growing. The new algorithms show that the idea of orthogonal avoidance has a more general application in the design of asynchronous forms of communication. In due course these asynchronous forms should take their place alongside the traditional algorithms [14] based on synchronous interaction.

4 Acknowledgment

Several colleagues have helped in the preparation of this paper by their comments on early drafts. In particular I thank Eric Campbell and Len Griffiths for their part in discussions which have assisted in the clarification of important points of detail, and Jing Chen for bringing some related work to my attention.

5 References

1 SIMPSON, H.R.: ‘Fully asynchronous communication’. IEE Colloquium on Mascot in real-time systems, 1987, (IEE Colloquium digest 58)

2 SIMPSON, H.R.: ‘Four-slot fully asynchronous communication mechanism’, IEE Proc. E, Comput. Digit. Tech., 1990, 137, (1), pp. 17-30

3 SIMPSON, H.R.: ‘Correctness analysis for class of asynchronous communication mechanisms’, IEE Proc. E, 1992, 139, (1), pp. 35-49

4 DIJKSTRA, E.W.: ‘Co-operating sequential processes’ in ‘Programming languages’ (Academic Press, 1968)

5 HOARE, C.A.R.: ‘Monitors: an operating system structuring concept’, Commun. ACM, 1974, 17, (10), pp. 549-557

6 HOARE, C.A.R.: ‘Communicating sequential processes’ (Prentice-Hall International, 1985)

7 LAMPORT, L.: ‘The mutual exclusion problem Part II: Statement and solutions’, J. ACM, 1986, 33, (2), pp. 327-348

8 ANDERSON, J.H., and GOUDA, M.G.: ‘A criterion for atomicity’, Form. Asp. Comput., 1992, 4, (3), pp. 273-298

9 TROMP, J.: ‘How to construct an atomic variable’, Proc. 3rd Int. Workshop on Distributed algorithms, 1989, pp. 292-302

10 CHANEY, T.J., and MOLNAR, C.E.: ‘Anomalous behavior of synchronizer and arbiter circuits’, IEEE Trans., 1973, C-22, (4), pp. 421-422

11 HORSTMANN, J.U., EICHEL, H.W., and COATES, R.L.: ‘Metastability behaviour of CMOS ASIC flip-flops in theory and test’, IEEE J. Solid-State Circuits, 1989, 24, (1), pp. 146-157

12 SIMPSON, H.R.: ‘Architecture for computer based systems’, Proceedings of IEEE 1994 Tutorial and Workshop on Systems engineering of computer-based systems, May 1994, pp. 70-82

13 BURNS, A., and WELLINGS, A.J.: ‘Concurrency in Ada’ (Cambridge University Press, 1995)

14 RAYNAL, M.: ‘Algorithms for mutual exclusion’ (North Oxford Academic Publishers Ltd., 1986)
