
What is a "Distributed"Data Processing System?Philip H. Enslow, Jr.Georgia Institute of Technology

Introduction

Words have only one purpose in a technical context: the transmission of information. When they fail to do that, they lead to confusion and misunderstanding. "Distributed data processing" and "distributed processing" are two phrases which illustrate that axiom. Like many other words in the lexicon of the computer professional, these have become cliches through over-use, losing much of their original meaning in the process. This paper is an attempt to reverse that trend.

A great many claims of improved performance (the table below presents only a partial list) are being made for "distributed data processing" systems by vendors as well as authors.1-4 Hardly anyone reasonably knowledgeable about the current state of the art in multiple-processor data processing systems would claim that major advances toward any significant number of these goals are possible with present technology. To fulfill such a claim, distributed data processing must be something new and must represent a large research area.

Claims made for "distributed" data processing systems.

* HIGH SYSTEM PERFORMANCE, FAST RESPONSE, HIGH THROUGHPUT
* HIGH AVAILABILITY
* HIGH RELIABILITY
* REDUCED NETWORK COSTS
* GRACEFUL DEGRADATION (FAIL-SOFT CAPABILITY)
* EASE OF MODULAR, INCREMENTAL GROWTH AND CONFIGURATION FLEXIBILITY
* RESOURCE SHARING
* AUTOMATIC LOAD SHARING
* HIGH ADAPTABILITY TO CHANGES IN WORK LOAD
* INCREMENTAL REPLACEMENT AND/OR UPGRADING OF COMPONENTS (BOTH HARDWARE AND SOFTWARE)
* EASY EXPANSION IN BOTH CAPACITY AND FUNCTION
* EASY ADAPTATION TO NEW FUNCTIONS
* GOOD RESPONSE TO TEMPORARY OVERLOADS

However, what is usually presented today as distributed data processing is just a warmed-over version of an old concept with a new sales word attached.

This paper discusses some of the essential characteristics for a new class of systems, still in the research stage, that will provide some of the benefits given in the table. We do not deprecate present work in improving systems for increased performance or reliability, even if it has been labeled "distributed data processing." Rather, we hope to introduce some precision of terminology and evaluation for this new area.

What is distributed?

At least four physical components of a system might be distributed: hardware or processing logic, data, the processing itself, and the control (such as the operating system). Some speak of a system that has any one of these components distributed as being a "distributed data processing system." However, a definition that is based solely on the physical distribution of some components of the system is doomed to failure. A proper definition must also cover the concepts under which the distributed components interact. Trivial examples abound of system organizations that exhibit physical distribution but are not considered distributed data processing systems; for example, physical distribution of input/output processing hardware and function. Similarly, there is certainly no distribution of processing functions if there is no distribution of processing hardware, and, conversely, a system that distributes hardware without distributing processing is difficult to imagine.


A working definition of a distributed data processing system adheres to the claim that "new" system designs will be required to achieve a major portion of the benefits cited in the table. Therefore, what is presented might be considered the "research and development" definition of distributed data processing systems. This definition has five components:

* A multiplicity of general-purpose resource components, including both physical and logical resources, that can be assigned to specific tasks on a dynamic basis. Homogeneity of physical resources is not essential.
* A physical distribution of these physical and logical components of the system, interacting through a communication network. (A network uses a two-party cooperative protocol to control the transfers of information.)
* A high-level operating system that unifies and integrates the control of the distributed components. Individual processors each have their own local operating system, and these may be unique.
* System transparency, permitting services to be requested by name only. The server does not have to be identified.
* Cooperative autonomy, characterizing the operation and interaction of both physical and logical resources.

These properties and operating characteristics are present in a number of systems to varying degrees, providing some of the benefits listed in the table. However, only the combination of all of the criteria uniquely defines distributed data processing systems.

The first essential ingredient

If there were not a multiplicity of assignable resources to provide services within the system, there could be no distribution of processing throughout the system. The modifier "general-purpose" has often been used to describe these "assignable" resources, such as general-purpose computer systems or general-purpose processors. However, it is not clear that this modifier is either necessary or sufficient. What is necessary is that the system have the ability to be dynamically reconfigured on a short-term basis with respect to those resources that provide specific services at any given time. This reconfiguration or reassignment of resources must be possible without affecting the operation of those resources not directly involved. If the purpose of the system is to provide service that requires a general-purpose processor, the system must have a multiplicity of general-purpose processors. If the purpose is to fulfill some other function, such as transaction processing or on-line control, the system could meet all of the five criteria while the assignable hardware resources might be considered special-purpose in the larger sense. However, they are general-purpose and reassignable within that system. A system typically has both general-purpose and special-purpose (fixed function) components, perhaps with some of the latter being dedicated (nonreassignable).

Many variations are possible, from the situation in which there is only one resource of each type, which does not meet the first criterion of the definition, to the other extreme in which every resource is replicated. Between these extremes are configurations having some single and some replicated resources. Since the management of freely assignable, multiple, duplicate resources within a system that is not totally replicated is almost as difficult as the situation with total replication, and may provide almost the same benefits, the general discussion that follows covers both types of systems.

The availability of multiple resources and the capability to utilize them effectively are essential to attaining the objectives of high availability, overall system reliability, and graceful degradation. These characteristics also directly contribute to flexibility, adaptability, and incremental expansion.

Physical distribution and intercommunication

Although there are many ways to define a network and many different aspects of it to consider, the most important one here is the transfer of messages. The physical transfer of messages through a network is an excellent example of cooperation between two physical resources. Such transfers follow a two-party protocol, in which the two parties must cooperate to successfully complete the transfer. This transfer mechanism is in contrast to that of a gated transfer, where the master has full authority to force a slave to physically accept a message. Under a two-party protocol, the destination can physically refuse the message by returning an indication such as NOT READY, BUSY, or NOT ACKNOWLEDGE. The unit of transfer in the system then is a message, not a signal. Gated or master-slave control of information transfer would preclude autonomous operation, discussed later, whereas a high degree of autonomy among all resources is essential to attain a high availability for the system.

To support the concept of autonomous operation of processes, the concept of "network" interconnection must be extended to logical as well as physical resources, with respect to the transfer of information such as status, requests for service, synchronization between logical resources, and so on.

The most important characteristic of a two-party cooperative protocol is that it allows any resource, either physical or logical, to refuse the transfer or acceptance of a message based on its own knowledge of its own status. This definition of a network establishes no criteria for the length of the interconnection paths of the network. They may be as short as a connection between two resources on a single integrated circuit chip. The essential defining characteristic is the utilization of a two-party protocol.
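The contrast between a gated (master/slave) transfer and a two-party cooperative transfer can be made concrete with a present-day sketch. The code below is illustrative only and is not drawn from the article; the Receiver class, its offer method, and the reply codes are hypothetical names chosen to mirror the NOT READY and BUSY responses described above.

```python
from enum import Enum, auto

class Reply(Enum):
    ACK = auto()         # message accepted
    NOT_READY = auto()   # receiver cannot take the message now
    BUSY = auto()        # receiver is occupied with other work

class Receiver:
    """A resource that decides, from its own status, whether to accept a message."""
    def __init__(self, buffer_limit=2):
        self.buffer = []
        self.buffer_limit = buffer_limit
        self.ready = True

    def offer(self, message):
        # Two-party protocol: the receiver may refuse based on its own state.
        if not self.ready:
            return Reply.NOT_READY
        if len(self.buffer) >= self.buffer_limit:
            return Reply.BUSY
        self.buffer.append(message)
        return Reply.ACK

def send(receiver, message, max_attempts=3):
    """Sender side of the cooperative transfer: it can only offer, never force."""
    for _ in range(max_attempts):
        if receiver.offer(message) is Reply.ACK:
            return True
        # A gated (master/slave) transfer would simply overwrite the slave's
        # buffer here; a cooperative sender must back off, retry, or give up.
    return False

if __name__ == "__main__":
    r = Receiver(buffer_limit=1)
    print(send(r, "status request"))   # True: accepted
    print(send(r, "another message"))  # False: refused (BUSY) on every attempt
```

The unit exchanged is a whole message, and the outcome is negotiated by both parties, which is the property the definition requires.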


Unity of system operation

To integrate the physical and logical components of a distributed data processing system into a functioning whole, the concept of a high-level operating system must be implemented. The various processors may have individual operating systems, but a well-defined set of policies must govern the integrated operation of the total system. The mechanisms implementing these policies may be uniquely identifiable within each individual operating system, or they may be merely logical extensions of the local policies. There must be no strong hierarchy between the high-level operating system and the local operating system for an individual physical resource, since that would violate the criterion of autonomous operation. Also, local operating systems need not be homogeneous, although the existence of a variety of interfaces certainly complicates the system design problem.

As emphasized by another author, "[System] software makes the difference between a cluster of computers that act independently and that same cluster integrated into a smooth-working, cost-effective distributed data processing operation."5

A distributed system has multiple loci of control as well as processing activity. These multiplicities, as well as dynamic changes in the loci of control, must both exist to meet all of the five criteria of our definition. Additionally, the physical binding of the loci of control or processing activities must be minimized. The nucleus of the operating system essential for each processor must be minimized, and the loci of overall system control must be dynamic during runtime. The degree of permanent binding must be minimized, and the system must have no "critical paths" or "critical components," such as a permanently bound single copy of a resource, including global state information. The performance of the system must not be catastrophically affected or even seriously degraded by the failure or overload of a resource that has a permanently assigned system control function. A similar comment applies to processing activities, although there will be instances in which the requirement for a specialized resource might require permanent binding of a processing function; for example, a special transfer or data base or a unique I/O device.

Other properties that must characterize the high-level operating system are the support of an effective transfer bandwidth between processors larger than that found in simple communicating computers; the ability of a processor to continue to operate and effectively utilize its local resources when disconnected from the network, and to be easily reconnected and reintegrated into the system without affecting other operations in progress; and the support of process and processor interactions at a "fine-grain" level when appropriate.

The high-level operating system, whether it exists as distinct and identifiable blocks of code or only as a design philosophy, converts the available resources from a mere collection of hardware into a functioning, coherent system. The design of the high-level operating system is obviously the key factor in overall system design, and it presents a number of problems that will require extensive further research to solve. Several of these aspects of the high-level operating system are discussed later in this article.

System transparency

If any system is to provide the capabilities listed above, the interface presented to the user must be one of services and not servers. The user must be able to request an action by specifying what is to be done and not be required to specify which physical or logical component is to provide that service.

An important characteristic of the distributed system is that its existence is totally transparent to the user, unless he wishes to take cognizance of it for specific reasons of efficiency. He should be able to develop routines and programs and handle data bases just as if he were communicating with a single, centralized system. In fact, the user interface to a distributed system must be even simpler than those of present centralized systems. This results from the fact that the user communicates with the high-level operating system; one function of a portion of the control software is to deal with all varieties of command languages and data definitions present in the system. Initial work on nonhomogeneous network systems has provided some of the strongest motivations for better (even standardized) operating system command languages. Because the organization of the system can be totally transparent to the user, system status information can be hidden from him, so that it would be impossible for him to know what resources are currently available or which would be the best to execute his task. Therefore, he must be able to request services by the name of the service and not by the designation of the server. If he wants a virtual uniprocessor, he should be able to get it. If another application would be better served by designated resources, that user should also be able to request that form of service.

The nature of the user's interface with a distributed system dictates one of its most important characteristics: the services provided by a distributed processing system could be provided by a single-processor system if the necessary hardware existed and could be so configured. However, such a single system would not provide the other benefits of a distributed data processing system, such as reliability, adaptability, and modularity.
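As a rough, present-day illustration of requesting services by name rather than by server, consider the sketch below. The registry, the request function, and the server names are hypothetical constructs, not part of any system described in the article; the point is only that the caller names a service and some component of the high-level operating system selects the server.

```python
import random

# A hypothetical service registry: service name -> servers able to provide it.
REGISTRY = {
    "compile": ["processor-a", "processor-c"],
    "sort":    ["processor-b"],
}

def request(service_name, payload):
    """Request a service by name only; the caller never designates a server."""
    servers = REGISTRY.get(service_name)
    if not servers:
        raise LookupError(f"no component offers service {service_name!r}")
    server = random.choice(servers)   # stand-in for a real selection policy
    return f"{service_name}({payload}) executed on {server}"

if __name__ == "__main__":
    print(request("compile", "payroll-source"))
    print(request("sort", "master-file"))
```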


Cooperative autonomy

This is the final, and perhaps most important, essential ingredient. A distributed data processing system must be designed so that the operations of all components or resources, both physical and logical, are very highly autonomous. At the physical level, this may be accomplished by the use of a network transmission protocol in which the transmission of a message requires cooperative actions by both the sender and the receiver. At the logical level, between processes, the same degree of cooperation must exist. Further, any resource must be able to refuse a request for service, even after it has accepted the physical message. This is a result of the fact that there is no hierarchy of control within the system.

This is not anarchy. All components follow a "master plan," which is reflected in the philosophy of the high-level operating system. This mode of operation should be described as cooperative autonomy rather than simply autonomy. A high degree of autonomy among all components is essential to attaining many of the benefits listed in the table, and this characteristic of system operation and component interaction will result only from meeting all of the five criteria of the definition.

Some excluded systems

Considerable work has been done on new system designs to achieve subsets of these benefits, but very few systems have made substantial progress toward meeting all of the criteria. Perhaps the most widely known of these is Arpanet; however, only the communication subsystem of that network qualifies in this respect. Many other systems, some of which are discussed elsewhere in this magazine, have made substantial improvements in subsets of the areas of system performance; examples are the Honeywell Experimental Distributed Processor, the Cm* system at Carnegie-Mellon University, Mininet at the University of Waterloo, and ICOPS at Brown University. However, the number of systems mislabeled as distributed data processing systems far exceeds these.

Most of the criteria contained within the definition are met by crossing a threshold on a particular dimension. The definition is not a set of binary criteria, and a better understanding of these criteria and their thresholds can be obtained by considering some systems that are excluded by the definition.

It excludes, for example, distribution within a single mainframe. One writer has characterized the architecture of several of the modern processor systems that include independent I/O channels as "incorporating distributed processors since [it] contains separate I/O processors, arithmetic logic processors and possibly diagnostic processors."6 Such a categorization has little utility and has not found very wide acceptance. Obviously, there is a permanent binding of tasks to the various components in this type of system organization.

A front-end processor that controls communication with a mainframe definitely does not constitute the type of distributed system defined here.7 Although it may meet some of our criteria, it also is dedicated to one function and is not freely assignable.

Many instances of a master/slave relationship occur in both hardware and software control. The key point is that the recipient of the information transferred, be it data or a control signal, cannot decide whether or not to accept the transfer and act upon it. When this concept is implemented in hardware, it is often referred to as gated transfer. In software control systems the master/slave relationship is quite commonly encountered in multiple-computer and basic multiprocessor operating systems.

The continued decline in the price of hardware has made more and more attractive new multiple-processor system organizations incorporating specialized functional units, such as a vector multiplier, a floating-point arithmetic unit, or a fast Fourier transform unit. In the general concept of operation, such dedicated-function processing is only slightly different from a master/slave relationship, and the master/slave control relationship likewise excludes many hardware systems containing multiple general-purpose processing units from our definition. What causes some of the terminology confusion with these configurations is that these specialized services are often provided by a general-purpose unit, such as a programmable microprocessor. The functional unit may be "specialized" by a microprogram, or it may be completely general but utilized in a dedicated functional role, such as a minicomputer used to control input/output in a larger system. The distinguishing characteristic of this class of excluded systems is the dedication of the resource to a single function or a fixed set of functions. It operates in a master/slave mode, as far as the control over its own activities is concerned. The criteria of both free assignment and autonomy are violated.

There is wide agreement (except perhaps among marketing and advertising people) that a single host processor with a collection of remote terminals that simply collect and transmit data does not qualify as a distributed data processing system, even if the terminals are intelligent and do some editing and formatting.

Even the presence of multiple hosts in a complex network interconnection structure does not necessarily make the system distributed. It may be distributed from the point of view of switching; but from the point of view of overall operations and control, it usually is centralized. Systems such as these do not have the capability for dynamic reallocation or reassignment of tasks in the event of hardware failure.

Intelligent terminal systems are most often presented as distributed processing systems in advertising copy. However, the operation of a system with intelligent terminals or local processors has to be studied carefully to determine to what extent the processing is actually distributed.


Such a system (several are commercially available) consists of several terminals connected to a local processor that has secondary storage capabilities, such as disks or cassettes. It offers intelligent data entry: field editing and similar functions executed in the local processor through the execution of a program stored there. It has shared file access, but only to local files. It communicates with a main processor, but to do so, the local processor must emulate a "dumb" terminal in order to use normal protocols. Finally, it is capable of remote job entry. There is no indication of any distribution of the control function, for the distribution of work is fixed and a local terminal cannot affect it.

A terminal with a resident text editor, whether it is provided by hardware or software, is not an example of a distributed data processing system. In order to meet the definition, the terminal must be "smart" enough, first, to do some real work, and second, to recognize when it cannot accomplish its assigned work and to pass it on to another appropriate service unit. The simple off-loading of work to a higher level when this level is fully utilized is just the beginning of the transition to fully distributed processing. If the terminal coordinates several concurrent and simultaneous remote jobs, giving each a different type of service at a different location, without human intervention, then it more closely resembles a distributed system. The threshold is reached when the local control system can decide whether work should be done locally or passed on to the rest of the system, basing its decision on an analysis of local workloads and capabilities. Distributed processing is definitely not equated with merely "moving equipment to the periphery of a business system to capture and process data at the source."4
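The threshold just described, a local control element deciding whether to do the work itself or pass it on, can be sketched in a few lines. The capability set, the load threshold, and the decide routine below are invented for illustration and do not describe any particular terminal system.

```python
LOCAL_CAPABILITIES = {"edit", "format", "validate"}   # functions the terminal can do itself
LOCAL_LOAD_THRESHOLD = 0.8                            # fraction of local capacity in use

def decide(task, local_load):
    """Decide locally whether to run a task here or forward it to the rest of the system."""
    if task not in LOCAL_CAPABILITIES:
        return "forward: function not available locally"
    if local_load > LOCAL_LOAD_THRESHOLD:
        return "forward: local processor overloaded"
    return "execute locally"

if __name__ == "__main__":
    print(decide("edit", local_load=0.30))      # execute locally
    print(decide("edit", local_load=0.95))      # forward: local processor overloaded
    print(decide("compile", local_load=0.10))   # forward: function not available locally
```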

Perhaps the intelligent terminal does have a role to play in the development of distributed processing systems. It may facilitate a painless transition to more decentralized organizations for hardware and data storage as well as control. This is accomplished by adding features to the local system and making other modifications that increase the local functions, prior to establishing higher-level system connections and a complete build-up of global functions.

Dimensions that characterize distribution

From the point of view of system implementation, three "dimensions" can be used to characterize the degree of decentralization of a system, as well as to define distributed systems. These three dimensions represent decentralization in hardware organization, control point organization, and data base organization.

Several points can be identified along the dimension of hardware organization. In order of increasing decentralization, they are as follows:



* A single central processor unit with one control unit, one arithmetic/logic unit, and one central memory.
* Multiple execution units: systems containing one control unit, multiple identical ALU's (for example, the processing elements in the Illiac IV), and, perhaps, multiple independent central memories.
* Separate specialized functional units: systems consisting of one general-purpose control unit and multiple ALU's or processing units. Of the latter, some may be specialized units, such as channels or input/output processors, fast Fourier transform processors, vector units, or floating-point arithmetic units. Additional control units may be associated with the specialized functional units, but these are fixed- or limited-capability control units. Meanwhile, some of the additional ALU's may be identical general-purpose units.
* Multiple processors: systems consisting of multiple control units, multiple ALU's (general-purpose units), and, perhaps, multiple independent central memories, but only one coordinated input/output system.
* Multiple computers: each a general-purpose CPU with control unit, ALU, central memory, and input/output system.

The five general criteria exclude all but the last two of these organizations from the class designated as distributed systems.

Similarly, control organization is a dimension, delineated by the following points, also in order of increasing decentralization:

* Single fixed control point, either physically or conceptually.
* A fixed master/slave relationship. There may be multiple levels of master/slave relationships, and the relationships between the slaves may be nonsymmetric. The important point is that the master/slave relationship is fixed until modified by totally outside actions.
* A dynamic master/slave relationship, modifiable by software.
* Multiple (perhaps replicated) loci of system control operating totally autonomously. An example is two separate computers interacting only at the I/O level through the transfer of complete files.
* Multiple control points cooperating on the execution of a task that has been fragmented into subtasks.
* Replicated, identical control points cooperating on the execution of a task.
* Multiple control points, not necessarily homogeneous, fully cooperating on the execution of a task.

The region excluded from distributed processing along this dimension consists of the first four entries.

The characterization or ordering of points along the axis of the data base dimension is not so simple. The proper ordering of points along this axis is not apparent. We have attempted an ordering here; however, the use and maintenance of various organizations of distributed data bases are not yet fully understood.

The data base itself has two components that may be distributed: the files and the directory cataloging those files. Either one of these can be distributed without regard to the other, although some combinations are obviously meaningless or impractical. In addition to the issue of distribution, there is also the aspect of replication of the files and/or directory. The points in our list, given in order of increasing "distribution," refer to concepts that have actually been implemented or have been seriously considered for implementation:

* Centralized data base, including a single copy of both files and directory, held in secondary storage.
* The same, held in primary memory.
* Distributed files with a single centralized directory, and no local directory. All accesses have to be processed through the single central directory.
* Replicated data base, with a complete copy of all files and a directory of the system at each processing node.
* Partitioned data base, consisting of the data most directly required at each node kept at that node, with copies as required at other nodes, plus a complete duplicate copy of all files at the master node. Under this arrangement any transaction causing an exception at a local node is processed by the master.
* Partitioned data base, with local data (files and directory) being retained at the processing nodes, but not duplicated at the master node. However, the master node maintains a complete directory for the routing of transactions that cause local exceptions.
* Partitioned data base with no master file or directory: no single source of information as to the location of a specific file.

The first three of these data base organizations exhibit one common characteristic. They all have a single entry point, a single copy of a centralized directory, for any use of the data base. Obviously, this situation directly affects the vulnerability, reliability, and throughput of the system.

Another possible way to characterize the data base is to consider the naming and the storing of the data as distinct dimensions. Again, several combinations are possible, such as a common name space (file catalog) and distributed storage of the data itself, contrasted with maintenance of totally separate name spaces on a single shared storage device.


Within the definition given previously, and taking into account the exclusion of various regions in the three dimensions, a class of systems that are distributed can be represented in three-dimensional space as shown in the figure.8

High-level operating systems

The high-level operating system is a key ingredient in the distributed data processing system. Its design must take into account several characteristics and problems.

The classical design for operating systems, as it has developed, assumes the availability of a large amount of system information. Although the completeness and validity of information about the work being presented by the user is questionable, the operating system is usually assumed to have access to complete and accurate information about the environment in which it is functioning. This is not the case in a distributed data processing system; complete information about the system will never be available. The resources provide a service, but they may, either intentionally or unintentionally, shield information from outside inspection.

In distributed systems, there will always be a time delay in the collection of information about the status of the system components. The ramifications of these time delays are extremely important. In a conventional centralized processor, the operating system can request status information, being assured that the interrogated component will not change state while awaiting a decision based on that status information, since only the single operating system asking the question may give commands. In a distributed data processing system, the time lags that occur can become significant; as a result, inaccurate (badly out-of-date) information can be transmitted because the autonomous component proceeds along its own path. If you have ever worked with input/output device handlers, you surely have often wondered whether or not the information that has been obtained is accurate. For distributed data processing systems, it will be essential to raise the degree of paranoia of the system designer to a much higher level than for centralized systems.

[Figure: Dimensions characterizing distribution. A three-dimensional plot whose axes are hardware decentralization (single CPU; multiple execution units; separate specialized functional units; multiple processors; multiple computers), control decentralization, and data base decentralization (single copy in secondary storage; single copy in primary storage; distributed files with a single central directory; complete replication; partitioned data base with central master copy; partitioned data base with central master directory; partitioned data base with no master file or directory). The allowable region for distributed data processing systems occupies the most decentralized extremes of all three axes.]

The system must be designed to work even with erroneous or inaccurate status information.

A further complication with regard to the available system information is the possibility of variations in the information presented to different system controllers. These variations may be a result both of time delays and of differences in the shielding of information from different controllers. As Le Lann has observed, "This absence of uniqueness, both in time and in space, has very important consequences."9
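A present-day sketch of how a controller might guard against out-of-date status reports is given below. The staleness bound and the record layout are assumptions for illustration, not a prescription from the article; the point is only that a status report carries its age and must be treated as suspect once that age grows too large.

```python
import time

MAX_STATUS_AGE = 0.5   # seconds after which a report is treated as suspect (assumed value)

class StatusReport:
    def __init__(self, component, state):
        self.component = component
        self.state = state
        self.timestamp = time.time()   # when the component reported, not when we read it

def usable(report, now=None):
    """A controller must assume a component may have changed state since it reported."""
    now = time.time() if now is None else now
    return (now - report.timestamp) <= MAX_STATUS_AGE

if __name__ == "__main__":
    report = StatusReport("processor-b", "IDLE")
    print(usable(report))      # True: fresh enough to act on
    time.sleep(0.6)
    print(usable(report))      # False: the decision must be re-validated
```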

General control problems

High-level operating systems as described here are highly nonhierarchical; that is, they are single-level and have no internal master/slave relationships. This characteristic, combined with component autonomy, greatly exacerbates the control problems. Even if autonomous multiple components are cooperating, the probability of simultaneous conflicting actions is much higher than in hierarchical systems. Also, synchronizing the actions of the various controllers in the system is much more difficult, because of the presence of appreciable time lags. Finally, the problem of deadlocks or infinite cycles within the system is quite different from that associated with other systems. Some proposals call for an umpire (an outside third party) to solve this problem; however, such an umpire would have to be transient, since the presence of a permanent umpire would denote an unacceptable degree of hierarchical control.

From the operating characteristics of the distributed processing system, some conclusions can be drawn about the nature of system communication. The second criterion of our definition requires a message-type protocol for all transfers, both physical and logical, in both interprocess and interprocessor communications. There must be no global variables and there must be no tunneling across system components. All parameters must be passed across well-defined and rigidly enforced interfaces.

Much of the work done on communication in uniprocessor and multiprocessor environments is applicable, but extensions to the solutions found there are definitely required to cope with the autonomous nature of the system components in the distributed system.10

The user must communicate with the system by directives containing service names only. Our criterion of system transparency makes unnecessary, and perhaps impossible, the user designation of the system component offering a desired service. However, this requirement introduces new problems of system failure and user error detection, since no one processor can establish whether the service requested can be provided anywhere in the system, or even whether it is legal.

Resource management in a distributed processing system is a multidimensional job. Thus far, very little work has been done on the aspects of resource management that apply specifically to distributed processing systems. However, low-level functions are quite similar to those performed on uniprocessors; they include physical resource allocation and management of those facilities required by a process after it has been scheduled on a particular system component. Before that can be done, however, the required resources may have to be assembled at one location, or linkage mechanisms established so that they can be used remotely. The problems that have to be addressed in that process are locating the resources, determining which components are suitable, and determining the best way to move the resources to the selected location. At an even higher level is the scheduling problem, determining when a function should be initiated or terminated.

Any system exhibiting nonhierarchical, autonomous control presents completely new problems in system scheduling. A request for service in a nonhierarchical system might well result in an initial denial of that service by all physical resources. In that instance, the requesting entity might initiate an evaluation of relative priorities between the new request and currently executing tasks, followed perhaps by bidding (priority adjustments) and preemption. The efficient execution of this procedure is one of the most important functions of the high-level operating system.

When all of these problems and their possible solutions are compared to similar problems and solutions encountered in uniprocessor systems, the major factor exacerbating the distributed system control problem is seen to be communication within the distributed data processing system, which is asynchronous with respect to the detailed execution of the functions and which exhibits time lags in addition to the communication processing time itself. Uniprocessors cope with many of these problems with semaphores, flags, lockout gates, or timeouts. To attempt to do this in a reasonably complex distributed system requires too much time, in the sense that such practices greatly reduce the throughput rate of the system. Bear in mind that the transit time for the signals transmitting the semaphores is on the order of 100 milliseconds. In addition to the lowering of performance, the reliability and the robustness of most of the uniprocessor solutions are in doubt, since a system operation such as TEST and SET cannot be replicated as a single indivisible machine-level instruction that can be executed immediately on the next machine cycle.

The problem of time is further complicated by the fact that most of the procedures, such as voting and software synchronization, which have been presented as solutions to the difficulties introduced by transit time, require even more processing by every component in the system.11-13

Conclusions

Distributed data processing systems are a new class of organizations and operations that exhibit a high degree of distribution in all dimensions, as well as a high degree of cooperative autonomy in their overall operation and interaction. Numerous systems have been designed that meet one or another of the criteria in the definition for distributed data processing; however, a large number of research issues must still be solved before we can implement a true and complete distributed data processing system.

Acknowledgments

I cannot accept the credit for many of the ideas and finer points of discussion in this article; however, I do accept the responsibility for the manner in which they have been combined and presented as a somewhat controversial definition of a new class of systems. Space does not permit me to acknowledge individually all of those who have contributed, but I would like to take special note of the participants in the Workshops on Distributed Data Processing, as well as other individuals who have taken the time to discuss these definitional issues with me at length.

References

1. G. A. Champine, "Six Approaches to Distributed Data Bases," Datamation, May 1977, pp. 69-72.
2. Dixon R. Doll, "Relating Networks to Three Kinds of Distributed Function," Data Communications, March 1977, pp. 37-42.
3. General Automation, "Distributed Processing Networks," advertising brochure, March 1977.
4. Arthur Lynch, "Distributed Processing Solves Mainframe Problems," Data Communications, December 1976, pp. 17-22.
5. Stephen A. Kallis, Jr., "Networks and Distributed Processing," Mini-Micro Systems, March 1977, pp. 32-40.
6. Jules H. Gilder, "Distributed Processing: Keyword for Tomorrow's Supercomputers," Computer Decisions, April 1976, p. 14.
7. Foster Brown, "There Ain't No Free Lunch," Computer Decisions, April 1977, pp. 46 and 48.
8. P. H. Enslow, Jr., "What Does 'Distributed Processing' Mean?" Distributed Systems, Infotech State of the Art Report, Maidenhead, England, 1976, pp. 257-272.
9. Gerard Le Lann, "Distributed Systems - Towards a Formal Approach," 1977 IFIP Congress Proc., pp. 155-160.
10. E. Douglas Jensen, "Distributed Processing in a Real-Time Environment," Infotech State of the Art Report on Distributed Systems, pp. 303-318.
11. Peter A. Alsberg and John D. Day, "A Principle for Resilient Sharing of Distributed Resources," Center for Advanced Computation, University of Illinois, Urbana, Illinois; prepared for the Brown University Workshop on Distributed Processing, August 1976.
12. Enrique Grapa and Geneva G. Belford, "Techniques for Update Synchronization in Distributed Data Bases," Center for Advanced Computation, University of Illinois, Urbana, Illinois.
13. Robert H. Thomas, "A Solution to the Update Problem for Multiple Copy Data Bases Which Uses Distributed Control," Bolt Beranek and Newman, Cambridge, Mass.; prepared for the Brown University Workshop on Distributed Processing, August 1976.

Philip H. Enslow, Jr., is an associate professor in the School of Information and Computer Science, Georgia Institute of Technology. His primary interests are in the areas of computer organization and operating systems. Prior to joining Georgia Tech in 1975, he was chief of the Information Sciences Branch of the U.S. Army European Research Office.

A member of the IEEE and ACM, Enslow is the Computer Society's Publications Committee chairman and editor-in-chief and a member of its Board of Governors. His BS is from the United States Military Academy; his MS and PhD degrees are from Stanford University.
