naming

Chapter 4: Naming

Table of Contents
- Concepts
- Locate mobile entities
- Garbage collection

4.1 Naming Entities
- What's the most important requirement for a name?
- Convenient and unique

Can you imagine the longest English name in the world? Take a breath:

Mr. Adolph Blaine Charles David Earl Frederick Gerald Hubert Irvin John Kenneth Lloyd Martin Nero Oliver Paul Quincy Randolph Sherman Thomas Uncas Victor William Xerxes Yancy Wolfeschlegelsteinhausenbergerdorffwelchevoralternwarengewissenschaftschaferswessenschafewarenwohlgepflegeundsorgfaltigkeitbeschutzenvonangreifeudurchihrraubgierigfeindewelchevoralternzwolftausendjahresvorandieerscheinenerscheinenvanderersteerdemenschderraumschiffgebrauchlichtalsseinursprungvonkraftgestartseinlangefahrthinzwischensternaitigraumaufdersuchenachdiesternwelchegehabtbewohnbarplanetenkreisedrehensichundwohinderneurassevonverstandigmenschlichkeitkonntefortpflanzenundsicherfeuenanlebenslanglichfreudeundruhemitnicheinfurchtvorangreifenvonandererintelligentgeschopfsvonhinzwischenternart Zeus igraum Senior

He was born in Munich in 1904 and lived in Philadelphia for most of his life. Apparently he shortened his name to Wolfeschlegelsteinhausenbergerdorff, and subsequently went by Hubert Blaine Wolfe, but the "Senior" indicates that he passed some form of his name to his son. The full version of the name of 590 letters appeared in the 12th edition of The Guinness Book of Records. He now lives in Philadelphia, Pennsylvania, U.S.A., and has shortened his surname to Mr. Wolfe + 585, Senior.


4.1 Naming Entities: Definition
- A name in a distributed system is a string of bits or characters that is used to refer to an entity.
- An access point is a special kind of entity that is used to access another entity.
- The name of an access point is called an address, e.g., a phone number.

4.1 Naming Entities: Identifiers
- An identifier uniquely identifies an entity:
  - An identifier refers to at most one entity.
  - Each entity is referred to by at most one identifier.
  - An identifier always refers to the same entity (i.e., it is never reused).
- Examples: phone number, passport number

4.1 Naming Entities
- Addresses and identifiers are normally represented in machine-readable form, i.e., as bit strings.
- A human-friendly name is generally represented as a character string, e.g., www.njust.edu.cn

4.1 Naming Entities: Name spaces
- Labeled (acyclic) directed graph
  - Leaf node: 0 outgoing edges
  - Directory node: n outgoing edges
  - Root node
- Path name
  - Form: N:<label-1, label-2, ..., label-n>
  - Absolute path name
  - Relative path name

Name Spaces (1): A general naming graph with a single root node.

Name Spaces (2): The general organization of the UNIX file system implementation on a logical disk of contiguous disk blocks.

4.1 Naming Entities: Name resolution
- Name resolution: the process of looking up a name.
  - N:<label-1, label-2, ..., label-n>
- Closure mechanism: knowing how and where to start name resolution.
  - 4340880245810362
  - /home/yangzhao/movie
  - $HOME
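Name resolution over a naming graph can be sketched in a few lines. This is only an illustration, assuming a graph stored as nested dictionaries; the `ROOT` graph, the `ENV` table, and the `resolve` function are hypothetical names, and the closure mechanism is reduced to "absolute names start at the root, relative names start at a given directory node".

```python
# Naming graph: a directory node maps edge labels to child nodes.
# Leaf nodes are plain strings standing in for the stored entity.
ROOT = {
    "home": {
        "yangzhao": {"movie": "contents of movie"},
    },
}

# Hypothetical environment; $HOME tells us where relative resolution starts.
ENV = {"HOME": "/home/yangzhao"}


def resolve(path, start=ROOT):
    """Follow the labels of an absolute or relative path name."""
    if path.startswith("/"):
        node = ROOT   # absolute path names start at the root node
    else:
        node = start  # relative path names start at a given directory node
    for label in path.split("/"):
        if label:  # skip empty parts produced by leading or double slashes
            node = node[label]  # KeyError if a directory has no such label
    return node


home_dir = resolve(ENV["HOME"])
print(resolve("/home/yangzhao/movie"))
print(resolve("movie", start=home_dir))
```

Both calls end at the same leaf node: the absolute name is resolved from the root, the relative name from the directory that $HOME refers to.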

4.1 Naming Entities: Name resolution (Linking)
- An alias is another name for the same entity, for example, an environment variable.
- Alias implementation
  - Hard links: multiple absolute path names refer to the same node
  - Symbolic links: a leaf node stores an absolute path name
- Why aliasing?
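The difference between the two alias implementations can be observed on a POSIX file system. The following is a small sketch, assuming a platform where `os.link` and `os.symlink` are permitted; the file names are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "keys")
    hard = os.path.join(d, "keys_hard")
    soft = os.path.join(d, "keys_soft")
    with open(target, "w") as f:
        f.write("secret")

    os.link(target, hard)     # hard link: a second name for the same node
    os.symlink(target, soft)  # symbolic link: a node that stores a path name

    # Both hard-linked names lead to the same node (same inode number).
    same_node = os.stat(target).st_ino == os.stat(hard).st_ino
    # The symbolic link only stores the path name of its target.
    stored_path = os.readlink(soft)

    os.remove(target)
    hard_survives = os.path.exists(hard)     # the node is still reachable
    soft_dangles = not os.path.exists(soft)  # the stored path no longer resolves

print(same_node, stored_path.endswith("keys"), hard_survives, soft_dangles)
```

Removing the original name leaves the hard link intact, while the symbolic link dangles because it refers to a name rather than to the node itself.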

Linking and Mounting (1): The concept of a symbolic link explained in a naming graph.

4.1 Naming Entities: Name resolution (Mounting)
- Let a directory node store the identifier of a directory node from a different name space, which we refer to as a foreign name space.
- Mount point vs. mounting point
  - Mount point: the directory node that stores the identifier
  - Mounting point: the directory node in the foreign name space
- Mounting implementation requires:
  - The name of an access protocol
  - The name of the server
  - The name of the mounting point in the foreign name space

Linking and Mounting (2): Mounting remote name spaces through a specific process protocol.

4.1 Naming Entities: Name resolution (Mounting)
- Example
  - nfs://flits.cs.vu.nl/home/steen
  - cd /remote
  - ls -l
- Transparent
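The example URL carries exactly the three names needed for mounting: access protocol, server, and mounting point. A minimal sketch of how they could be pulled apart with Python's standard `urllib.parse`; the function name `parse_mount_target` and the dictionary keys are assumptions made for this illustration.

```python
from urllib.parse import urlsplit


def parse_mount_target(url):
    """Split a mount target into protocol, server, and mounting point."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,       # name of the access protocol
        "server": parts.hostname,       # name of the server
        "mounting_point": parts.path,   # node in the foreign name space
    }


info = parse_mount_target("nfs://flits.cs.vu.nl/home/steen")
print(info)
```

A mount point such as /remote would store this information, so that resolving a name below /remote continues at the server in the foreign name space.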

4.1 Naming Entities: Name resolution (Global Name Service, GNS)
- Add a new root node and make the existing root nodes its children.
- /home/steen/keys => n0:/home/steen/keys

Linking and Mounting (3): Organization of the DEC Global Name Service.

4.1 Naming Entities: Name Space Layers
- Global layer
  - close to the root node, rarely updated
- Administrational layer
  - organizations, groups of entities
- Managerial layer
  - nodes for hosts
  - user-defined directories and files

A zone is a part of the name space that is implemented by a separate name server.

Name Space Distribution (1): An example partitioning of the DNS name space, including Internet-accessible files, into three layers.

4.1 Naming Entities
- Availability
- Performance
- Client-side caching
- Replication

Name Space Distribution (2): A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer.

| Item | Global | Administrational | Managerial |
|---|---|---|---|
| Geographical scale of network | Worldwide | Organization | Department |
| Total number of nodes | Few | Many | Vast numbers |
| Responsiveness to lookups | Seconds | Milliseconds | Immediate |
| Update propagation | Lazy | Immediate | Immediate |
| Number of replicas | Many | None or few | None |
| Is client-side caching applied? | Yes | Yes | Sometimes |

4.1 Naming Entities: Implementation of Name Resolution
- Local name resolver
  - Ensures that the name resolution process is carried out
- Iterative name resolution
- Recursive name resolution

root:Root server is assumed to be known.

Implementation of Name Resolution (1): The principle of iterative name resolution.

Implementation of Name Resolution (2): The principle of recursive name resolution.

Implementation of Name Resolution (3): Recursive name resolution. Name servers cache intermediate results for subsequent lookups.

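| Server for node | Should resolve | Looks up | Passes to child | Receives and caches | Returns to requester |
|---|---|---|---|---|---|
| cs | | # | -- | -- | # |
| vu | | # | | # | # # |
| nl | | # | | # # | # # # |
| root | | # | | # # # | # # # # |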

4.1 Naming Entities: Comments on recursive name resolution
- (-) puts a higher performance demand on each name server
- (+) caching results is more effective
- (+) communication costs may be reduced
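The trade-offs above can be made concrete with a toy simulation. This is a sketch under simplifying assumptions: the `SERVERS` table, the server names, and both functions are hypothetical, and a "message" is just a counter increment. The name resolved is solo.cs.vu.nl, written as a list of labels from the root downwards.

```python
# Hypothetical name servers: each one knows only its direct children.
# A value is either the next server to ask or, at the last label, an address.
SERVERS = {
    "root": {"nl": "nl-server"},
    "nl-server": {"vu": "vu-server"},
    "vu-server": {"cs": "cs-server"},
    "cs-server": {"solo": "130.37.21.1"},
}


def resolve_iteratively(labels):
    """The client's resolver contacts every name server itself."""
    current, client_messages = "root", 0
    for label in labels:
        client_messages += 2  # one request and one reply per server
        current = SERVERS[current][label]
    return current, client_messages


def resolve_recursively(labels, server="root", caches=None):
    """Each server resolves one label and passes the rest to the next server."""
    result = SERVERS[server][labels[0]]
    if len(labels) > 1:
        result = resolve_recursively(labels[1:], result, caches)
    if caches is not None:
        # The server caches the result for the part it was asked to resolve.
        caches.setdefault(server, {})[tuple(labels)] = result
    return result


name = ["nl", "vu", "cs", "solo"]
caches = {}
print(resolve_iteratively(name))
print(resolve_recursively(name, caches=caches))
print(caches["vu-server"])
```

In the iterative case the client exchanges messages with all four servers. In the recursive case it only talks to the root server, and every server on the path ends up with a cached result for its part of the name, at the price of doing more work per request.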

Implementation of Name Resolution (4): The comparison between recursive and iterative name resolution with respect to communication costs.

The DNS Name Space: The most important types of resource records forming the contents of nodes in the DNS name space.

| Type of record | Associated entity | Description |
|---|---|---|
| SOA | Zone | Holds information on the represented zone |
| A | Host | Contains an IP address of the host this node represents |
| MX | Domain | Refers to a mail server to handle mail addressed to this node |
| SRV | Domain | Refers to a server handling a specific service |
| NS | Zone | Refers to a name server that implements the represented zone |
| CNAME | Node | Symbolic link with the primary name of the represented node |
| PTR | Host | Contains the canonical name of a host |
| HINFO | Host | Holds information on the host this node represents |
| TXT | Any kind | Contains any entity-specific information considered useful |

DNS Implementation (1): An excerpt from the DNS database for the zone cs.vu.nl.

DNS Implementation (2): Part of the description for the vu.nl domain which contains the cs.vu.nl domain.

| Name | Record type | Record value |
|---|---|---|
| cs.vu.nl | NS | solo.cs.vu.nl |
| solo.cs.vu.nl | A | 130.37.21.1 |

The X.500 Name Space (1): A simple example of an X.500 directory entry using X.500 naming conventions.

| Attribute | Abbr. | Value |
|---|---|---|
| Country | C | NL |
| Locality | L | Amsterdam |
| Organization | O | Vrije Universiteit |
| OrganizationalUnit | OU | Math. & Comp. Sc. |
| CommonName | CN | Main server |
| Mail_Servers | -- | 130.37.24.6, 192.31.231, 192.31.231.66 |
| FTP_Server | -- | 130.37.21.11 |
| WWW_Server | -- | 130.37.21.11 |

The X.500 Name Space (2): Part of the directory information tree.

The X.500 Name Space (3): Two directory entries having Host_Name as RDN.

| Attribute | Value | Attribute | Value |
|---|---|---|---|
| Country | NL | Country | NL |
| Locality | Amsterdam | Locality | Amsterdam |
| Organization | Vrije Universiteit | Organization | Vrije Universiteit |
| OrganizationalUnit | Math. & Comp. Sc. | OrganizationalUnit | Math. & Comp. Sc. |
| CommonName | Main server | CommonName | Main server |
| Host_Name | star | Host_Name | zephyr |
| Host_Address | 192.31.231.42 | Host_Address | 192.31.231.66 |

4.2 Locating Mobile Entities
- Names
  - Human-friendly names
  - Addresses
  - Identifiers
- Traditional naming systems
  - Human-friendly names map to addresses
  - Both names and addresses can change

Naming versus Locating Entities
- Direct, single-level mapping between names and addresses.
- Two-level mapping using identities.
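The benefit of the two-level mapping shows up when an entity moves. A minimal sketch, assuming two dictionaries stand in for the naming service and the location service; the identifier `entity-42` and both addresses are invented for the example.

```python
# Hypothetical identifier and addresses, for illustration only.
name_to_identifier = {"www.njust.edu.cn": "entity-42"}    # naming service
identifier_to_addresses = {"entity-42": ["10.0.0.1"]}     # location service


def locate(name):
    """Two-level lookup: name -> identifier -> current addresses."""
    identifier = name_to_identifier[name]
    return identifier_to_addresses[identifier]


def move(identifier, new_address):
    """A move only touches the location service, never the name binding."""
    identifier_to_addresses[identifier] = [new_address]


before = locate("www.njust.edu.cn")
move("entity-42", "10.0.0.2")
after = locate("www.njust.edu.cn")
print(before, after)
```

With a direct name-to-address mapping, every name bound to the entity would have to be updated on each move; here the name binding stays untouched.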

4.2 Locating Mobile Entities: Locating an entity
- Broadcasting
  - A message with an entity ID is broadcast
  - Each machine checks whether it has the entity
  - Only the machine that has the entity replies
- Multicasting
  - A group of hosts receives the request
  - (+) locates the nearest replica

4.2 Locating Mobile Entities: Locating an entity
- Forwarding pointers
  - An entity moves from A to B
  - A reference to B is left at A
  - An entity is located by following the chain of forwarding pointers
  - (+) simple
  - (-) a chain can become very long => inefficient (space, time)
  - (-) a chain is easily broken

Forwarding Pointers (1): The principle of forwarding pointers using (proxy, skeleton) pairs.

Forwarding Pointers (2): Redirecting a forwarding pointer by storing a shortcut in a proxy.
- Sending the response directly to the initiating proxy or along the reverse path of forwarding pointers

If a process in a chain of (proxy, skeleton) pairs crashes: an object's home location always keeps a reference to its current location.
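Following a chain and then shortcutting it can be sketched as follows. This is a simplification that ignores proxies and skeletons: the `forward` table and the location names A, B, C are assumptions, with `None` meaning "the entity is here".

```python
# Forwarding pointers left behind as the entity moved A -> B -> C.
forward = {"A": "B", "B": "C", "C": None}


def follow_chain(start):
    """Follow forwarding pointers until the entity is found."""
    here, hops = start, 0
    while forward[here] is not None:
        here = forward[here]
        hops += 1
    return here, hops


def locate_and_shortcut(start):
    """Locate the entity, then redirect the first pointer to it."""
    location, hops = follow_chain(start)
    if start != location:
        forward[start] = location  # store a shortcut at the starting point
    return location, hops


first = locate_and_shortcut("A")
second = follow_chain("A")
print(first, second)
```

The first lookup needs two hops; after the shortcut is stored, the same lookup needs only one. If the process holding an intermediate pointer crashes before any shortcut exists, the chain is broken, which is why a home location is kept as a fallback.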

4.2 Locating Mobile Entities: Locating an entity
- Home-based approaches
  - Home location
    - keeps track of the current location of an entity
    - is normally chosen to be the place where the entity was created
  - (+) improves on the previous two approaches, which have scalability and performance problems
  - (-) the home location must always be contacted first
  - (-) the home location is fixed

Home-Based Approaches: The principle of Mobile IP.

4.2 Locating Mobile Entities: Locating an entity
- Hierarchical approaches
  - Multiple-tiered home-based approach
  - A network is divided into a collection of domains
  - A leaf domain is a lowest-level domain
  - dir(D): the directory node that keeps track of the entities in domain D
  - Root node: knows about all entities

Hierarchical Approaches (1): Hierarchical organization of a location service into domains, each having an associated directory node.

4.2 Locating Mobile Entities: Location record
- A location record for entity E in the directory node N for a leaf domain D contains the entity's current address in that domain.
- The directory node N' for the next higher-level domain D' that contains D will have a location record for E containing only a pointer to N.
- Likewise, the parent node of N' will store a location record for E containing only a pointer to N'.

Hierarchical Approaches (2): An example of storing information of an entity having two addresses in different leaf domains.

4.2 Locating Mobile Entities: Lookup operation
- A client wishing to locate an entity E issues a lookup request to the directory node of the leaf domain D in which the client resides.
- If the directory node does not store a location record for E, then E is currently not located in D.
- The request is forwarded to the directory node of D's parent domain (the next higher level), and so on.
- Once the request reaches a node M that stores a location record for E, then E is in dom(M).

Hierarchical Approaches (3): Looking up a location in a hierarchically organized location service.

4.2 Locating Mobile Entities: Locality
- The entity is searched for in a gradually increasing ring centered around the requesting client.
- The search area is expanded each time the lookup request is forwarded to the next higher-level directory node.
- The worst case is that the request reaches the root node.
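The location records and the lookup operation can be sketched together on a tiny domain tree. This is a simplified model, assuming one address per entity; the domain names (`north`, `south`, `D1` to `D3`), the `records` table, and the address string are hypothetical.

```python
# Hypothetical domain tree: each directory node points to its parent.
PARENT = {"root": None, "north": "root", "south": "root",
          "D1": "north", "D2": "north", "D3": "south"}
LEAVES = set(PARENT) - set(PARENT.values())  # directory nodes of leaf domains
records = {node: {} for node in PARENT}      # node -> {entity: child or address}


def insert(entity, leaf, address):
    """Store the address in the leaf and pointers in the nodes above it."""
    records[leaf][entity] = address
    child, node = leaf, PARENT[leaf]
    # Walk up until a node already knows the entity (or the root is passed).
    while node is not None and entity not in records[node]:
        records[node][entity] = child
        child, node = node, PARENT[node]


def lookup(entity, client_leaf):
    """Go up until a location record is found, then follow pointers down."""
    node, visited = client_leaf, [client_leaf]
    while entity not in records[node]:
        node = PARENT[node]
        if node is None:
            raise KeyError(entity)
        visited.append(node)
    while node not in LEAVES:
        node = records[node][entity]
        visited.append(node)
    return records[node][entity], visited


insert("E", "D3", "address-of-E")
print(lookup("E", "D3"))
print(lookup("E", "D1"))
```

A client in the same leaf domain as E finds the address in one step; a client in D1 has to climb to the root before the pointers lead it down again, which is the worst case noted above.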

Hierarchical Approaches (4)
- An insert request is forwarded to the first node that knows about entity E.
- A chain of forwarding pointers to the leaf node is created.

4.2 Locating Mobile Entities: Pointer Caches
- Caching is effective only if the cached data rarely changes.
- If a mobile entity E always moves within a domain D, then the path of pointers for entity E from the root node to dir(D) does not have to change.
- A reference to dir(D) can, in principle, be cached at every node along the path from the leaf node where the lookup was initiated.

Pointer Caches (1): Caching a reference to a directory node of the lowest-level domain in which an entity will reside most of the time.

4.2 Locating Mobile Entities: Pointer Caches
- dir(D) stores a pointer to
  - Originally: the subdomain where E currently resides
  - Possible improvement: the actual address of E directly
- Open questions
  - How to find the best directory node to store the current address of a mobile entity? Least upper bound
  - When to invalidate a cache entry?

Pointer Caches (2): A cache entry that needs to be invalidated because it returns a nonlocal address, while such an address is available.

4.2 Locating Mobile Entities: Scalability Issues
- The root may have to handle so many lookup and update requests that it becomes a bottleneck.
- Solution: partition the root node and other high-level directory nodes into subnodes.
  - Each subnode is responsible for handling the requests related to a specific subset of all the entities that are to be supported by the location service.
- Deciding which subnodes should handle which entities in very large-scale location services is still an open question.

Scalability Issues: The scalability issues related to uniformly placing subnodes of a partitioned root node across the network covered by a location service.

4.3 Removing Unreferenced Entities
- General concept about objects
  - If an object is referenced by some pointers, it can be accessed and used.
  - If an object can no longer be accessed, it should be removed.
- Explicitly or implicitly?
  - Language-dependent

4.3 Removing Unreferenced Entities
- It is always very hard to determine whether an entity is still referred to by someone, especially in a distributed system.
- Distributed garbage collectors
  - Assumption: an object can be accessed only if there is a remote reference to it.
  - An object for which no remote reference exists should be removed.

4.3 Removing Unreferenced Entities
- Is this true? Does having a remote reference to an object mean that the object will ever be accessed, or that the object is not garbage?

The Problem of Unreferenced Objects: An example of a graph representing objects containing references to each other.

4.3 Removing Unreferenced Entities: DGC (Distributed garbage collection)
- Requires network communication
  - Efficiency and scalability
  - Possible failures in communication, machines, or processes
- Several solutions
  - Reference counting
  - Reference listing
  - Tracing-based

Reference Counting: Simple Reference Counting
- Steps:
  - Each object stores its own reference counter in its associated skeleton.
  - When a process P creates a reference to a remote object O, it first installs a proxy and then requests that the counter in O be incremented by one.
  - When a remote reference is removed, the counter is decremented by one.
- Two problems

Reference Counting (1): The problem of maintaining a proper reference count in the presence of unreliable communication.

Reference Counting (2)
- Copying a reference to another process and incrementing the counter too late.
- A solution.

Reference Counting: Advanced Reference Counting
- Weighted reference counting
  - Each object has a fixed total weight.
  - When the object is created, the total weight is stored in its associated skeleton, along with a partial weight, which is initialized to the total weight.
  - When a new remote reference (p,s) is created, half of the partial weight stored in the object's skeleton is assigned to the new proxy p.

Reference Counting: Advanced Reference Counting
- Weighted reference counting
  - When a reference is removed, the partial weight of the removed reference is subtracted from the total weight of the object.

Advanced Referencing Counting (1)
- The initial assignment of weights in weighted reference counting.
- Weight assignment when creating a new reference.

Advanced Referencing Counting (2): Weight assignment when copying a reference.

Reference Counting: Advanced Reference Counting
- Weighted reference counting
  - Problem?
  - Only a limited number of references can be created.
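Weighted reference counting, including its limit, can be sketched in a single process. The classes below are only an illustration: messages are replaced by direct method calls, the default total weight of 128 is an arbitrary choice, and the "garbage when total equals partial weight" test is the usual formulation, stated here as an assumption.

```python
class Proxy:
    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.weight = weight  # partial weight held by this reference

    def copy(self):
        """Copying hands half of this proxy's weight to the new proxy."""
        if self.weight < 2:
            raise RuntimeError("partial weight exhausted")
        half = self.weight // 2
        self.weight -= half
        return Proxy(self.skeleton, half)

    def remove(self):
        """Removing subtracts this proxy's weight from the total weight."""
        self.skeleton.total -= self.weight
        self.weight = 0


class Skeleton:
    def __init__(self, total=128):
        self.total = total    # total weight of the object
        self.partial = total  # partial weight still held by the skeleton

    def new_reference(self):
        """Creating a reference hands out half of the skeleton's weight."""
        if self.partial < 2:
            raise RuntimeError("partial weight exhausted")
        half = self.partial // 2
        self.partial -= half
        return Proxy(self, half)

    def is_garbage(self):
        # No proxy holds any weight once total and partial weight are equal.
        return self.total == self.partial


s = Skeleton(128)
p1 = s.new_reference()  # skeleton keeps 64, p1 holds 64
p2 = p1.copy()          # p1 holds 32, p2 holds 32
p1.remove()
p2.remove()
print(s.total, s.partial, s.is_garbage())
```

Creating and copying references never contact the skeleton's counter, so no increment message can be lost or arrive late. The `RuntimeError` marks the point where a weight can no longer be halved, which is the limit named above.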

Advanced Referencing Counting (3): Creating an indirection when the partial weight of a reference has reached 1.

Reference Counting: Advanced Reference Counting
- Weighted reference counting
  - New problem?
    - Forwarding pointers: long chains degrade performance and are easily broken.
  - New solution: generation reference counting
    - G[i] in the skeleton denotes the number of outstanding copies for generation i.
    - If a proxy p is removed, it sends (k,n) to the skeleton: G[k] = G[k] - 1; G[k+1] = G[k+1] + n
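The bookkeeping of generation reference counting can be sketched as follows. It is a single-process illustration under stated assumptions: a proxy created directly from the skeleton has generation 0, a copy has the generation of its source plus one, each proxy counts the copies it made, and the object is considered garbage when every G[i] is zero.

```python
from collections import defaultdict


class Skeleton:
    def __init__(self):
        self.G = defaultdict(int)  # G[i]: outstanding copies of generation i

    def new_reference(self):
        self.G[0] += 1
        return Proxy(self, generation=0)

    def on_remove(self, k, n):
        """Handle the (k, n) message sent by a removed proxy."""
        self.G[k] -= 1      # one generation-k proxy is gone
        self.G[k + 1] += n  # it had created n proxies of generation k+1

    def is_garbage(self):
        return all(count == 0 for count in self.G.values())


class Proxy:
    def __init__(self, skeleton, generation):
        self.skeleton = skeleton
        self.generation = generation
        self.copies = 0  # number of copies made from this proxy

    def copy(self):
        # Copying does not contact the skeleton at all.
        self.copies += 1
        return Proxy(self.skeleton, self.generation + 1)

    def remove(self):
        self.skeleton.on_remove(self.generation, self.copies)


s = Skeleton()
p = s.new_reference()
q, r = p.copy(), p.copy()
q.remove()  # G[1] temporarily drops to -1: p has not reported its copies yet
p.remove()
r.remove()
print(dict(s.G), s.is_garbage())
```

Counters may become negative while removal messages are still outstanding, which is harmless: the object is only reclaimed once all entries are back at zero.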

Advanced Referencing Counting (4): Creating and copying a remote reference in generation reference counting.

Reference Listing
- Description
  - Instead of counting references, a skeleton maintains an explicit reference list of all proxies that point to it.
- More information, better service
  - Adding and removing a proxy are idempotent operations (operations that can be repeated without affecting the end result).
  - A process keeps sending the message to add its proxy and stops as soon as delivery has been acknowledged.
  - The skeleton knows which proxies are up and which are down.
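Why idempotence matters can be shown with a retry loop. This is a minimal sketch, assuming lost acknowledgements are simulated by a counter; the `Skeleton` class, `send_until_acked`, and the proxy id `p1` are hypothetical.

```python
class Skeleton:
    def __init__(self):
        self.references = set()  # explicit list of proxies pointing here

    def add(self, proxy_id):
        self.references.add(proxy_id)      # repeating this changes nothing
        return "ack"

    def remove(self, proxy_id):
        self.references.discard(proxy_id)  # repeating this changes nothing
        return "ack"


def send_until_acked(operation, proxy_id, lost_acks):
    """Resend a request until an acknowledgement gets through.

    The first `lost_acks` acknowledgements are treated as lost, so the
    skeleton executes the operation several times.
    """
    attempts = 0
    while True:
        attempts += 1
        operation(proxy_id)
        if attempts > lost_acks:
            return attempts


s = Skeleton()
adds = send_until_acked(s.add, "p1", lost_acks=2)
after_add = set(s.references)
removes = send_until_acked(s.remove, "p1", lost_acks=1)
print(adds, after_add, removes, s.references)
```

The add request is executed three times, yet the list contains the proxy once. A plain counter incremented three times would be wrong by two.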

Identifying Unreachable Entities: Tracing-based GC
- Mark-and-sweep
  - Mark phase: entities are traced by following chains of references originating from entities in the root set (3-color algorithm).
  - Sweep phase: remove the entities that have not been marked.
- Main drawback: stop-the-world synchronization is often not acceptable for DGC
- Possible improvement: incremental GC, but ...
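The two phases can be sketched on a small reference graph. This is a centralized, stop-the-world illustration that uses a plain set of marked entities rather than the 3-color algorithm; the graph and the entity names are made up.

```python
def mark_and_sweep(references, root_set):
    """Remove every entity that is not reachable from the root set.

    `references` maps each entity to the entities it refers to.
    Returns the set of entities that were swept.
    """
    # Mark phase: trace chains of references starting at the root set.
    marked = set()
    stack = list(root_set)
    while stack:
        entity = stack.pop()
        if entity not in marked:
            marked.add(entity)
            stack.extend(references.get(entity, ()))

    # Sweep phase: remove everything that has not been marked.
    garbage = set(references) - marked
    for entity in garbage:
        del references[entity]
    return garbage


graph = {
    "root": ["a"],
    "a": ["b"],
    "b": [],
    "c": ["d"],  # c and d refer to each other but are unreachable
    "d": ["c"],
}
swept = mark_and_sweep(graph, ["root"])
print(sorted(swept), sorted(graph))
```

The cycle formed by c and d is collected because tracing only cares about reachability from the root set; with reference counting, each of the two would keep the other's count above zero.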

Identifying Unreachable Entities: Tracing-based GC
- Tracing in groups (scalability)
  - GC takes place within groups through a combination of mark-and-sweep and reference counting.
  - A group is simply a collection of processes.
- Basic idea:
  - Collect all garbage within a group.
  - Consider a larger group that encompasses a number of subgroups which have just been cleaned up.

Identifying Unreachable Entities: Algorithm for collecting garbage within a group
1. Initial marking, in which only skeletons are marked.
2. Intraprocess propagation of marks from skeletons to proxies.
3. Interprocess propagation of marks from proxies to skeletons.
4. Stabilization by repetition of the previous two steps.
5. Garbage reclamation.
- A skeleton can be marked either soft or hard.
- A proxy can be marked none, soft, or hard.

Identifying Unreachable Entities
- A skeleton is hard: it is either reachable from a proxy in a process outside the group, or reachable from a root object inside the group.
- A skeleton is soft: it is reachable only from proxies inside the group.
- A soft skeleton can be changed into hard, but not the other way around.

Identifying Unreachable Entities
- A proxy is hard: it is reachable from an object in the root set.
- A proxy is soft: it is reachable from a skeleton that has been marked soft as well. A soft proxy cannot be changed into hard.
- A proxy is none: it is not reachable. A none proxy can be changed into soft or hard.

Identifying Unreachable Entities: Marking algorithm
1. A skeleton is marked either soft or hard, depending on whether it can be reached from a proxy outside the group.
2. Each process runs its own local garbage collector, which must propagate marks from skeletons to proxies within that process.
3. Marks are propagated between different processes. Soft marks do not have to be propagated since the first step already did this.

Tracing in Groups (1): Initial marking of skeletons.

Tracing in Groups (2): After local propagation in each process.

Tracing in Groups (3): Final marking.
