Naming Chapter 4. Naming Names are used to share resources, to uniquely identify entities, or to...

Naming

Chapter 4

Naming

• Names are used to share resources, to uniquely identify entities, or to refer to locations.

• Name resolution is used for a process to access the named entity.

• Three issues are covered in this chapter:
– Naming entities: the organization and implementation of human-friendly names
– Locating mobile entities
– Removing unreferenced entities

Naming

• A name in a distributed system is a string of bits or characters that is used to refer to an entity.

• An entity in a distributed system can be practically anything, including resources, processes, users, mailboxes, newsgroups, Web pages, and so on.

• To operate on an entity, we need to access it at an access point. The name of an access point is called an address. The address of an access point of an entity is the address of that entity.

• A location-independent name for an entity E is independent of the addresses of the access points offered by E.
– For example, www.yahoo.com is a single name for the Web service, independent of the addresses of the different Web servers.

Naming

• A true identifier is a name having the following properties:
– Each identifier refers to at most one entity.
– Each entity is referred to by at most one identifier.
– An identifier always refers to the same entity (prohibits reusing an identifier).

• In many computer systems, addresses and identifiers are represented in machine-readable form only.

• Human-friendly names are tailored to be used by humans and generally represented as a character string.

Name Space

• Names in a distributed system are organized into a name space.

• A name space can be represented as a labeled, directed graph with two types of nodes:
– A leaf node represents a (named) entity.
– A directory node is an entity that refers to other nodes.

• A directory node contains a directory table of (edge label, node identifier) pairs. Example: (‘home’, n1), (‘keys’, n5)

Name Space

• A node which has only outgoing and no incoming edges is called the root (node).

• Each path in a naming graph can be referred to by the sequence of labels corresponding to the edges in the path, such as

N:<label-1, label-2, …, label-n> where N refers to the first node in the path.

• If the first node in a path name is the root of the naming graph, it is called an absolute path name. Otherwise, it is called a relative path name.
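The directory-table view of a naming graph can be sketched in a few lines of Python. This is only an illustration: the node identifiers (n0, n1, n4, n5, n6) and the graph layout are assumptions chosen to resemble the path names used on these slides, and `resolve` and `resolve_path` are hypothetical helper names.

```python
# Each directory node holds a directory table of (edge label -> node identifier).
# The layout below is an assumed example, not taken from a specific figure.
GRAPH = {
    "n0": {"home": "n1", "keys": "n5"},   # root directory node
    "n1": {"steen": "n4"},
    "n4": {"mbox": "n6", "keys": "n5"},   # second path to n5
}
LEAVES = {"n5": "key data", "n6": "mailbox data"}  # leaf nodes hold entities


def resolve(start, labels):
    """Resolve N:<label-1, ..., label-n>, starting at node `start`."""
    node = start
    for label in labels:
        table = GRAPH.get(node)
        if table is None:
            raise LookupError(f"{node} is a leaf node, cannot look up {label!r}")
        if label not in table:
            raise LookupError(f"no edge {label!r} leaving {node}")
        node = table[label]
    return node


def resolve_path(path, cwd="n0", root="n0"):
    """Absolute path names start at the root, relative ones at `cwd`."""
    labels = [part for part in path.split("/") if part]
    start = root if path.startswith("/") else cwd
    return resolve(start, labels)
```

With this graph, `resolve_path("/home/steen/keys")` and `resolve_path("/keys")` reach the same node, which shows how one node can have several path names.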

Name Space

• A global name is a name that denotes the same entity no matter where that name is used in a system (e.g., an absolute path name).

• A local name is a name whose interpretation depends on where that name is being used (e.g., a relative path name).

• Example of path names in file systems (refer to Figure 4-1):
– N0:<home, steen, mbox> corresponds to /home/steen/mbox
– /home/steen/keys = /keys (the same node can be represented by different path names)

• A name space can be organized in various ways: as a tree or as a directed acyclic graph.

Name Spaces

A general naming graph with a single root node.

Name Space

• We can easily store all kinds of attributes in a node, describing aspects of the entity the node represents:
– Type of the entity
– An identifier for that entity
– Address of the entity's location
– Nicknames

• Directory nodes can also have attributes, besides just storing a directory table with (edge label, node identifier) pairs.

Name Spaces

• The general organization of the UNIX file system implementation on a logical disk of contiguous disk blocks:
– Boot block: loaded when the system is booted
– Super block: information on the entire file system
– Inode: information on the location of the data of its associated file
– Disk block: file data blocks

Name Resolution

• The process of looking up a name is called name resolution.

• Problem: To resolve a name we need a directory node. How do we actually find that (initial) node?

• Knowing how and where to start name resolution is called a closure mechanism. The mechanism
– selects the implicit context from which to start name resolution;
– determines how name resolution should proceed.

• Examples:
– www.cs.wichita.edu: start at a DNS name server
– /home/john/box: start at the local NFS file server (possibly a recursive search)
– 316-978-3156: dial a phone number
– 156.26.1.30: start at a WSU router

Linking and Mounting

• An alias is another name for the same entity.

• The hard link approach allows multiple absolute path names to refer to the same node in a naming graph.

• The symbolic link approach creates a leaf node that stores the path name of the entity instead of the entity itself.

• Name resolution can also be used to merge different name spaces. Mounting is one way of merging different name spaces.

Linking and Mounting

The concept of a symbolic link explained in a naming graph.
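The difference between a hard link and a symbolic link can be made concrete with a small sketch. The graph below is an assumed example: `/keys` is a direct edge to node n5, while `/home/steen/keys` leads to a leaf node n6 that only stores the path name `/keys`. The `max_links` limit is an added safeguard against link cycles, not something the slides prescribe.

```python
# Directory tables: node identifier -> {edge label: node identifier}.
GRAPH = {
    "n0": {"home": "n1", "keys": "n5"},   # /keys is a direct edge to n5
    "n1": {"steen": "n4"},
    "n4": {"keys": "n6"},                 # n6 is a symbolic link, not the data
}
SYMLINKS = {"n6": "/keys"}                # leaf node storing a path name


def resolve(path, root="n0", max_links=8):
    """Resolve an absolute path name, following symbolic links on the way."""
    node = root
    for label in [part for part in path.split("/") if part]:
        node = GRAPH[node][label]
        if node in SYMLINKS:
            if max_links == 0:
                raise RuntimeError("too many levels of symbolic links")
            # Restart resolution with the path name stored in the link node.
            node = resolve(SYMLINKS[node], root, max_links - 1)
    return node
```

Both `resolve("/keys")` and `resolve("/home/steen/keys")` end at n5, but the second lookup needs an extra resolution step through the link node.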

Merging Name Spaces

• A mount point is the (directory) node that stores the identifier of a node in a foreign name space. A mounting point is the (directory) node in the foreign name space where the entity is located.

• Mounting a foreign name space in a distributed system requires the following information:
– The name of an access protocol
– The name of the server
– The name of the mounting point in the foreign name space

• Example: mount nfs://flits.cs.vu.nl//home/steen /remote/vu
– nfs://flits.cs.vu.nl//home/steen is the mounting point.
– /remote/vu is the mount point.


Mounting remote name spaces through a specific access protocol.

Merging Name Spaces

• We have different name spaces that we would like to access from any given name space.
– Solution 1: Introduce a naming scheme by which path names of different name spaces are simply concatenated (URLs).
– Solution 2: Introduce (specific) nodes that contain the name of a node in a foreign name space, along with the information on how to select the initial context in that foreign name space (Jade).
– Solution 3: Use only full path names, in which the starting context is explicitly identified, and merge by adding a new root node (DCE's Global Name Space).

• Example: n0:/home/steen/keys, m0:/home/steen


Organization of the DCE Global Name Service

Name Space Implementation

• A name space forms the heart of a naming service. A naming service is implemented by name servers. In distributed systems, it is necessary to distribute the name resolution process as well as name space management across multiple machines, by distributing the nodes of the naming graph.

• Consider a hierarchical naming graph and distinguish three levels:
– Global level: Consists of the high-level directory nodes. The main aspect is that these directory nodes have to be jointly managed by different administrations.
– Administrational level: Contains mid-level directory nodes that can be grouped in such a way that each group can be assigned to a separate administration.
– Managerial level: Consists of low-level directory nodes within a single administration. The main issue is effectively mapping directory nodes to local name servers.

Name Space Distribution

An example partitioning of the DNS name space, including Internet-accessible files, into three layers.

Name Space Distribution

• A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer.

• Name servers in the global and administrational layer are the most difficult to implement because of replication and caching.

Item                              Global      Administrational   Managerial
Geographical scale of network     Worldwide   Organization       Department
Total number of nodes             Few         Many               Vast numbers
Responsiveness to lookups         Seconds     Milliseconds       Immediate
Update propagation                Lazy        Immediate          Immediate
Number of replicas                Many        None or few        None
Is client-side caching applied?   Yes         Yes                Sometimes

Implementation of Name Resolution

• Each client has access to a local name resolver, which is responsible for ensuring that the name resolution process is carried out.

• There are two ways to implement name resolution:
– In iterative name resolution, the name resolver hands the complete name to the root name server, which resolves as much of it as it can and returns that intermediate result to the client's name resolver. The resolver then contacts the next name server with the remainder of the name, and so on, until the last name server returns the final result.
– In recursive name resolution, each name server passes the remainder of the name on to the next-level name server itself, and the results travel back along the same chain. The root name server returns the final result to the client's name resolver.

Implementation of Name Resolution

The principle of iterative name resolution.


The principle of recursive name resolution.
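The two strategies can be contrasted with a toy model in Python. The `SERVERS` table, the string used as the final address, and the function names are assumptions made for illustration; a real resolver would send network messages where this sketch does a dictionary lookup.

```python
# Hypothetical name servers; each one knows only its direct children.
SERVERS = {
    "root": {"nl": "nl"},
    "nl": {"vu": "vu"},
    "vu": {"cs": "cs"},
    "cs": {"ftp": "addr:ftp.cs.vu.nl"},
}


def lookup(server, label):
    """One request/reply exchange with `server` (simulated)."""
    return SERVERS[server][label]


def resolve_iterative(labels):
    """The client's resolver contacts every name server itself.

    Returns the final result and the number of exchanges the client made.
    """
    server, exchanges, result = "root", 0, None
    for label in labels:
        result = lookup(server, label)
        exchanges += 1
        server = result          # next server to contact with the remainder
    return result, exchanges


def resolve_recursive(server, labels):
    """Each server resolves one label and forwards the rest to its child.

    The client makes a single exchange, with the root server only.
    """
    head, rest = labels[0], labels[1:]
    result = lookup(server, head)
    if not rest:
        return result
    return resolve_recursive(result, rest)
```

For the name <nl, vu, cs, ftp>, the iterative version makes four client-side exchanges, while the recursive version makes one and leaves the remaining communication to the servers.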

Implementation of Name Resolution

• Recursive name resolution
– Drawback: it demands high performance from each name server.
– Advantages:
  • Caching results is more effective.
  • Communication cost is reduced.

• One of the largest distributed naming services is the Internet Domain Name System (DNS).


Recursive name resolution of <nl, vu, cs, ftp>. Name servers cache intermediate results for subsequent lookups.

Server for node   Should resolve    Looks up   Passes to child   Receives and caches             Returns to requester
cs                <ftp>             #<ftp>     --                --                              #<ftp>
vu                <cs,ftp>          #<cs>      <ftp>             #<ftp>                          #<cs>, #<cs,ftp>
nl                <vu,cs,ftp>       #<vu>      <cs,ftp>          #<cs>, #<cs,ftp>                #<vu>, #<vu,cs>, #<vu,cs,ftp>
root              <nl,vu,cs,ftp>    #<nl>      <vu,cs,ftp>       #<vu>, #<vu,cs>, #<vu,cs,ftp>   #<nl>, #<nl,vu>, #<nl,vu,cs>, #<nl,vu,cs,ftp>


The comparison between recursive and iterative name resolution with respect to communication costs.

The DNS Name Space

• The DNS name space is hierarchically organized as a rooted tree.

• The string representation of a path name consists of listing its labels, starting with the rightmost one, and separating the labels by a dot.

• For example, www.cs.wichita.edu., which includes the rightmost dot to indicate the root node.

• The label attached to a node's incoming edge is used as the name for that node.

• A subtree is called a domain; a path name to its root node is called a domain name.

• The contents of a node are formed by a collection of resource records. Different types of resource records are shown in Figure 4-12.

• DNS distinguishes aliases from what are called canonical names or primary names.

The DNS Name Space

The most important types of resource records forming the contents of nodes in the DNS name space.

Type of record   Associated entity   Description
SOA              Zone                Holds information on the represented zone
A                Host                Contains an IP address of the host this node represents
MX               Domain              Refers to a mail server to handle mail addressed to this node
SRV              Domain              Refers to a server handling a specific service
NS               Zone                Refers to a name server that implements the represented zone
CNAME            Node                Symbolic link with the primary name of the represented node
PTR              Host                Contains the canonical name of a host
HINFO            Host                Holds information on the host this node represents
TXT              Any kind            Contains any entity-specific information considered useful
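As a rough illustration of how these record types work together, the sketch below stores a handful of records in a dictionary and follows CNAME records until an A record is found. The names under `example.` and the 192.0.2.x addresses are placeholders reserved for documentation, and `resolve_address` is a hypothetical helper, not part of any DNS library.

```python
# (node name, record type) -> list of record values; illustrative data only.
RECORDS = {
    ("www.cs.example.", "CNAME"): ["soling.cs.example."],
    ("soling.cs.example.", "A"): ["192.0.2.10"],
    ("cs.example.", "MX"): ["1 mail.cs.example."],
    ("mail.cs.example.", "A"): ["192.0.2.25"],
}


def resolve_address(name, max_aliases=5):
    """Return the A records for `name`, following CNAME aliases if needed."""
    for _ in range(max_aliases + 1):
        if (name, "A") in RECORDS:
            return RECORDS[(name, "A")]
        alias = RECORDS.get((name, "CNAME"))
        if alias is None:
            return []            # neither an address nor an alias is known
        name = alias[0]          # continue with the canonical (primary) name
    raise RuntimeError("CNAME chain too long")
```

Here `www.cs.example.` is an alias, so the lookup first yields the canonical name `soling.cs.example.` and only then an address.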

DNS Implementation

• The DNS name space can be divided into a global layer and an administrational layer.

• Secondary name servers do not access the database directly but request the primary server to transfer its content. The latter is called a zone transfer in DNS terminology.

• A DNS database is implemented as a small collection of files.

• DNS is comparable to a telephone book for looking up phone numbers.

• In UNIX, use the command dig or nslookup to query a domain name server.

DNS Implementation

An excerpt from the DNS database for the zone cs.vu.nl.

DNS Implementation

Part of the description for the vu.nl domain which contains the cs.vu.nl domain.

Name            Record type   Record value
cs.vu.nl        NS            solo.cs.vu.nl
solo.cs.vu.nl   A             130.37.21.1

X.500

• The X.500 Directory Service is a standard way to develop an electronic directory so that it can be part of a global directory available with Internet access.

• The idea is to be able to look up people in a user-friendly way by name, department, or organization. It is similar to using the yellow pages.

• Because these directories are organized as part of a single global directory, you can search hundreds of thousands of entries from a single place on the World Wide Web.

X.500

• An X.500 directory service consists of a number of records, usually referred to as directory entries.

• The collection of all directory entries in an X.500 directory service is called a Directory Information Base (DIB).

• Each record is uniquely named. Each naming attribute is called a Relative Distinguished Name (RDN).

• Calling read will return a directory entry. Calling list will return a list of entries.

The X.500 Name Space

A simple example of an X.500 directory entry using X.500 naming conventions.

Attribute            Abbr.   Value
Country              C       NL
Locality             L       Amsterdam
Organization         O       Vrije Universiteit
OrganizationalUnit   OU      Math. & Comp. Sc.
CommonName           CN      Main server
Mail_Servers         --      130.37.24.6, 192.31.231, 192.31.231.66
FTP_Server           --      130.37.21.11
WWW_Server           --      130.37.21.11


Part of the directory information tree.


Two directory entries having Host_Name as RDN.

Attribute            Value                Attribute            Value
Country              NL                   Country              NL
Locality             Amsterdam            Locality             Amsterdam
Organization         Vrije Universiteit   Organization         Vrije Universiteit
OrganizationalUnit   Math. & Comp. Sc.    OrganizationalUnit   Math. & Comp. Sc.
CommonName           Main server          CommonName           Main server
Host_Name            star                 Host_Name            zephyr
Host_Address         192.31.231.42        Host_Address         192.31.231.66
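The read and list operations can be sketched over a tiny in-memory DIB holding the two entries above. The choice of naming attributes (and their order), the `/attr=value` notation, and the function names are assumptions for illustration; a real X.500 or LDAP client would use its own API and string syntax.

```python
# A tiny Directory Information Base with the two host entries shown above.
DIB = [
    {"C": "NL", "L": "Amsterdam", "O": "Vrije Universiteit",
     "OU": "Math. & Comp. Sc.", "CN": "Main server",
     "Host_Name": "star", "Host_Address": "192.31.231.42"},
    {"C": "NL", "L": "Amsterdam", "O": "Vrije Universiteit",
     "OU": "Math. & Comp. Sc.", "CN": "Main server",
     "Host_Name": "zephyr", "Host_Address": "192.31.231.66"},
]
# Assumed naming attributes, in order; each contributes one RDN to the name.
NAMING = ["C", "O", "OU", "CN", "Host_Name"]


def dn(entry):
    """Build the globally unique name of an entry from its sequence of RDNs."""
    return "/" + "/".join(f"{a}={entry[a]}" for a in NAMING if a in entry)


def read(name):
    """Return the single directory entry with the given name, or None."""
    return next((e for e in DIB if dn(e) == name), None)


def list_children(prefix):
    """Return the RDNs found directly below the node named `prefix`."""
    children = set()
    for entry in DIB:
        name = dn(entry)
        if name.startswith(prefix + "/"):
            children.add(name[len(prefix) + 1:].split("/")[0])
    return sorted(children)
```

Listing the node for the main server yields the two Host_Name RDNs, and reading one of the resulting full names returns the matching entry.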

X.500 Implementation

• In X.500, each local directory is called a Directory System Agent (DSA). A DSA can represent one organization or a group of organizations.

• The DSAs are interconnected according to the Directory Information Tree (DIT). The user interface program for access to one or more DSAs is a Directory User Agent (DUA).

• DUAs include whois, finger, and programs that offer a graphical user interface. X.500 is implemented as part of the Distributed Computing Environment (DCE) in its Global Directory Service (GDS).

• The Lightweight Directory Access Protocol (LDAP) is a simplified protocol to accommodate X.500 directory services in the Internet.

X.500 Implementation

• Providing an X.500 directory allows an organization to make itself and selected members known on the Internet.

• Two of the largest directory service providers are InterNIC, the organization that supervises domain name registration in the U.S., and ESnet, which maintains X.500 data for all the U.S. national laboratories.

• ESnet and similar providers also offer lookups of names in the global directory through a number of different user interfaces, including designated Web sites, whois, and finger.

• These organizations also provide assistance to organizations that are creating their own Directory Information Tree (DIT).

Locating Mobile Entities

• Three types of names were distinguished: human-friendly names, identifiers, and addresses. All naming systems maintain a mapping of human-friendly names to addresses.

• Traditional naming service:
– Aimed at providing the content of nodes in a name space. Given a (compound) name, content could consist of different (attribute, value) pairs.
– Assumes that node contents at the global and administrational level are relatively stable, for scalability reasons.
– An efficient implementation can be achieved through replication and caching.

Naming & Locating Objects

• If we also assume that node contents are stable at the managerial level, then a global name alone can serve as the identifier (think of Web pages).

• Problem: It is not realistic to assume stable node contents down to the local naming (managerial) level. For highly mobile entities, matters become worse.

• Traditional solution:
– Traditional naming services maintain a direct mapping between human-friendly names and the addresses of entities.
– Each time a name or an address changes, the mapping needs to change as well.

Naming & Locating Objects

• Better solution: separate naming from locating entities. Steps to locate entities:
– Retrieve the identifier.
– Locating an entity is handled by the location service.

• A location service accepts an identifier as input and returns the current address.

• If multiple copies exist, then multiple addresses may be returned.

• A two-level mapping is used to locate mobile objects:
– Name: Any name in a traditional naming space.
– Entity ID: A true identifier.
– Address: Provides all information necessary to contact an entity.
– Observation: An entity's name is now completely independent from its location.

Naming versus Locating Entities

a) Direct, single-level mapping between names and addresses.
b) Two-level mapping using identities.
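A minimal sketch of the two-level mapping, assuming one dictionary for the naming service and one for the location service. The name, the identifier `id-42`, and the host addresses are made up for illustration.

```python
# Level 1 (naming service): human-friendly name -> true identifier.
NAME_TO_ID = {"/home/steen/mbox": "id-42"}
# Level 2 (location service): identifier -> current addresses (one per copy).
LOCATION = {"id-42": ["host-a:9000", "host-b:9000"]}


def locate(name):
    """Resolve a name to its identifier, then the identifier to addresses."""
    entity_id = NAME_TO_ID[name]
    return LOCATION.get(entity_id, [])


def move(entity_id, new_addresses):
    """Moving an entity touches only the location service, never the name."""
    LOCATION[entity_id] = list(new_addresses)
```

After `move("id-42", ["host-c:9000"])`, the same name resolves to the new address while `NAME_TO_ID` is left unchanged.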

Simple Solutions for Locating Entities

• Two solutions for locating an entity apply only to local-area networks: broadcasting/multicasting and forwarding pointers.

• Broadcasting: Simply broadcast the ID, requesting the entity to return its current address.
– Can never scale beyond local-area networks (think of ARP/RARP).
– Requires all processes to listen to incoming location requests.

• Broadcasting becomes inefficient when the network grows. One possible solution is to switch to multicasting.

Simple Solutions for Locating Entities

• Forwarding pointers: Each time an entity moves, it leaves behind a pointer telling where it has gone.
– Dereferencing can be made entirely transparent to clients by simply following the chain of pointers.
– Update a client's reference as soon as the present location has been found.
– Advantage: simple
– Drawbacks: geographical scalability problems
  • Long chains are not fault tolerant: locating an entity becomes expensive, every intermediate location in the chain has to maintain its pointer, and the chain breaks as soon as one pointer is lost.
  • Increased network latency

Forwarding Pointers

The principle of forwarding pointers using (proxy, skeleton) pairs.

Forwarding Pointers

Redirecting a forwarding pointer, by storing a shortcut in a proxy.
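Following a chain of forwarding pointers and then shortcutting it can be sketched as follows. The classes stand in for the (proxy, skeleton) pairs only loosely; `Location`, `ClientRef`, and `move` are hypothetical names, and the hop counter is there just to make the effect of the shortcut visible.

```python
class Location:
    """A place the entity has lived; keeps a forwarding pointer once it leaves."""

    def __init__(self, name):
        self.name = name
        self.forward = None      # next location in the chain, if the entity left
        self.has_entity = False


def move(old, new):
    """Move the entity and leave a forwarding pointer behind."""
    old.has_entity = False
    old.forward = new
    new.has_entity = True


class ClientRef:
    """A client-side reference that caches the last known location."""

    def __init__(self, location):
        self.location = location

    def invoke(self):
        hops, here = 0, self.location
        while not here.has_entity:
            if here.forward is None:
                raise LookupError("forwarding chain is broken")
            here = here.forward
            hops += 1
        self.location = here     # shortcut: skip the chain on later calls
        return here.name, hops
```

The first call after two moves follows two pointers; the next call goes straight to the current location.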

Home-Based Approaches

• The use of broadcasting or multicasting is difficult to implement in large-scale networks. Long chains of forwarding pointers introduce performance problems and are susceptible to broken links.

• A popular approach to supporting mobile entities in large-scale networks is to introduce a home location, which keeps track of the current location of an entity.

• Single-tiered scheme: Let a home keep track of where the entity is:
– An entity's home address is registered at a naming service.
– The home registers the foreign address (care-of address) of the entity, which is acquired in the foreign network.
– Clients always contact the home first, and then continue with the foreign location.

Home-Based Approaches

• Problems with home-based approaches:
– The home address has to be supported as long as the entity lives.
– The home address is fixed, which means an unnecessary burden when the entity permanently moves to another location.
– Poor geographical scalability (the entity may be next to the client).

• Possible solution to home-based approaches: a two-tiered scheme (similar to mobile telephony):
– Check the local visitor registry first.
– Fall back to the home location if the local lookup fails.

Home-Based Approaches

The principle of Mobile IP.
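The two-tiered lookup can be sketched as a local check followed by a fallback to the home location. The network names, the entity identifier, and the care-of address string are invented for the example.

```python
# Home location: entity identifier -> current care-of address.
HOME_LOCATION = {"entity-1": "care-of:net-B"}
# One visitor registry per network: entities currently visiting that network.
VISITOR_REGISTRY = {"net-B": {"entity-1": "care-of:net-B"}}


def locate(entity_id, client_network):
    """Check the client's local visitor registry first, then the home."""
    local = VISITOR_REGISTRY.get(client_network, {})
    if entity_id in local:
        return local[entity_id], "local registry"
    return HOME_LOCATION[entity_id], "home location"
```

A client in net-B finds the visiting entity without contacting its home, which addresses the case where the entity is next to the client.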

Hierarchical Location Services

• The two-tiered home-based approach can be generalized to multiple layers.
– A network is divided into a collection of domains.
– Each domain can be subdivided into multiple subdomains.
– A lowest-level domain is called a leaf domain.

• Each domain has an associated directory node that keeps track of the entities in that domain.

• HLS (Hierarchical Location Services):
– Each entity in a domain is represented by a location record in that domain's directory node.
– The address of an entity is stored in a leaf node.
– Intermediate nodes contain a pointer to a child if and only if the subtree rooted at the child stores an address of the entity.
– The root knows about all entities.

Hierarchical Approaches

Hierarchical organization of a location service into domains, each having an associated directory node.


An example of storing information of an entity having two addresses in different leaf domains.

Hierarchical Location Services

• Basic principles: – Start lookup at the local leaf node.

– If the node knows about the entity, follow downward pointer, otherwise go one level up.

– Upward lookup always stops at the node that stores a location record for the entity.

• If a mobile entity moves regularly within a domain, it is effective to cache a reference to that domain's directory node. In this case, it makes sense to start a lookup at that directory node. This approach is referred to as pointer caching.


Looking up a location in a hierarchically organized location service.


a) An insert request is forwarded to the first node that knows about entity E.

b) A chain of forwarding pointers to the leaf node is created.
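The lookup and insert operations can be sketched on a small tree of directory nodes. This is a simplified model under stated assumptions: `Node`, `insert`, and `lookup` are hypothetical names, location records are kept as sets of child pointers (or addresses at a leaf), and the lookup follows an arbitrary pointer when several exist.

```python
class Node:
    """Directory node of a domain in a hierarchical location service."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.pointers = {}    # entity -> set of children whose subtree has it
        self.addresses = {}   # entity -> set of addresses (leaf nodes only)

    def knows(self, entity):
        return entity in self.pointers or entity in self.addresses


def insert(leaf, entity, address):
    """Store the address at the leaf, then build pointers going upward.

    The request climbs until it reaches a node that already knows the
    entity (or the root); every node on the way gets a pointer downward.
    """
    leaf.addresses.setdefault(entity, set()).add(address)
    child, node = leaf, leaf.parent
    while node is not None:
        already_known = node.knows(entity)
        node.pointers.setdefault(entity, set()).add(child)
        if already_known:
            break
        child, node = node, node.parent


def lookup(leaf, entity):
    """Go up until a node knows the entity, then follow pointers down."""
    node = leaf
    while not node.knows(entity):
        if node.parent is None:
            raise LookupError(f"{entity} is not registered")
        node = node.parent
    while entity not in node.addresses:
        node = next(iter(node.pointers[entity]))
    return node.addresses[entity]
```

A lookup issued in a leaf domain that has never seen the entity climbs only as far as the first node with a location record, which is the root in the worst case.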

Pointer Caching

• Example: If an entity E moves regularly between leaf domains D1 and D2, it may be more efficient to store E's contact record at the least common ancestor.
– Lookup operations from either D1 or D2 are on average cheaper.
– Update operations (i.e., changing the current address) can be done directly at that directory node.
– Note: Assuming that E generally stays between D1 and D2, it does make sense to cache a pointer to that common ancestor.

Pointer Caches

Caching a reference to a directory node of the lowest-level domain in which an entity will reside most of the time.

Pointer Caches

A cache entry that needs to be invalidated because it returns a nonlocal address, while such an address is available.

Scalability Issues

• Storing a location record for each entity is not a problem. The problem is that we need to ensure that servers can handle a large number of requests per time unit; high-level servers are in big trouble.

• Solution: Assume, at least at the global and administrational level, that the content of nodes hardly ever changes. In that case, we can apply extensive replication by mapping nodes to multiple servers, and start name resolution at the nearest server.

• Observation: An important attribute of many nodes is the address where the represented entity can be contacted. Replicating nodes makes large-scale traditional name servers unsuitable for locating mobile entities.

Hierarchical Location Services (HLS): Scalability Issues

• Size scalability: The problem is overloading higher-level nodes:
– The only solution is to partition a node into a number of subnodes and evenly assign entities to subnodes.
– Naive partitioning may introduce a node management problem, as a subnode may have to know how its parent and children are partitioned.

HLS: Geographical Scalability Issues

• We have to ensure that lookup operations generally proceed monotonically in the direction of where we'll find an address:
– If entity E generally resides in California, we should not let a root subnode located in France store E's contact record.
– Unfortunately, subnode placement is not that easy, and only a few tentative solutions are known.

• We need to ensure that the name resolution process scales across large geographical distances.
– By mapping nodes to servers that may be located anywhere, we introduce an implicit location dependency in our naming scheme (e.g., a domain E is mapped to a server in Finland and then moves to California; a request to E still contacts Finland first).
– Deciding which subnodes should handle which entities remains an open question.

Scalability Issues

The scalability issues related to uniformly placing subnodes of a partitioned root node across the network covered by a location service.

Unreferenced Objects: Problem Naming

• Assumption: Objects may exist only if it is known that they can be contacted:
– Each object should be named
– Each object can be located
– A reference can be resolved to client-object communication

• Problem: Removing unreferenced objects:
– How do we know when an object is no longer referenced (think of cyclic references)?
– Who is responsible for (deciding on) removing an object?

The Problem of Unreferenced Objects

An example of a graph representing objects containing references to each other.

Reference Counting

• Principle: Each time a client creates (removes) a reference to an object O, a reference counter local to O is incremented (decremented).

• Problem 1: Dealing with lost (and duplicated) messages:
– An increment is lost, so that the object may be prematurely removed
– A decrement is lost, so that the object is never removed
– An ACK is lost, so that the increment/decrement is resent

• Solution: Keep track of duplicate requests.
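One way to keep track of duplicate requests is to tag every increment or decrement with a sequence number and remember which ones were already applied. The sketch below assumes per-client sequence numbers and an unbounded `seen` set; `Skeleton` and `handle` are hypothetical names.

```python
class Skeleton:
    """Object-side stub that owns the reference counter of an object O."""

    def __init__(self):
        self.count = 0
        self.seen = set()   # (client, sequence number) of applied requests

    def handle(self, client, seq, delta):
        """Apply +1 or -1 at most once per (client, seq); always acknowledge."""
        if (client, seq) not in self.seen:
            self.seen.add((client, seq))
            self.count += delta
        return "ACK"
```

If an ACK is lost and the client resends the same request, the counter stays unchanged and the client simply receives the ACK again.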

Reference Counting

The problem of maintaining a proper reference count in the presence of unreliable communication.

Reference Counting

• Problem 2: Dealing with duplicated references: client P1 tells client P2 about object O:
– Client P2 creates a reference to O, but dereferencing (communicating with O) may take a long time
– If the last reference known to O is removed before P2 talks to O, the object is removed prematurely

• Solution 1: Ensure that P2 talks to O on time:
– Let P1 tell O it will pass a reference to P2
– Let O contact P2 immediately
– A reference may never be removed before O has acked that reference to the holder

Reference Counting

a) Copying a reference to another process and incrementing the counter too late

b) A solution.

Weighted Reference Counting

• Solution 2: Avoid increment and decrement messages:
– Let O allow a maximum of M references
– When client P1 creates a reference, grant it M/2 credit
– When client P1 tells P2 about O, it passes half of its credit grant to P2
– Pass the current credit grant back to O upon reference deletion

Advanced Reference Counting

a) The initial assignment of weights in weighted reference counting.
b) Weight assignment when creating a new reference.
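Weighted reference counting can be sketched with a total weight and a partial weight kept at the object, and a weight carried by each reference. The starting weight of 128, the class names, and the error raised when a weight can no longer be halved are assumptions; the slides handle the exhausted case with an indirection instead.

```python
class Skeleton:
    """Object side: keeps a total weight and its own partial weight."""

    def __init__(self, total=128):
        self.total = total      # sum of all weights handed out plus its own
        self.partial = total    # weight still held by the object itself

    def create_reference(self):
        grant = self.partial // 2       # hand half of the partial weight out
        self.partial -= grant
        return Proxy(self, grant)

    def release(self, weight):
        self.total -= weight            # a deleted reference returns its weight

    def unreferenced(self):
        return self.total == self.partial


class Proxy:
    """Client side: a reference that carries part of the object's weight."""

    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.weight = weight

    def copy(self):
        """Pass half of the weight to the copy; no message to the object."""
        if self.weight < 2:
            raise ValueError("weight exhausted; an indirection would be needed")
        half = self.weight // 2
        self.weight -= half
        return Proxy(self.skeleton, half)

    def delete(self):
        self.skeleton.release(self.weight)
        self.weight = 0
```

Only deletion sends a message to the object; creating a copy is a purely local operation between the two clients.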

Reference Listing

• Observation: We can avoid many problems if we can tolerate message loss and duplication.

• Reference listing: Let an object keep a list of its clients:
– The increment operation is replaced by an (idempotent) insert
– The decrement operation is replaced by an (idempotent) remove

• There are still some problems to be solved:
– Passing references: client B has to be listed at O before the last reference at O is removed (or keep a chain of references)
– Client crashes: we need to remove outdated registrations (e.g., by combining reference listing with leases)

Leases

• Observation: If we cannot be exact in the presence of communication failures, we will have to tolerate some mistakes.

• Essential issue: We need to avoid that object references are never reclaimed.

• Solution: Hand out a lease on each new reference:
– The object promises not to decrement the reference count for a specified time
– Leases need to be refreshed (by object or client)

• Observations:
– Refreshing may fail in the face of message loss
– Refreshing can tolerate message duplication
– Does not solve problems related to cyclic references
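Reference listing combined with leases can be sketched as a table of clients and lease expiry times. The 30-second lease, the explicit `now` argument (used instead of a real clock so the behavior is easy to follow), and the class and method names are assumptions.

```python
class ReferenceList:
    """Object-side list of clients, each holding a lease on its reference."""

    def __init__(self, lease_seconds=30.0):
        self.lease_seconds = lease_seconds
        self.expiry = {}    # client id -> time at which its lease runs out

    def insert(self, client, now):
        """Idempotent: repeating the call only refreshes the lease."""
        self.expiry[client] = now + self.lease_seconds

    def remove(self, client):
        """Idempotent: removing an unknown client is not an error."""
        self.expiry.pop(client, None)

    def expire(self, now):
        """Drop clients whose lease ran out, e.g. because they crashed."""
        for client in [c for c, t in self.expiry.items() if t <= now]:
            del self.expiry[client]

    def unreferenced(self, now):
        self.expire(now)
        return not self.expiry
```

Duplicated insert or remove messages leave the list in the same state, and a crashed client disappears from the list once its lease expires without a refresh.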


c) Weight assignment when copying a reference.


Creating an indirection when the partial weight of a reference has reached 1.


Creating and copying a remote reference in generation reference counting.

Tracing in Groups

Initial marking of skeletons.

Tracing in Groups

After local propagation in each process.

Tracing in Groups

Final marking.

Date post:	20-Dec-2015