
*:96 Internet application layer protocols and standards

Compendium 4A: Basics, E-mail & NNTP

Not allowed during the exam
Last revision: 11 Jan 2005

Introduction and basic concepts
A beginner's guide to URLs
Distributed File Systems
Cache Consistency Mechanisms
(The Domain Name System: in 4B)

E-mail
Message Handling Overview
Usenet News
Relative Addresses
(Applications: Electronic Mail (822, SMTP, MIME): in 4B)
(Post Office Protocol: in 4B)
IMAP

PGP
Pretty Good Privacy

NNTP
News and Usenet

The documents are not arranged in a suitable reading order; see compendium 6, pages 909-911.

Message Handling

Excerpt from: Internet application layer protocols
By Jacob Palme

Table of contents

1 Introduction and Basic Concepts
1.1 A Simple Example
1.2 Process, Client, Host, Server
1.3 Protocols
1.4 Layering Model
1.4.1 Layers below the application layer
1.4.2 Layering within the application layer
1.5 Ports and Applications
1.6 Telnet: A Simple Application
1.6.1 Manual test of e-mail protocols
1.7 Architectures
1.8 Chaining, Referral or Multicasting
1.9 Symmetric and Asymmetric Protocols
1.10 Transfer of Responsibility
1.11 Identification
1.12 Transactions and Sessions
1.13 Turn-around Time, Pipelining and Windowing
1.14 Terminating a Connection
1.15 Intermediaries
1.16 Names and Addresses
1.16.1 Domain method of allocating globally unique names
1.16.2 Hash Code Method of Creating Globally Unique Names
1.16.3 User-friendly Names
1.17 The Domain Name System
1.17.1 MX records
1.18 Top-level domains
1.19 The old versus new problem
1.19.1 Version number
1.19.2 Feature Selection Method
1.19.3 Feature naming
1.19.4 Built-in Extension Points
1.20 Standards Terminology
1.21 OSI versus the Internet
References

Introduction and Basic Concepts

1.1 A Simple Example

Internet application layer protocols were from the beginning very simple. A computer program on one computer could connect to a program running on another computer, and send simple textual commands. Figure 1 shows the interactions needed in a simple case to send an e-mail message.

Mail client                              Mail server
HELO (= here I am)                -->
                                  <--    250 (= OK)
MAIL FROM (= specify sender)      -->
                                  <--    250 (= OK)
RCPT TO (= specify recipient)     -->
                                  <--    250 (= OK)
DATA                              -->
                                  <--    354 (= Send text)
.... message header and text ...  -->
. (= End of text)                 -->
                                  <--    250 (= OK)

Figure 1: Interactions for sending an e-mail message

The following is all that is necessary to send e-mail from one host to another (“C:” is text sent by the client, “S:” is text returned by the server):

C: <connects to server>
S: <accepts connection>
C: HELO mail.duckland.com
S: 250 mail.northpole.com
C: MAIL FROM: donald-[email protected]
S: 250 [email protected]... Sender ok
C: RCPT TO: father-[email protected]
S: 250 father-[email protected]... Recipient ok
C: DATA
S: 354 Enter mail, end with "." on a line by itself
C: From: Donald Duck <donald-[email protected]>
C: To: Father Christmas <father-[email protected]>
C: Date: 24 Dec 1999
C:
C: Peace be with you.
C: .
S: 250 PAA00329 Message accepted for delivery
C: QUIT
S: 221 mail.northpole.com closing connection
S: <closes connection>

So, just by sending a few lines of text back and forwards between two computers, an e-mail message is transmitted.
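The client side of the dialogue in section 1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `smtp_client_lines` and `reply_code` and the example.org addresses are hypothetical, nothing is sent over a network, and a real client would also have to escape message lines that consist of a single ".".

```python
def smtp_client_lines(client_host, sender, recipient, message_lines):
    """Return, in order, the lines a client sends to transmit one message.

    The command sequence follows the "C:" lines of the example dialogue.
    """
    lines = [
        f"HELO {client_host}",      # here I am
        f"MAIL FROM: {sender}",     # specify sender
        f"RCPT TO: {recipient}",    # specify recipient
        "DATA",                     # ask permission to send the text
    ]
    lines.extend(message_lines)     # header lines, an empty line, then the text
    lines.append(".")               # a line holding only "." ends the text
    lines.append("QUIT")
    return lines


def reply_code(server_line):
    """Extract the three-digit code that starts each server reply."""
    return int(server_line[:3])


if __name__ == "__main__":
    message = ["From: Sender <sender@example.org>", "", "Peace be with you."]
    for line in smtp_client_lines("mail.example.org", "sender@example.org",
                                  "recipient@example.net", message):
        print("C:", line)
```

The client decides what to do next from the numeric code alone; the words after the code in a server reply are there for human readers.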

1.2 Process, Client, Host, Server

[Figure 2: Process, Client, Host, Server, User Agent. The diagram shows a personal computer running an e-mail program (= client = user agent), joined by a connection to a host computer that runs an e-mail server, a WWW server and an FTP server, each consisting of one or more processes.]

Here are some important kinds of objects in network protocols (see also Figure 2):

Process: A program running on a computer. Most computers allow several processes to run at the same time.

Agent: A process connected to the network. The term “agent” is also sometimes used for a process which acts without immediate user control, and which may represent itself as a user to the network.

Client: A process which can establish connections to servers and send requests to them.

User agent: A client which represents a user. User agents often have a user interface, so that human users can directly control them.

Server: A process which accepts connections from clients and performs services for them. A server can often handle several simultaneous connections from different clients. Technically, this can be done by running a single process, or by running a separate process for each client.

Host: A computer connected to the network and providing services. Many different services can be provided by different servers on the same host.

Note that the same process can be a server for one agent and a client for another agent. Figure 3 shows three agents: one user agent and two service agents. The middle service agent acts as a server to the user agent, and as a client to the other service agent.

[Figure 3: The middle agent in this figure acts as a server to a user agent and as a client to another service agent. The diagram shows a user agent (client), a middle service agent (server and client) and a second service agent (server).]
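The double role of the middle agent in Figure 3 can be tried out on a single machine. The sketch below is an assumption-laden toy, not part of the original text: the class names `UpperHandler` and `RelayHandler`, the helpers `start` and `ask`, and the upper-casing "service" are all invented for illustration. It uses Python's standard `socketserver` module and runs a separate thread for each client.

```python
import socket
import socketserver
import threading


class UpperHandler(socketserver.StreamRequestHandler):
    """Back-end service agent: answers one line with the same line in upper case."""

    def handle(self):
        self.wfile.write(self.rfile.readline().upper())


class RelayHandler(socketserver.StreamRequestHandler):
    """Middle service agent: a server to its caller, a client to the back end."""

    def handle(self):
        request = self.rfile.readline()
        # Acting as a client: open our own connection to the other service agent.
        with socket.create_connection(self.server.backend_addr) as backend:
            backend.sendall(request)
            reply = backend.makefile("rb").readline()
        # Acting as a server: pass the result back to whoever called us.
        self.wfile.write(reply)


def start(handler, **attributes):
    """Start a threaded TCP server on a free local port and return it."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), handler)
    for name, value in attributes.items():
        setattr(server, name, value)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(address, text):
    """User agent: connect, send one line, return the one-line answer."""
    with socket.create_connection(address) as connection:
        connection.sendall(text.encode("ascii") + b"\r\n")
        return connection.makefile("rb").readline().decode("ascii").strip()


if __name__ == "__main__":
    backend = start(UpperHandler)
    middle = start(RelayHandler, backend_addr=backend.server_address)
    print(ask(middle.server_address, "hello"))
    for server in (middle, backend):
        server.shutdown()
        server.server_close()
```

The user agent only ever talks to the middle agent; it cannot tell whether the answer was produced there or fetched from somewhere else.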

1.3 Protocols

The word protocol comes from the language of diplomacy. There, a protocol means the rules for the language used in communication between countries. Such rules are intended to ensure that countries correctly understand each other, to reduce the risk of misunderstanding. The intention with network protocols is the same. The protocols specify clearly what can be said, and how it is to be said. As in programming languages, the words and phrases in some network protocols can have some similarity to natural language. But like programming languages, these artificial languages are simpler, more restricted and more exact, and do not allow ambiguities. The similarity to natural languages is a way of making the protocols easier for humans to understand.
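How restricted such a language is can be seen by writing down its vocabulary. The following sketch accepts only the five commands used in the example of section 1.1 and rejects everything else. The function name `parse_command` and the exact error messages are assumptions made for this illustration; a real mail server accepts more commands and applies further checks.

```python
def parse_command(line):
    """Return (command, argument) for a valid line, or raise ValueError.

    Commands are matched without regard to case, as the manual test in
    section 1.6.1 types "helo" in lower case.
    """
    text = line.strip()
    upper = text.upper()
    if upper in ("DATA", "QUIT"):          # commands that take no argument
        return upper, None
    for command in ("HELO ", "MAIL FROM:", "RCPT TO:"):
        if upper.startswith(command):
            argument = text[len(command):].strip()
            if not argument:
                raise ValueError(f"{command.strip()} requires an argument")
            return command.strip(), argument
    raise ValueError(f"not part of the protocol: {line!r}")


if __name__ == "__main__":
    print(parse_command("helo mycomputer.foo.bar"))
    print(parse_command("DATA"))
```

A sentence such as "Please send this mail" looks natural to a human but is simply rejected, which is what makes the exchange unambiguous.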

1.4 Layering Model

Layering is easiest to understand if you start with the case where only one computer and no network is involved. Even in that case, most programs are structured into layers. Suppose you are designing a program for designing flow charts. Such a program may have the structure shown in Figure 4:

[Figure 4: Structured programming: Routines in higher layers use features of lower layers. Boxes in the diagram: User interface; Routines for handling flow charts; Routines for drawing and coloring lines and rectangles; Routines for saving data in files; Setting a color to a pixel on the screen; Routines for writing on the hard disk.]

Figure 4 shows an example of the structure of a program running on a single computer. The higher layers use features of the lower layers. Note that it is only the lowest layer which actually writes information on the screen or on the hard disk. The higher layers influence the contents of the screen and the hard disk only indirectly, by using routines in layers below them. The layers piled on top of each other are often referred to as a “stack”.

Network layering works in the same way. But in networks, there is more than one program on more than one host computer. So there is a need for a separate “stack” of layers on each computer. In the simple case where only two processes running on two computers are involved, this can be depicted as in Figure 5.

Send documents               - - -   Receive documents
Send e-mail                  - - -   Receive e-mail
Route packets                - - -   Route packets
Send packets of bits         - - -   Receive packets of bits
Send bits                    - - -   Receive bits
Send electrons or photons    -----   Receive electrons/photons

Figure 5: Two stacks of layers, one on each host, communicating with each other.

Why are the horizontal lines dashed between all the boxes except in the lowest layer? Think of the example in Figure 4. There, only the lowest layer physically writes on the screen or the hard disk. All the higher layers write on the screen or the hard disk indirectly, through the layers below them.

The principle for networked protocols is the same. The layer “Send bits” does not physically send bits to the layer “Receive bits”. The layer “Send bits” asks the lower layer “Send electrons or photons” to send the bits, which are then forwarded up to the box “Receive bits”. Since the communication between “Send bits” and “Receive bits” is indirect and not direct, it is shown as a dashed line.

There is another way to view layering, shown in Figure 6.

[Figure 6: Layering seen as envelopes inside envelopes inside envelopes. The diagram shows three stages: the innermost content (data) on its own; the innermost content in an envelope with additional information on the envelope; and, after the next layer down has put the information from the layer above inside a new envelope, the same envelope inside an outer envelope carrying further additional information.]

There are two important things to understand from this view:

1. Each lower layer usually adds information. The amount of information transported increases with each added lower layer.

2. The code at each layer only looks at the information added at this layer. The information from higher layers is just regarded as a set of bits by the lower layer, which does not understand its inner content or structure.

For example, the e-mail transport system (SMTP) will transport an e-mail message from its sender to its recipient. This is done using the information on the e-mail envelope. The contents inside the envelope are just moved along, but not touched. And the e-mail transport system uses another layer below it (TCP), which moves a document from one host to another host. TCP has its own envelope, and the contents (the e-mail envelope and content) are just moved along, not touched.

This is not always quite true. For example, e-mail transport systems will sometimes convert the content of the message from one encoding to another encoding. This is regarded as a “layering violation” and is not regarded as quite neat, but is sometimes necessary.
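The envelope view can be made concrete with a toy encoding. In the sketch below each layer puts the data it receives inside a new envelope, and each layer reads only its own envelope when unwrapping. The envelope format (a text line with a layer name and a byte count) and the function names `wrap` and `unwrap` are invented for this illustration; real SMTP, TCP and IP envelopes look quite different.

```python
def wrap(layer_name, payload):
    """Put payload (bytes) inside a new envelope belonging to layer_name."""
    envelope = f"{layer_name} {len(payload)}\n".encode("ascii")
    return envelope + payload


def unwrap(layer_name, data):
    """Read this layer's own envelope and return the untouched contents."""
    envelope, _, rest = data.partition(b"\n")
    name, length = envelope.decode("ascii").split(" ")
    if name != layer_name:
        raise ValueError(f"expected an envelope from {layer_name}, got {name}")
    return rest[:int(length)]        # the contents are never interpreted here


if __name__ == "__main__":
    content = b"Peace be with you."
    on_the_wire = wrap("IP", wrap("TCP", wrap("MAIL", content)))
    print(len(content), len(on_the_wire))   # each envelope adds some bytes
    print(unwrap("MAIL", unwrap("TCP", unwrap("IP", on_the_wire))))
```

Both points of the list above show up directly: the data grows with each added envelope, and `unwrap` never looks inside the payload it hands upward.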

The major layers in computer networking are listed in Table 1.

Layer               Task
Application layer   The actual application, such as e-mail or the WWW.
Session layer       Keep an exchange of information going through interactions back and forward between the two hosts.
Transport layer     Transport a whole document from sender to recipient.
IP layer            Transport small packets of bits through the network.
Physical layer      A current or light transports a stream of bits.

Table 1: The major layers in computer networking.

1.4.1 Layers below the application layer

This book is only about the application layer. The layers below it are described in other books. They will only be mentioned as tools used by the application layer. On the Internet, most communication uses a standard called TCP/IP for the lower layers. TCP/IP provides a reliable way of moving a document from one port on one host to another port on another host. Some Internet applications use an alternative to TCP, called UDP. The difference between TCP and UDP is that TCP ensures that all the packets into which a document has been split are reliably transported and collected together in the right order. If a packet is lost, TCP will ensure that it is resent. UDP just disregards lost packets. This makes UDP faster and suitable for same-time protocols like simultaneous transmission of voice and video. For voice and video, it is more important to get a continuous stream at the right speed. Lost packets are accepted as noise. Most other applications, like e-mail, FTP (file transfer) and normal WWW communication, use TCP, because for them reliability is more important than speed and same-time continuity.
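The difference between the two behaviours can be simulated without any network at all. The sketch below is a toy model, not an implementation of TCP or UDP: the function names and the idea of passing a set of "lost" sequence numbers are assumptions made for the illustration. It splits a document into numbered packets, delivers them out of order with one packet lost, and compares a receiver that restores order and has lost packets resent with one that simply passes on whatever arrives.

```python
def split_into_packets(document, size):
    """Split bytes into (sequence number, chunk) pairs of at most `size` bytes."""
    return [(number, document[start:start + size])
            for number, start in enumerate(range(0, len(document), size))]


def receive_unreliable(packets, lost):
    """UDP-like: lost packets are disregarded and arrival order is kept."""
    return b"".join(chunk for number, chunk in packets if number not in lost)


def receive_reliable(packets, lost):
    """TCP-like: missing packets are resent and order is restored."""
    arrived = {number: chunk for number, chunk in packets if number not in lost}
    for number, chunk in packets:            # stand-in for a resend request
        arrived.setdefault(number, chunk)
    return b"".join(arrived[number] for number in sorted(arrived))


if __name__ == "__main__":
    document = b"abcdefghij"
    packets = split_into_packets(document, 3)
    out_of_order = list(reversed(packets))
    print(receive_reliable(out_of_order, lost={1}))
    print(receive_unreliable(out_of_order, lost={1}))
```

The reliable receiver pays for its guarantee with extra waiting and resending, which is the cost that voice and video applications prefer to avoid.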

1.4.2 Layering within the application layer

According to a fundamentalist view of networking, networks only have major layers of the kind shown in Table 1. But in reality, there are often layers within the application layer. The processing of data within an application is often structured into a series of operations performed when sending data, and the reverse series when receiving data. Many standards specify this implicitly or explicitly, as is shown in Figure 7. A simple example of this is the e-mail standards. Figure 8 shows how the e-mail standards are structured into a “Header and body” layer and a “Mail forwarding” layer.

Sending side          Receiving side
Application data      Application data
Treatment type A      Reverse of treatment A
Treatment type B      Reverse of treatment B
Transport layer       Transport layer

Figure 7: Layering within the application layer (layers below the transport layer omitted)

[Figure 8: Layering within the application layer in e-mail standards. The diagram shows three stages above the layers below the application layer. First, the message content ("Hi!"). Second, a header with information to the recipient and the recipient's mailer is added to the content ("From:", "Subject:", "Hi!"). Third, an envelope with information to control forwarding of the message to the recipients is added to the header and content ("MAIL FROM:", "RCPT TO:", "From:", "Subject:", "Hi!").]
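The pattern in Figure 7, a series of treatments when sending and the reverse series when receiving, can be sketched as follows. The figure does not say what treatments A and B are; choosing character encoding as treatment A and Base64 as treatment B is purely an assumption for this illustration, as are the names `TREATMENTS`, `send` and `receive`.

```python
import base64

# Each treatment is a pair: (operation when sending, reverse when receiving).
TREATMENTS = [
    (lambda text: text.encode("utf-8"), lambda data: data.decode("utf-8")),  # A
    (base64.b64encode, base64.b64decode),                                    # B
]


def send(application_data):
    """Apply treatment A, then treatment B, before handing over to transport."""
    data = application_data
    for forward, _ in TREATMENTS:
        data = forward(data)
    return data


def receive(transported):
    """Undo the treatments in the opposite order: first B, then A."""
    data = transported
    for _, reverse in reversed(TREATMENTS):
        data = reverse(data)
    return data


if __name__ == "__main__":
    wire = send("Hi!")
    print(wire)
    print(receive(wire))
```

Keeping each treatment next to its reverse in one list makes it hard to forget to undo a step, and `reversed` guarantees that the receiving side peels the layers off in the right order.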


1.5 Ports and Applications

There are many different application layer protocols. The protocol used when sending e-mail is different from the protocol used when downloading web pages. An e-mail client is not able to communicate with a WWW server, because they speak different languages. Only a client and a server which speak the same language can communicate. But the same host can have several different client and server applications. A personal computer can, for example, run both mail programs and web browsers at the same time. And the same host can at the same time run mail servers and WWW servers.

It is very important that messages sent by an e-mail client are sent to an e-mail server, and that messages sent by a web browser are sent to a WWW server. To ensure this, a host computer has different ports, as is seen in Figure 9. The transport layer establishes connections, which can be pictured as hoses, between applications. The hoses fit into different ports for different applications. This is done by assigning a port number to each port. When a client connects to a server, it specifies which port number it wants to access.

There are standard port numbers for each application protocol. Thus, when a web browser connects to a web server, it usually asks for port number 80, because this is the standard port number for web servers. If there is more than one web server on the same host, they may use different port numbers. In that case, the client can use the port number to indicate which server it wants to connect to.

[Figure 9: A TCP connection between two computers can be seen as a hose, through which packets of octets can be sent between ports. The ports are openings to applications running on the computers. The diagram shows an application client on a client computer and an application server on a server host, each attached to a port, with packets passing between the two ports.]

There is a long list of such standard port numbers, maintained by the standards organisation IANA (IANA may soon change to become a part of ICANN). IANA means Internet Assigned Numbers Authority, and its main task is to distribute numbers and names which are allocated for particular uses. Some of the most common standard port numbers are listed in Table 2.

Table 2: Some common port numbers

Port   Name       Use
20     ftp-data   File transfer, data
21     ftp        File transfer, control
23     telnet     Terminal emulator
25     smtp       E-mail forwarding
53     dns        Domain name lookup
70     gopher     Gopher search
79     finger     Finding info about a user
80     http       Retrieving web pages
109    pop2       Delivering e-mail to its final recipient
110    pop3       Delivering e-mail to its final recipient
119    nntp       Usenet News
143    imap2      Delivering e-mail to its final recipient
194    irc        Distributed chat system
220    imap3      Delivering e-mail to its final recipient
993    simamp4    Delivering e-mail to its final recipient through a secure channel
995    spop3      Delivering e-mail to its final recipient through a secure channel

A full list can be found at the IANA web site [1].
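A client program typically carries a small piece of this table with it. The sketch below holds a few entries from Table 2 and picks the standard port unless the user names another one, as happens when several web servers share a host. The names `STANDARD_PORTS` and `server_address` and the example.org host names are assumptions for this illustration.

```python
# A few entries copied from Table 2 (protocol name -> standard port number).
STANDARD_PORTS = {
    "ftp-data": 20,
    "ftp": 21,
    "telnet": 23,
    "smtp": 25,
    "dns": 53,
    "http": 80,
    "pop3": 110,
    "nntp": 119,
}


def server_address(host, protocol, port=None):
    """Return (host, port), using the standard port unless one is given."""
    if port is None:
        port = STANDARD_PORTS[protocol]
    return (host, port)


if __name__ == "__main__":
    print(server_address("www.example.org", "http"))
    print(server_address("www.example.org", "http", port=8080))
    print(server_address("mail.example.org", "smtp"))
```

The returned pair has the same (host, port) shape that Python's `socket.create_connection` accepts as an address, so it could be passed straight on to open a connection.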

1.6 Telnet: A Simple Application

In the 1970s and 1980s, most computer user interfaces were based on sending one or more single characters to the host, and getting one or more single characters back. Many applications on unix computers still use such interfaces. Telnet is an Internet protocol for using such an interface on a computer other than your own. A simple example of Telnet usage is shown in Figure 10.

Telnet is of special interest, because it works the same way most application layer protocols work: sequences of characters are sent back and forward between two computers. Because of this, telnet clients can sometimes be used to connect to applications other than those designed to use telnet.

OSF1 V4.0 (tellus.dsv.su.se) (ttyp8)
login: jpalme
jpalme's Password:
Last login: Sun Jun 27 18:12:50
Digital UNIX V4.0A;
Fri Jul 11 04:55:52 MET DST 1997
You have new mail.
TERM = (vt100)

Annotations in the figure:
This information is typed by the user.
Here the user types a password, which is not shown on the screen.
Here the user types only the Return key to affirm that his telnet is using so-called vt100 emulation.
Here the user can type commands to the shell, which is a line-oriented interface to manipulate files and start processes.

Figure 10: Example of a simple telnet usage. The host and the user take turns sending characters and newline codes to each other.

Here is an experiment which will make it easier to understand how network protocols work. All you need to perform this experiment is a telnet client on a computer connected to the Internet. The telnet client must be able to connect to ports other than the regular telnet port 23.

1.6

.1

Ma

nu

al

test

of

e-

ma

il p

roto

cols

Use the telnet client to connect to port 25 of the computer running the mail server you are using. Everything you type is underlined in the text below. Text from the computer is not underlined. The character “¶” means that you should push the Return key. Replace “mail.foo.bar” with the domain name of the mail server you are using, and replace “mycomputer.foo.bar” with the domain name of your own computer.

The first line below is the unix command to start telnet and connect it to port 25 of the server mail.foo.bar. If you are running telnet from another computer than a unix computer, you replace the first line below with the command to get your telnet program to do this connection.

unix>telnet mail.foo.bar 25
Connected to mail.foo.bar.
Escape character is '^]'.
220 info.dsv.su.se ESMTP Sendmail 8.8.5/8.8.5; Fri, 24 Dec 1999 15:24:09 +0200 (MET DST)
helo¶
501 helo requires domain address
helo mycomputer.foo.bar¶
250 info.dsv.su.se Hello tellus.dsv.su.se [130.237.161.217], pleased to meet you
MAIL FROM: [email protected]
250 [email protected]... Sender ok
RCPT TO: [email protected]
250 [email protected]... Recipient ok
DATA¶
354 Enter mail, end with "." on a line by itself
From: Donald Duck <donald-[email protected]
To: Father Christmas <[email protected]
Date: 24 Dec 1999¶
¶
Peace be with you.¶
.¶
250 PAA00329 Message accepted for delivery
quit¶
221 mail.foo.bar closing connection

What you are doing in the example above is to perform a complete sending of an e-mail message, where you manually simulate the mail client with text which you type on your keyboard. Note that the sending of an e-mail message consists of a series of phrases sent back and forth between client and server. Not all application protocols work that way. HTTP 1.0 uses only a single sequence of octets from the client, and a single sequence of octets back from the server.
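The client half of such a dialogue can be written down as a list of text lines. The following Python sketch is only an illustration: the function name and the example addresses are invented here, and the only rule it adds beyond the transcript is the standard SMTP convention that a message line beginning with "." is sent with an extra "." in front, so that it is not mistaken for the end of the message.

```python
def smtp_client_lines(helo_domain, sender, recipients, message_lines):
    """Return, in order, the lines a client sends to transfer one message.

    Hypothetical helper for illustration; it does not open any connection.
    """
    lines = [f"HELO {helo_domain}", f"MAIL FROM:<{sender}>"]
    # One RCPT TO command per recipient.
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    lines.append("DATA")
    # Dot-stuffing: protect message lines that start with a period.
    lines += ["." + line if line.startswith(".") else line
              for line in message_lines]
    lines.append(".")      # a period on a line by itself ends the message
    lines.append("QUIT")
    return lines


if __name__ == "__main__":
    for line in smtp_client_lines(
        "mycomputer.foo.bar",
        "sender@example.org",          # placeholder address
        ["recipient@example.org"],     # placeholder address
        ["Date: 24 Dec 1999", "", "Peace be with you."],
    ):
        print(line)
```

Each of these lines would be followed by the Return key (the “¶” in the transcript), and after each command the client would wait for the numeric reply from the server.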

1.7 Architectures

Network protocols mean that several processes in different computers communicate with each other. By “Architecture” is meant the organisation of applications into different modules, and how these modules communicate.


[Figure 11 diagram: three boxes labelled Client connected to one box labelled Server, which holds the Data base.]

Figure 11: Simple client-server architecture

Figure 11 shows a simple and common architecture, with a single server, and a number of clients which access this server. With this architecture, the server is often responsible for a data base stored in the server, and the operations which the clients perform on the server may get and store data from this common data base. This architecture is simple to implement because there is only one protocol, the client-server protocol, and only a single data base, and thus no problems with co-ordinating information in different data bases. (Unless the clients have their own data bases; then co-ordination may be needed and the implementation and protocols may become more complex.) The architecture in Figure 11 is common in local area networks, where users at local work stations access a common local data base.


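[Figure 12 diagram: three boxes labelled User agent, each connected to a box labelled Service agent; the three Service agent boxes are connected to each other. The two kinds of connection carry different labels.]

Figure 12: Architecture with several interconnected servers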

Figure 12 shows a more complex architecture, which is used by some Internet applications, for example e-mail and Usenet News. There are (at least) two types of connections, labelled 1 and 2 in Figure 12. Connections between a user agent and a service agent are labelled 1 in the figure, and connections between two service agents are labelled 2 in the figure. The needs may be different, and because of this, different protocols may be used. In the case of e-mail, this is even more complex, because different protocols are used for sending mail from a user agent to a service agent, and for delivering mail from a service agent to a user agent. In some cases, the same protocol definition is used both for the protocol between user agent and service agent, and between two service agents. But in such cases, sometimes different elements in the protocols are used, so that in reality the protocol between user agent and service agent is not exactly the same as between two service agents.


Even when a protocol is used between two similar agents, such as between two service agents, one of the agents is usually the client and the other the server in that particular connection. The client is the agent which opens the connection and requests services from the server; the server is the agent which receives requests from the client and performs them (or refuses them). Thus, in a connection between two service agents in Figure 12, one of them may be the client during that particular connection, and the other may be the server. That is why the boxes in Figure 12 are labelled “User agent” and “Service agent” and not simply “Client” and “Server”.

1.8 Chaining, Referral or Multicasting

If a database is distributed on several servers, then the information which the client needs may not be available on a single server. Information may have to be fetched from another or multiple servers. There are three common architectures for this, chaining, referral and multicasting, as shown in Figure 13.

With chaining, the user agent connects to only one server. This server may connect to another server, and that server may connect to yet another server, to get the information requested. The information is then returned recursively the reverse way. With referral, the client connects to the first server. If this server does not have the requested information, it will tell the client to try another server. This is continued until a server with the requested information is found.

With multicasting, the same question is sent to several different servers, hoping that one of them has the answer. Multicasting can cost a lot, if you send a query to a number of servers for information which you only need from one of the servers. In such a case, it can be more efficient to organize the data structure so that information can be located without multicasting. Multicasting can however be very efficient if the same information is to be sent at the same time to many recipients. In this case, special multicasting features in the lower layers are used, so that the same information need only be sent once between routers, even if there are many users of the data on each router. An example of this is simultaneous video and audio broadcasting.

[Figure 13 diagrams, three panels. Multicasting: a Client sends the same query to several Servers. Referral (iterative): a Client contacts one Server after another. Chaining (recursive): a Client contacts one Server, which passes the query on through further Servers.]

Figure 13: Chaining, Referral and Multicasting

Chaining has an advantage compared to referral. This is that since the information is returned recursively through the servers, a server can cache the information, so that the next request can be handled from the cache instead of through chaining. Another advantage is that user agents often have slower connections than servers, and it is then more efficient with chaining, because fewer network operations have to be performed by the user agent through low-bandwidth connections.

An example of a system which uses both chaining and referral is the Internet Domain Name System, the DNS (see page 37). The DNS standards specify that referral is mandatory, chaining is optional. But in reality, chaining is used more often than referral, because of the advantages mentioned above.
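The difference between chaining and referral can be made concrete with a small in-memory model. The sketch below is an assumption-laden illustration, not a description of any real directory or DNS implementation: the class `DirectoryServer`, its method names and the single "next server" link are all invented here. It shows that with chaining the client talks to one server and intermediate servers can cache the answer, while with referral the client itself walks from server to server.

```python
class DirectoryServer:
    """Hypothetical server holding part of a distributed database."""

    def __init__(self, name, data, next_server=None):
        self.name = name
        self.data = data                # information stored on this server
        self.next_server = next_server  # where to look if it is not here
        self.cache = {}                 # answers learned through chaining

    def lookup_chaining(self, key):
        """Answer the query, asking the next server on the client's behalf."""
        if key in self.data:
            return self.data[key]
        if key in self.cache:
            return self.cache[key]
        if self.next_server is None:
            return None
        value = self.next_server.lookup_chaining(key)
        if value is not None:
            self.cache[key] = value     # the answer passes through, so cache it
        return value

    def lookup_referral(self, key):
        """Answer the query, or tell the client which server to try next."""
        if key in self.data:
            return ("answer", self.data[key])
        if self.next_server is not None:
            return ("referral", self.next_server)
        return ("unknown", None)


def client_lookup_with_referral(first_server, key):
    """Follow referrals; return the value and the servers the client contacted."""
    server, contacted = first_server, []
    while server is not None:
        contacted.append(server.name)
        kind, payload = server.lookup_referral(key)
        if kind == "answer":
            return payload, contacted
        server = payload if kind == "referral" else None
    return None, contacted
```

With three servers a, b and c in a row and the data stored on c, a chained lookup costs the client one connection (to a), whereas a referral lookup makes the client contact a, b and c in turn.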

1.9 Symmetric and Asymmetric Protocols

Nearly all application layer protocols are asymmetric. By this is meant that the language sent in one direction is different from the language sent in the other direction. Usually, when describing an asymmetric protocol, one of the agents is named “client” and one is named “server”. Note that even between two similar agents, such as between two service agents, the protocol can still be asymmetric. They are equal, but at a certain moment of time, the protocol is asymmetric. For example, when sending e-mail between two service agents, the client is the sender, which can send messages; the server is the receiving agent, which can refuse or accept messages.


The e-mail standard for the protocol between two service agents has a TURN command. The TURN command allows the two agents to change roles, so that the client becomes server and the reverse. The TURN command is however not used very much, and it might disappear from the standards in the future. And the TURN command does not make the protocol symmetric, because at a certain moment, the protocol is still asymmetric.

The reason for the TURN command was that earlier, many connections were made through dial-up phone connections. Since such connections are expensive and time-consuming to open, one connection could replace two with the TURN command. Today, however, almost all connections between service agents are through leased lines, so the TURN command is not needed any more.

1.10 Transfer of Responsibility

Protocols are specifications of which statements are sent between two agents, usually a client and a server. Protocols then specify what the client can send, and what the server can respond with. The interaction between client and server may, in some protocols, include several interactions sent back and forward.

An example of a common type of interaction, specified by a protocol, is the transfer of responsibility. This occurs in e-mail, where a store-and-forward technique is often used, as shown in Figure 14. The original sender sends the message to a local mail server close to the original sender. This local mail server sends the message to a local mail server close to the final recipient. And this server delivers the message to the final recipient. Sometimes, more than two transfer agents are involved on the route from the original sender to the final recipient.

With such a store-and-forward structure, it is very important that the message does not disappear into a “black hole”. At any moment of time, one computer may crash, or the connection between two computers may break. Such occurrences should not cause a message to disappear. To ensure this, each agent stores the message on disk in such a way that the message will still be there when the agent is restarted after a crash. This means that the responsibility to deliver the message is transferred between the agents. A typical protocol for such a transfer of responsibility may work in the following way:

[Figure 14 diagram: Original sender, Transfer agent, Transfer agent, Final recipient, connected in a chain.]

Figure 14: Store-and-forward of e-mail messages

1a. The client sends the message to the server.

1b. The server receives the message and stores it on non-volatile disk.

2a. The server then sends a response to the client that the message has been received.

2b. The client receives this response, and notes that the server has taken over responsibility for this message.

Until step 2b, the client continues to regard the message as unsent. It will thus, if needed, try to send the message once more, until it has received confirmation in step 2b that the server has taken over responsibility for the message. This means that there is very little risk that a message disappears. There is, instead, a small risk that a message is sent more than once.

The transfer of responsibility as described above consists of two interaction steps, one from the client to the server, and one from the server to the client. Some interactions use more than two steps.
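Why this scheme trades the risk of loss for a small risk of duplication can be seen in a short simulation. The following Python sketch is a simplified model under stated assumptions: the class `MailServer`, the function `send_until_acknowledged` and the way a lost response is simulated are invented for the illustration, and a list stands in for the non-volatile disk.

```python
class MailServer:
    """Hypothetical receiving agent; `stored` stands in for non-volatile disk."""

    def __init__(self):
        self.stored = []

    def receive(self, message):
        self.stored.append(message)   # step 1b: store before acknowledging
        return "OK"                   # step 2a: confirm reception


def send_until_acknowledged(server, message, lose_ack_on_attempts=(), max_attempts=5):
    """Resend until an acknowledgement arrives; return the number of attempts.

    `lose_ack_on_attempts` lists the attempts on which the response is lost
    on its way back, which simulates a broken connection.
    """
    for attempt in range(1, max_attempts + 1):
        ack = server.receive(message)           # step 1a
        if attempt in lose_ack_on_attempts:
            ack = None                          # the response never arrives
        if ack == "OK":                         # step 2b: responsibility moved
            return attempt
        # No confirmation: the client still regards the message as unsent.
    raise RuntimeError("message is still unsent after all attempts")
```

If the first response is lost, the client sends again and succeeds on the second attempt. The message was never lost, but the server now holds two copies of it.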

1.11 Identification

Another common kind of interaction is identification between two agents. For example, a client can tell the server its name, and the server wants to confirm that this is really the client it claims to be. Identification can consist of the following steps:

1. The client connects to the server and sends its name.

2. The server sends a string of random digits to the client.

3. The client encrypts this string, and sends the encrypted string back to the server.

4. The server decrypts the encrypted string, checks that it has received the same random digits it sent in step 2, and tells the client that identification has succeeded.

Since only the right client knows how to encrypt the random string in the right way, the server knows that the client is the one it claims to be.

Here, four interactions back and forward between client and server were needed.
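The four steps can be tried out in a few lines of Python. Note that the sketch below swaps in a different but related technique: instead of encrypting and decrypting the random digits, the client computes a keyed hash (HMAC) of them with a secret shared with the server, and the server recomputes the same value and compares. The function names and the assumption of a pre-shared key are choices made for this illustration only.

```python
import hashlib
import hmac
import secrets


def make_challenge(n_digits=16):
    """Step 2: the server picks a string of random digits."""
    return "".join(secrets.choice("0123456789") for _ in range(n_digits))


def client_response(shared_key: bytes, challenge: str) -> str:
    """Step 3: the client proves knowledge of the key without sending it."""
    return hmac.new(shared_key, challenge.encode("ascii"), hashlib.sha256).hexdigest()


def server_verify(shared_key: bytes, challenge: str, response: str) -> bool:
    """Step 4: the server recomputes the expected response and compares."""
    expected = client_response(shared_key, challenge)
    return hmac.compare_digest(expected, response)
```

Because the challenge is different every time, a response overheard on the network cannot simply be replayed in a later session.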

1.12 Transactions and Sessions

A session is a series of interactions back and forward between two agents, usually a client and a server. The interactions are performed one after another in time. A session often starts with identification (sometimes called “login” or “bind”), after that sometimes comes negotiation of capabilities, and then a number of transactions, and finally the end of the session.

Figure 15 shows the basic steps and state changes when transferring an e-mail message from one agent to another using the SMTP protocol. As seen in the figure, there are a number of states, and in each state, only certain commands are permitted. Such a protocol is called a stateful protocol. A protocol which has no states is called a stateless protocol.


[Figure 15 diagram: states and the transitions between them.]
1. State: Expects bind
   Transition: Bind = identification of client and server, capability negotiation
2. State: Expects sender
   Transition: Identification of sender
3. State: Expects recipient
   Transition: Identification of one recipient
4. State: Expects more recipients or content
   Transition: Transmission of content
5. State: Expects sender
   Transition: End of session

Figure 15: Session with changes of state

This figure shows (somewhat simplified) the changes of state during the transmission of an e-mail message using the SMTP protocol.

1. In the initial state, the server expects the client to identify itself. No other commands are allowed in this state.

2. When identification is ready, the server expects the address of the sender.

3. When the sender has been given, the server expects the address of the first recipient.

4. When a recipient address has been given, the server expects either an additional recipient address, or the message content.

5. When message content has been sent, another message can be sent. This is the same state as state 2.

Finally, a session ends with a command from the client to end the session. This can be given in any state.
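The states described above can be expressed as a small transition table. The sketch below is a simplified model in Python, not an SMTP implementation: the state names are invented here to match the figure, only the command names HELO, MAIL FROM, RCPT TO, DATA and QUIT are taken from the protocol, and replies from the server are left out.

```python
# For each state, the commands permitted in it and the state they lead to.
ALLOWED = {
    "EXPECTS_BIND": {"HELO": "EXPECTS_SENDER"},
    "EXPECTS_SENDER": {"MAIL FROM": "EXPECTS_RECIPIENT"},
    "EXPECTS_RECIPIENT": {"RCPT TO": "EXPECTS_MORE_RECIPIENTS_OR_CONTENT"},
    "EXPECTS_MORE_RECIPIENTS_OR_CONTENT": {
        "RCPT TO": "EXPECTS_MORE_RECIPIENTS_OR_CONTENT",
        "DATA": "EXPECTS_SENDER",   # after the content, a new message may start
    },
}


def next_state(state, command):
    """Return the state after `command`, or raise if it is not permitted."""
    if command == "QUIT":           # ending the session is allowed in any state
        return "CLOSED"
    try:
        return ALLOWED[state][command]
    except KeyError:
        raise ValueError(f"{command} is not permitted in state {state}") from None


def run_session(commands, state="EXPECTS_BIND"):
    """Apply a sequence of commands and return the final state."""
    for command in commands:
        state = next_state(state, command)
    return state
```

A stateless protocol would need no such table, since every request could be handled without knowing what came before it.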

Stateful protocols allow several interactions back and forward during a session. With other protocols, the client sends a transaction, gets a result back, and the connection is then terminated. Such protocols are stateless.


With some protocols, a session is kept open while waiting for the user to look at the results from previous interactions. Such waits can be very long. This is costly for a server, which may have to keep many sessions going at the same time. To reduce this cost, servers will often shut down the session if nothing has been sent for a certain time. This maximum wait time is called a timeout. If you have been using FTP, you may have noticed that many FTP servers will terminate the FTP session automatically, if you do not send any commands for a certain time.

To open and close a session will, however, also cost network and computer resources. If several interactions are needed to perform an action, it may be less costly to perform them after each other in the same session. But if the wait times are too long, it may be less costly to abort the session and start a new session.

To keep a session open and wait for a timeout is also costly for a server. It is more efficient if the client sends a command to abort the session, so that the server need not wait for any timeout.
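How a server might keep track of idle sessions is sketched below in Python. This is a minimal model with invented names (`Session`, `on_command`, `expire_if_idle`), and time is passed in as a plain number of seconds so that the behaviour can be followed by hand; a real server would read a clock instead.

```python
class Session:
    """Hypothetical server-side record of one open session."""

    def __init__(self, timeout_seconds, now):
        self.timeout_seconds = timeout_seconds
        self.last_activity = now
        self.open = True

    def on_command(self, command, now):
        """Note activity; an explicit QUIT closes the session at once."""
        self.last_activity = now
        if command == "QUIT":
            self.open = False

    def expire_if_idle(self, now):
        """Close the session if nothing has been sent for too long."""
        if self.open and now - self.last_activity > self.timeout_seconds:
            self.open = False
        return self.open
```

With a timeout of 300 seconds, a client that simply stops sending keeps its session occupying the server for the full 300 seconds, while a client that sends QUIT frees it immediately.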

1.13 Turn-around Time, Pipelining and Windowing

A protocol may use a session with a number of interactions back and forward between client and server. The client has to wait for the response from the server on the previous command, before the client can send the next command. This takes time, because the turn-around time to send a command and get the response back can be one or several seconds. The total time for a session with many such interactions will then be long.

Here are three methods to reduce these delays:

1. Specify more powerful commands, where more is done in one command, so that fewer interactions are needed.

2. Open several parallel connections. HTTP clients (web browsers) often keep four parallel connections for downloading the different parts of a web page (text, pictures, applets). Too many parallel connections are costly in resources for both the client and server, but with too few connections, dead time may occur when the client is waiting for data from all the connections.

3. In protocols which use many small interactions, such as SMTP and NNTP, the delay can be reduced with a method called pipelining or windowing. Sometimes the word streaming is also used for this, although the word streaming also has other uses (see page 29).

By pipelining is meant that the client can send the next command, without waiting for the response from the server on the previous command within the same session. Commands are sent and responses received asynchronously. This is shown in Figure 16. (Note: the OK responses are still sent; they are not shown in the figure to the left to make the figure less complex.) The advantage with this method is that the session is performed faster. The disadvantage is that the error handling, if one of the commands is not accepted by the server, will be more complex. Also, data may have been sent unnecessarily if a previous command is rejected by the server.

Pipelining is sometimes called windowing, when the number of commands sent in advance is limited. For example, if the window size is three, this means that after sending three commands, the client must wait for the response to the first command before sending the fourth command.

Pipelining should only be done when the protocol specification explicitly says that pipelining is allowed.

[Figure 16 diagrams, two panels, each with a CLIENT and a SERVER time line. Left panel: Wait for response before sending the next command. Right panel: Pipelining: Send commands without waiting for response.]

Figure 16: Pipelining reduces the time
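The saving can be estimated with simple arithmetic. The Python sketch below rests on a deliberately crude assumption, namely that the time to transmit a command is negligible compared to the turn-around time, so that a whole window of commands costs one turn-around. The function name and parameters are invented for this estimate.

```python
import math


def session_time(n_commands, turnaround, window=1):
    """Estimate the total time of a session.

    window=1 means waiting for each response before the next command;
    window >= n_commands means full pipelining. Transmission time is ignored.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if n_commands == 0:
        return 0.0
    rounds = math.ceil(n_commands / window)   # turn-arounds actually waited for
    return rounds * turnaround


if __name__ == "__main__":
    for window in (1, 3, 12):
        print(window, session_time(12, 1.0, window))
```

For twelve commands and a turn-around time of one second, the estimate is twelve seconds without pipelining, four seconds with a window of three, and one second with full pipelining.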

TCP itself supports pipelining. A TCP connection is actually two pipes, one in each direction, and the sending process can send more data than the receiving process can handle. TCP will just store the overflow, but can also tell the sending process to make a break in sending if its buffers are full. %%%check if this is true%%%

1.14 Terminating a Connection

A connection can be terminated if one of the two partners closes the connection. For example, the server can close the connection if there has been no activity for a certain timeout period. More reliable and efficient is if the client sends a command to the server, asking for the connection to be closed. Such commands are usually named EXIT or QUIT. The server then closes the connection. The advantage with this is that both agents know that the connection is closed, and no timeout is necessary.

Figure 17 shows how a session is ended for HTTP 1.0 (World Wide Web) and SMTP (e-mail). Note that in both cases, the server knows when to end the connection, and no timeout is needed.

[Figure 17 diagrams, two panels, each with columns labelled USER, APPL and TCP on both sides of the connection.
HTTP 1.0 GET Operation: the user's "Get info" is sent as <GET>; the server returns <Info> and then closes the connection (TCP-close), which the client sees as EOF.
SMTP Sending a message: the user's "Send a message" is sent as HELO, MAIL FROM, RCPT TO, DATA, answered with 250, 250, 250, 250; the client then sends QUIT, followed by TCP Close on both sides.]

Figure 17: How a connection is ended in HTTP 1.0 and SMTP. In HTTP, the server closes the connection as soon as TCP tells it that all data has been transferred. In SMTP, the client sends a QUIT command, which tells the server to close the connection. Note that no timeout wait is necessary in either case.

1.15 Streaming

By streaming is meant that results are displayed for the user, before all data has been received by the user agent. For example, the top of a web page may be shown, while continuing to download the rest of the page.


The

wor

d st

ream

ing

is a

lso

som

etim

es u

sed

in

anot

her

mea

ning

, to

den

ote

wha

t in

thi

s bo

ok i

s

call

ed p

ipel

inin

g (s

ee p

age

28).

The

con

cept

s ar

e

conn

ecte

d, s

ince

pip

elin

ing

is a

way

of

assi

stin

g

stre

amin

g.

In o

rder

to

mak

e st

ream

ing

wor

k be

tter

, the

use

r

agen

t m

ust

get

a su

itab

le s

elec

tion

of

data

whi

ch

give

s in

itia

l in

form

atio

n ab

out

a w

eb p

age.

Thi

s is

one

reas

on w

hy w

eb b

row

sers

oft

en o

pen

seve

ral

conn

ecti

ons

at

the

sam

e ti

me;

th

ey

can

then

dow

nloa

d im

ages

in

the

begi

nnin

g of

a l

arge

web

page

, w

itho

ut w

aiti

ng f

or t

he e

nd o

f th

e do

wnl

oad

of th

e bo

ttom

of

the

web

pag

e.

The creator of data can facilitate streaming by sending data in a format where some data can be displayed before all of it has arrived. For example, many web browsers are not able to show an HTML table before the end of the table has arrived. Even if they can display the table earlier, new information later in the table may force them to reformat it. It is therefore often not a good idea to start a web page with a long table. Instead, one can have two tables: a small table for the first window, and one or more additional tables for the rest of the web page.

Another method of supporting streaming, which is used with pictures, is to first send, for example, every 4th line, then every 2nd line, then the remaining lines. This allows the web browser to show the picture first in a blurred form, and then sharper and sharper. This method of sending a picture is called interlacing.
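The three-pass sending order can be sketched as follows. This is only an illustration of the idea and not the scheme of any particular image format; the function name `interlaced_order` is made up for the example.

```python
def interlaced_order(height):
    """Return row indices in the order they would be sent:
    every 4th row, then every 2nd row not yet sent, then the rest."""
    first_pass = list(range(0, height, 4))    # rows 0, 4, 8, ...
    second_pass = list(range(2, height, 4))   # rows 2, 6, 10, ...
    third_pass = list(range(1, height, 2))    # all remaining (odd) rows
    return first_pass + second_pass + third_pass


print(interlaced_order(8))  # [0, 4, 2, 6, 1, 3, 5, 7]
```

After the first pass the receiver has a quarter of the rows and can show a coarse picture; each later pass fills in the gaps.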


In some cases, it is not possible to know the full content of a document while sending it. When, for example, sending simultaneous video and audio, the end of the transmission has not yet happened when the beginning is sent. In such a case, recipients often want to see the beginning of the broadcast before its end, and then streaming protocols are needed.

1.16 Intermediaries

Sometimes, the communication between two agents is passed through one or more intermediary computers. There are three main kinds of intermediaries (Figure 18):

[Diagram with the labels Client, Tunnel, Proxy, Gateway and Server]

Figure 18: Intermediaries between client and server

Proxy: A server which caches data, so that some requests from the client can be answered directly by the proxy. Proxies also sometimes control and limit what is allowed, for various regulatory or security reasons. Such regulating proxies are called firewalls.


Gateway: A gateway is usually placed between two networks with different technology, and may transform the format of data from the format in one of the networks to the format in the other network. For example, an internal e-mail network within a company may use a number of enhanced features, which have to be mapped onto the more restricted Internet mail protocols for mail going outside the company.

Tunnel: A tunnel is a way of transferring information between two networks, which both use the same protocol, through some intermediate network which uses another protocol. At the entrance of the tunnel, the data is transformed to a format which is only used for transportation. At the end of the tunnel, the data is transformed back again to the original format.

The advantage of tunnels, as compared to gateways, is that no information is lost. With gateways, only the information supported by the protocols at both sides can be conveyed. The advantage of gateways, as compared to tunnels, is that you can reach agents using different protocols at either end.
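The difference can be illustrated with a small sketch. It assumes that a message is a simple dictionary of fields; the function names, the field names and the choice of base64 as transport format are all invented for the illustration.

```python
import base64
import json


def tunnel_wrap(message: dict) -> str:
    """Entrance of the tunnel: encode the whole message for transport."""
    return base64.b64encode(json.dumps(message).encode("utf-8")).decode("ascii")


def tunnel_unwrap(transport_form: str) -> dict:
    """End of the tunnel: restore the original message unchanged."""
    return json.loads(base64.b64decode(transport_form).decode("utf-8"))


def gateway_convert(message: dict, supported_fields: set) -> dict:
    """Gateway: keep only the fields the other network's protocol supports."""
    return {key: value for key, value in message.items() if key in supported_fields}


internal = {"to": "johnp", "body": "Hello", "priority": "urgent"}
print(tunnel_unwrap(tunnel_wrap(internal)) == internal)  # True
print(gateway_convert(internal, {"to", "body"}))         # {'to': 'johnp', 'body': 'Hello'}
```

The tunnel returns exactly what went in, but both ends must speak the same protocol. The gateway loses the unsupported field, but its output can be read by an agent using the other protocol.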

1.17 Names and Addresses

Unique names are useful in many applications. The two most well-known unique names on the Internet are WWW addresses (URLs) and e-mail addresses. Both of them are globally unique, meaning that no two objects on the Internet have the same name. Both are also addresses: they can be used to locate an object. The WWW address is used to find a web page, the e-mail address to send a message to its recipient.

Sometimes, abbreviated names are used which are unique only within a certain domain. For example, if a user has the e-mail address “johnp@honeymelons.com”, then a message inside this company sometimes only needs to be addressed to “johnp”. It is however advisable to translate “johnp” to “johnp@honeymelons.com” as soon as possible. Otherwise there is a risk that a message with “To: johnp” in the header is inadvertently sent to the Internet, where the short, non-unique form of the address will not work.
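A minimal sketch of such an early translation is shown here. The function name `qualify` is hypothetical, and the rule "no @ means a local name" is an assumption made for the example.

```python
LOCAL_DOMAIN = "honeymelons.com"  # example domain taken from the text


def qualify(address: str, local_domain: str = LOCAL_DOMAIN) -> str:
    """Expand an abbreviated local name to a globally unique address."""
    if "@" in address:
        return address  # already fully qualified, leave unchanged
    return f"{address}@{local_domain}"


print(qualify("johnp"))                  # johnp@honeymelons.com
print(qualify("johnp@honeymelons.com"))  # johnp@honeymelons.com
```

A mail system would typically apply such a step when the message is submitted, before it can leave the company network.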

Unique names are used to locate objects, but they are also used to recognize duplicates of the same object, as tools for tracing the transfer of objects, and as tools for handling links between objects.

1.17.1 Domain method of allocating globally unique names

Most unique names are created using some kind of hierarchical structure. The domain name “dcs.ed.ac.uk”, for example, represents the department of computer science (dcs) within Edinburgh University (ed) within the academic community (ac) within the United Kingdom (uk). And an e-mail address “mary.smith@dcs.ed.ac.uk” can represent a certain person within this department.

The world
    uk (United Kingdom)
        ac.uk (Academic Community)
            ed.ac.uk (Edinburgh University)
                dcs.ed.ac.uk (Department of Computer Science)
                    Mary.Smith@dcs.ed.ac.uk (Mary Smith at the dcs department)
                    Other e-mail addresses at the dcs department
                Other Edinburgh university departments
            Other universities within the United Kingdom
        Other communities within the United Kingdom
    Other countries

Figure 19: Hierarchical domain names

The main advantage of such hierarchical names is that the creation of new unique names can be decentralised. The “dcs” department at Edinburgh University can itself create new globally unique names, as long as these names end with “dcs.ed.ac.uk”. And Edinburgh University can itself create globally unique names for its departments, as long as the names end with “ed.ac.uk”.

Hierarchical names are also easy for humans to understand, and can be used to guide the lookup of a name in a data base. If the names are hierarchical, the data base can be distributed. For example, one server can store the names in Edinburgh University; the end of the domain name, “ed.ac.uk”, can then direct a retrieval to the Edinburgh University name data base. This means that no inefficient multicasting is necessary to look up a domain name.
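How the end of a name can direct a retrieval may be sketched like this. The table of data bases and the function name are hypothetical, and the real Domain Name System is more elaborate; the sketch only shows the principle of choosing the most specific matching suffix.

```python
# Hypothetical table: which name data base is responsible for which suffix.
NAME_DATABASES = {
    "uk": "uk name data base",
    "ac.uk": "ac.uk name data base",
    "ed.ac.uk": "Edinburgh university name data base",
}


def responsible_database(domain_name: str) -> str:
    """Pick the data base whose suffix is the longest match for the name."""
    labels = domain_name.lower().split(".")
    # For "dcs.ed.ac.uk" try "dcs.ed.ac.uk", "ed.ac.uk", "ac.uk", then "uk".
    for start in range(len(labels)):
        suffix = ".".join(labels[start:])
        if suffix in NAME_DATABASES:
            return NAME_DATABASES[suffix]
    raise LookupError(f"no data base known for {domain_name}")


print(responsible_database("dcs.ed.ac.uk"))  # Edinburgh university name data base
```

Because the name itself says where to look, a single query to the right data base replaces a broadcast to all of them.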

Ideally, a name should be unique not only in space but also in time. No object should ever get the same name as another object has already had. Schemes which ensure this are however not much used. One such scheme is called Object Identifiers. It is sometimes used to create unique names for data types and other protocol elements. The Message-IDs of e-mail messages are usually also designed to be unique in both time and space.

For object identifiers, the uniqueness is guaranteed by the registration procedure. For e-mail Message-IDs, the time when the message was sent is usually included in the Message-ID, to ensure that it will be unique for an unlimited time in the future.
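One possible way of building such an identifier is sketched here. The exact layout (time stamp, counter, domain) and the function name `make_message_id` are assumptions made for the illustration; real mail systems use many different layouts.

```python
import itertools
import time

_counter = itertools.count(1)  # separates messages created within the same second


def make_message_id(domain, now=None):
    """Build an identifier from the sending time, a counter and the domain.

    `now` is seconds since the epoch; None means the current time.
    """
    stamp = time.strftime("%Y%m%d%H%M%S", time.gmtime(now))
    return f"<{stamp}.{next(_counter)}@{domain}>"


print(make_message_id("dcs.ed.ac.uk"))
```

The domain part gives uniqueness in space and the time stamp gives uniqueness in time. Python's standard library offers a ready-made function, `email.utils.make_msgid`, for the same purpose.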

For ordinary web addresses, however, uniqueness in time is not guaranteed. If a company named Sunshine Flowers obtains the domain name sf.com, and that company goes bankrupt, the name sf.com may be sold to a company named Sexy Feet. And this may be embarrassing to those who have put links to www.sf.com in their web pages.

Some common types of unique names on the Internet are:


Name type | Usage | Example
Domain names | Names of organisations, hosts and servers | dcs.ed.ac.uk, www.dcs.ed.ac.uk
E-mail addresses | Names of e-mail senders and recipients | Mary.Smith@dcs.edu.ac.uk
Message-IDs | Names of e-mail messages. Used to recognize duplicates, and handle reply links between messages. | 1999-07-23-131643*Mary.Smith@dcs.edu.ac.uk
IP address | Physical address of a host | [129.215.160.98] (IP addresses are actually 32-bit words in IP4 and 128-bit words in IP6, but they are usually written in the format above)
URL | A combination name which can be used for most kinds of names, but mostly used as addresses for web pages | http://cmc.dsv.su.se/iaps/

1.17.2 Hash Code Method of Creating Globally Unique Names

An alternative to the domain method is to give an object a globally unique name by computing a hash code of the object. There are methods of creating hash codes which make it very improbable that two different objects will have the same hash code. The most commonly used method for this is the MD5 [2] algorithm for computing hash codes. Note that a hash code will only indicate that two objects are identical, not that they are the same object. Before computing a hash code, objects sometimes need to be converted to a canonical format. For example, different platforms indicate line breaks in a text with either only Carriage Return (CR), only Line Feed (LF), or a Carriage Return followed by a Line Feed (CRLF). The hash code may be different if two documents differ in their end-of-line encodings. The algorithm may then specify that all line breaks must be converted to CRLF before computing the hash code.
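The conversion to a canonical format before hashing can be sketched as follows, using the MD5 function from Python's standard `hashlib` module. The function name `canonical_md5` and the choice of UTF-8 as character encoding are assumptions made for the example.

```python
import hashlib


def canonical_md5(text: str) -> str:
    """MD5 hash code of a text after converting all line breaks to CRLF."""
    # First unify CRLF and lone CR to LF, then turn every LF into CRLF.
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    canonical = unified.replace("\n", "\r\n")
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


lf_text = "line one\nline two\n"
crlf_text = "line one\r\nline two\r\n"
cr_text = "line one\rline two\r"
print(canonical_md5(lf_text) == canonical_md5(crlf_text) == canonical_md5(cr_text))  # True
```

Without the canonical step the three texts would get three different hash codes, although a human reader would regard them as the same document.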

It is also important which parts of an object are included when computing a hash code. For example, if the hash code is only computed on the body of a message, two messages with different headers may be regarded as identical, which can cause serious problems. Example: A message saying “Oh, I love him” may have a very different meaning if it is a reply to a message saying “Do you like Saddam Hussein” or to a message saying “Do you like Jesus”.

1.17.3 User-friendly Names

Names which are globally unique will never be quite user-friendly. We are accustomed to referring to objects with non-unique names like “Eliza Clark” or “The London office”. People will still know what we mean, because of the context in which we use them.

Computers are not very good at using context in that way. Instead, objects on the Internet often have two names, one user-friendly name and one globally unique name. The header of an e-mail message may, for example, look as in Figure 20, with both a user-friendly name and a globally unique e-mail address. Another example is the title of a web page (user-friendly) and its URL (globally unique).

From: Eliza Clark <[email protected]>

(In the figure, “Eliza Clark” is labelled as the user-friendly name and the address in angle brackets as the globally unique e-mail address.)

Figure 20: Use of names in e-mail headers

1.18 The Domain Name System

The Domain Name System (DNS) is a distributed data base which converts domain names to IP addresses. For example, you can ask the DNS for the IP address of the domain “dcs.ed.ac.uk” and the DNS will return “[129.215.160.98]”.

The DNS is probably the most commonly used service on the Internet. Whenever a program uses a domain name, for example in “http://www.ed.ac.uk/” or “elizac@foo.bar.net”, the first step in locating the object is to convert the domain name, “www.ed.ac.uk” in the first example and “foo.bar.net” in the second example, to an IP address. This IP address is then used to physically locate the host when a network connection is established to retrieve a web page or send an e-mail message.
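The first half of that step, picking the domain name out of a URL or an e-mail address, might look like this. The function name `domain_of` is made up for the example, and the test for "://" is a simplification.

```python
from urllib.parse import urlsplit


def domain_of(name: str) -> str:
    """Return the domain-name part of a URL or an e-mail address."""
    if "://" in name:
        return urlsplit(name).hostname  # host part of the URL
    return name.rsplit("@", 1)[1]       # everything after the last "@"


print(domain_of("http://www.ed.ac.uk/"))  # www.ed.ac.uk
print(domain_of("elizac@foo.bar.net"))    # foo.bar.net
```

The resulting name can then be handed to a resolver, for example Python's `socket.gethostbyname`, which performs the actual DNS query.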

Figure 21 shows a small excerpt from the conceptual structure of the DNS.

Conceptually, a search for the IP address of www.dsv.su.se should start at the international root server, and then be directed from there to the .se server, from there to the .su.se server, and from there to the .dsv.su.se server. In reality, it does not work like that, since the top-level name servers would then be severely overloaded. Most name servers keep caches of the addresses of the most commonly used names, including the top-level domains, so that they only have to connect to the top-level domains once or twice a day to verify their cache. Also, all name servers are at least duplicated, so that if one is down, another can replace it. The higher-level name servers are duplicated more than twice to distribute load and increase reliability.
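The effect of caching on the load of the root server can be shown with a toy model. Everything in it is invented for the illustration: the server names, the table `ZONES`, the class `Resolver` and the address, which is taken from a range reserved for documentation. Cache expiry is left out.

```python
# Toy delegation data: each server knows whom to ask next, or the final address.
ZONES = {
    "root": {"se": "se-server"},
    "se-server": {"su.se": "su-server"},
    "su-server": {"dsv.su.se": "dsv-server"},
    "dsv-server": {"www.dsv.su.se": "192.0.2.10"},
}


class Resolver:
    def __init__(self):
        self.cache = {}        # domain suffix -> server responsible for it
        self.root_queries = 0  # how often the root server was contacted

    def lookup(self, name):
        labels = name.split(".")
        # For www.dsv.su.se: ["se", "su.se", "dsv.su.se", "www.dsv.su.se"]
        suffixes = [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]
        server, start = "root", 0
        # Begin at the most specific server already known from the cache.
        for i in range(len(suffixes) - 2, -1, -1):
            if suffixes[i] in self.cache:
                server, start = self.cache[suffixes[i]], i + 1
                break
        for suffix in suffixes[start:]:
            if server == "root":
                self.root_queries += 1
            answer = ZONES[server][suffix]
            if suffix == name:
                return answer            # final answer: the address
            self.cache[suffix] = answer  # remember the referral
            server = answer


resolver = Resolver()
print(resolver.lookup("www.dsv.su.se"))  # 192.0.2.10
print(resolver.lookup("www.dsv.su.se"))  # 192.0.2.10
print(resolver.root_queries)             # 1
```

The second lookup goes straight to the server for dsv.su.se, so the root server is contacted only once.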

International root server
    U.S. server for the .us domains
    U.S. military server for the .mil domains
    Swedish server for the .se domains
        Stockholm University for the .su.se domains
            DSV department for the .dsv.su.se domains
    International server for the .com domains
        Microsoft server for the .microsoft.com domains

Figure 21: Conceptual structure of the DNS server hierarchy


The DNS uses caching, chaining and referral (see “Chaining, Referral or Multicasting” on page 20).

One peculiar effect has been that small countries with few Internet servers sell their domain name structures to organisations outside their own country. A well-known example is the “.nu” domain, which belongs to the small island of Niue in the Pacific, but whose domain names have become popular all over the world. I myself have registered the domain “palme.nu” for my family.

1.18.1 MX records

The DNS actually consists of two data bases stored together: the so-called A records, which are used to look up Internet hosts (for example to locate a web page), and the MX (Mail Exchange) records, which are used to look up e-mail servers. The reason for this is that companies often have a central e-mail server, to which all e-mail is to be delivered, and which may be different from the hosts used for other services.

E-mail is also sometimes sent through store-and-forward. Thus, a look-up in the DNS for an MX record may return a list of hosts in priority order. The mail should be sent to the first host, but if this host is not available, mail can be sent to the second host. The secondary host will usually forward the mail to the primary host, but it may also have the capability to deliver the mail to its final recipient.

Since e-mail can be sent through store-and-forward, a server which receives an e-mail message may contact another server and send the message to this server. To find out which host to send the message to, the intermediate server may also use the DNS. There is then a risk that an e-mail message will be sent forever in a loop, back and forth between two servers. If server A finds that B can handle this e-mail address, and server B finds that A can handle this e-mail address, such a loop may occur.

To avoid this, the DNS assigns a priority to each MX record. Suppose you look for the MX record for the domain “cs.edu.ac.uk”. You may then be given the following table:

8 mailrelay.ed.ac.uk
5 mailhub.dcs.ed.ac.uk
6 mh2.dcs.ed.ac.uk

A host is not allowed to forward a message from a host with a lower priority number in the DNS to a host with a higher priority number. This will protect against mail forwarding loops.
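The rule can be sketched in a few lines. The sketch assumes that the host with the lowest priority number is the one to try first; the function name `delivery_candidates` and its parameters are made up for the example.

```python
# (priority number, host) pairs from the example table above.
MX_RECORDS = [
    (8, "mailrelay.ed.ac.uk"),
    (5, "mailhub.dcs.ed.ac.uk"),
    (6, "mh2.dcs.ed.ac.uk"),
]


def delivery_candidates(mx_records, current_host=None):
    """Hosts to try, ordered by priority number (lowest number first).

    If the current host is itself in the list, only hosts with a strictly
    lower priority number remain, so a message can never travel backwards.
    """
    ordered = sorted(mx_records)
    own = [prio for prio, host in ordered if host == current_host]
    if own:
        ordered = [(prio, host) for prio, host in ordered if prio < own[0]]
    return [host for prio, host in ordered]


print(delivery_candidates(MX_RECORDS))
# ['mailhub.dcs.ed.ac.uk', 'mh2.dcs.ed.ac.uk', 'mailrelay.ed.ac.uk']
print(delivery_candidates(MX_RECORDS, current_host="mh2.dcs.ed.ac.uk"))
# ['mailhub.dcs.ed.ac.uk']
print(delivery_candidates(MX_RECORDS, current_host="mailhub.dcs.ed.ac.uk"))
# []
```

An empty list means that the current host has nowhere further to forward the message and must deliver it itself. Since every hop moves to a lower number, no loop can arise.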

1.19 Top-level domains

The top-level domains (the last domain in the domain names in DNS) are either two-letter country codes or more-than-two-letter codes representing different kinds of organisations. For example, the domain name “dsv.su.se” has the top-level domain “se”, which is the country code for Sweden, and the domain name “ietf.org” has the top-level domain “org”, which represents non-commercial organizations.

Each country has an organisation responsible for distributing the second-level domain names for that country. It is thus this organisation which has given Stockholm University the domain name “su.se”.

The country codes are taken from an ISO standard for country codes, the same codes which are used in car registration numbers, international postal codes, etc. There is however one exception. The United Kingdom of Great Britain has the ISO country code “gb” but has the DNS top-level domain “uk”. Both are abbreviations of different parts of the full country name. The reason for this is historical. It started that way, and it was then too late to change it. Possibly the difference is also because this country is known for wanting to do things its own way. They drive on the left, while most other European countries drive on the right, and for a long time they insisted on writing the domains in reverse order. Thus, instead of “dcs.ed.ac.uk” they wrote “uk.ac.ed.dcs”. This caused lots of problems, but nowadays they have changed to the same order as the rest of the world uses.

The set of more-than-two-letter codes may be extended. Here are the codes decided on when this was written (July 1999):

COM     commercial entities
EDU     4-year colleges and universities
NET     organizations directly involved in Internet operations, such as network providers and network information centers
ORG     miscellaneous organizations that don't fit any other category, such as non-profit groups
GOV     United States Federal Government entities
MIL     United States military
FIRM    Businesses
STORE   Online stores and malls
WEB     Web-related organizations
ARTS    Cultural and entertainment organizations
REC     Recreational organizations like park districts
INFO    Organizations who are primarily sources of information
NOM     Individual, personal domain names

The first seven domains in this list have been in use for many years, the rest have been added recently.

The distribution of domain names ending in these generic domains has been a very controversial political issue. A new organization, ICANN (the Internet Corporation for Assigned Names and Numbers), has been formed to manage it. ICANN will also take over the task of registering other kinds of unique identifiers for the Internet, which was previously performed by IANA (Internet Assigned Numbers Authority). When you read this, ICANN has probably already started operation and taken over the tasks of IANA.

1.20 The old versus new problem

Internet protocols are usually implemented on many different computers, which are co-working using the protocols. This can make it difficult to update to new versions of the protocols. Such a change might require that all the existing agents are replaced at exactly the same time. With standard protocols, like those for e-mail and web access, there are millions of agents, and replacing all of them with new versions at one time is very difficult.

[Diagram with the labels Old client software, Old server software, New client software and New server software]

Figure 22: Difficulty of co-working between different versions of protocols

If different companies develop the various agents, this will of course be even more difficult. Some companies may want to support new features in a new version of the protocol, other companies may not want to do this.

It is possible to design a protocol so that it is prepared for future extensions. If this is done in a suitable way, it may be simpler to add new features to a standard with fewer co-working problems.

Many Internet standards had few, or badly planned, features to support extensions in their first versions. This forced developers of new versions of the standards to use very messy and untidy solutions.

A horror example is the MIME standard for sending binary attachments in e-mail. Since the old e-mail standards only supported 7-bit characters in e-mail messages, anything else had to be encoded in a 7-bit format.

Here is an overview of good and bad methods of solving these problems.

1.20.1 Version number

Client: I am using version 1.1
Server: I am using version 1.0
Both: Continued interactions using version 1.0

Figure 23: Version number method

With the version number method (see Figure 23), changes in a protocol mean that the protocol is given a new version number. Communication between client and server starts with an exchange of version numbers. After this, both agents are expected to adhere to the lower of the two versions announced. This means that all agents have to support both the new protocol and many or all older versions, which other agents may be using.
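The agreement on the lower version can be sketched as follows. The sketch assumes version numbers written as digits separated by dots; the function names are made up for the example.

```python
def parse_version(text):
    """Turn "1.1" into (1, 1) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def negotiate_version(client_version, server_version):
    """Both sides fall back to the lower of the two announced versions."""
    return min(client_version, server_version, key=parse_version)


print(negotiate_version("1.1", "1.0"))   # 1.0
print(negotiate_version("1.10", "1.9"))  # 1.9
```

The second call shows why the numbers are compared part by part: as plain text, "1.10" would sort before "1.9".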


The main disadvantage of the version number method is that if you make two independent extensions, there is no way of indicating that you can use only the first, or only the second extension. Generally, version numbers are used for major changes. And major changes are difficult to handle. Some protocols have version numbers, but never get any new version number. An example of this is MIME. All MIME e-mail headers must contain the statement

MIME-Version: 1.0

but there may never be any new version of MIME. This statement is, however, not meaningless. It is a way of saying that this message is in MIME format. So it cannot be removed from the standard.

1.20.2 Feature Selection Method

Client: I can do extension A, B and C
Server: I can do extension B, C and F
Both: Continued interactions using extensions B and C only

Figure 24: Feature selection method


With feature selection, interaction starts with the client and the server each listing which features they support. After that, execution continues using only features supported by both (plus certain mandatory basic functions which do not have a feature name).
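In its simplest form the selection is a set intersection, as in this sketch. The function name `negotiate_features` is made up for the example; the extension names are those of Figure 24.

```python
def negotiate_features(client_features, server_features):
    """Only extensions announced by both sides may be used afterwards."""
    return set(client_features) & set(server_features)


common = negotiate_features({"A", "B", "C"}, {"B", "C", "F"})
print(sorted(common))  # ['B', 'C']
```

Unlike a single version number, the two lists allow any combination of independent extensions to be agreed on.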

The advantage of this method is that an agent can choose to support any set of features, and still co-work with another agent which has chosen any other set of features. Co-working will of course work better if the two agents have more features in common.

Another advantage of feature selection is that standards-developing organizations can specify new extensions without being sure that they will be widely accepted. The implementors on the market can choose whether to support a new feature. There is however a risk in standardizing too many doubtful features, since the more features are implemented by some, but not all, vendors, the less well will their software be able to communicate.

With feature selection, and if the features are given names which are more than one character long, it is even possible for different implementors to specify and develop new features independently of each other. If they do this, however, there is a risk that two implementors will choose the same name for two incompatible features. This issue is discussed more in the next section.


1.20.3 Feature naming

With many methods of allowing extensions to standards in an orderly manner, new features are identified by some kind of feature name. There is then a need to prevent two different implementors from defining two different features but giving them the same name. If they do this, and their applications are to interact, serious problems can occur. To avoid this problem, there is a need to generate globally unique names for new features.

The

re a

re t

wo

met

hods

com

mon

ly u

sed

to d

o th

is.

One method is to use a hierarchical naming space. For example, a feature can be identified by a URL. This URL can refer to a web object which describes this feature. Another well-known method of obtaining globally unique feature tags is the “object identifiers” standardized as part of the ASN.1 standard. Object identifiers have got widespread use even when ASN.1 is not used. The advantage with object identifiers is that they are globally unique in both space and time, and that anyone can obtain a new object identifier. Since object identifiers consist of a series of numbers, not a string of characters, they do not imply any meaning in the choice of words, and there is thus not so large risk of conflicts as there has been with domain names (when one company objects to someone else using the name of that company as a domain name.)
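As a small illustration of object identifiers being a hierarchical series of numbers, the following sketch parses dotted-decimal identifiers and checks whether one lies below another in the naming tree. The function names parse_oid and is_under are hypothetical. The arc 1.3.6.1.4.1 (the IANA private enterprise arc) is used as example data, and the number 99999 below it is made up.

```python
def parse_oid(text):
    """Turn a dotted-decimal OID such as '1.3.6.1' into a tuple of ints."""
    parts = text.split(".")
    if any(not p.isdigit() for p in parts):
        raise ValueError(f"not a dotted-decimal OID: {text!r}")
    return tuple(int(p) for p in parts)


def is_under(oid, arc):
    """Return True if `oid` equals `arc` or lies below it in the tree."""
    oid_t, arc_t = parse_oid(oid), parse_oid(arc)
    # Compare number by number, so that 1.30 is not mistaken for a child of 1.3.
    return oid_t[:len(arc_t)] == arc_t


# Hypothetical identifier registered below the private enterprise arc.
print(is_under("1.3.6.1.4.1.99999.1", "1.3.6.1.4.1"))
```

Comparing tuples of numbers rather than strings reflects the point made above: the identifier carries no meaning in words, only a position in the hierarchy.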


Another method is to use a registration authority. Anyone who creates a new feature registers its name with the registration authority. For the Internet, IANA (to be taken over by ICANN) is the main organisation for registering such names. IANA has a large number of different feature lists, which it handles for this method. Here are some of the most important feature lists handled by IANA:

Character sets: See page %%
HTTP parameters: HTTP transfer compression methods
IMAP 4 capabilities: Feature selection for IMAP, see page %%
Languages: Tags indicating the language of a document
Media feature tags: Tags indicating features of print and display media, such as paper size, color depth, etc.
Media types: See page %%
Port numbers: See page 12
Transfer encodings: Encodings of binary and 8-bit characters in MIME, see page %%
URL schemes: Values allowed in the initial string (before the “:”) in URLs to indicate various access protocols, see page %%

Some features allow anyone, including a company or another standards organisation, to register a value. Other features only accept values defined by official standards.

Allowing non-standard features has some problems. Just by registering a feature name, it is given a kind of official backing. Standards organisations often want to make some kind of checking of how reasonable new features are. But if they do such checking, the allowed features become a kind of standard. And this may not be good either, because the new features have not been checked thoroughly enough for an accepted standard.


One solution to this dilemma, which has become more common in recent years, is to precede non-standard features with “vnd.” followed by the vendor name. A vendor name as a prefix of a feature clearly says that this is a vendor-specific feature, not a standard feature. Example: “application/vnd.ibm.modcap” is a media type registered by IBM for “Mixed Object Document Content Architecture”. Since this method has not always been used, some early vendor-specific formats do not have such names, for example the media type “application/pdf” for the PDF document format used by Adobe Acrobat.

A common practice, explicitly specified in some standards, is to allow non-registered feature tags, if they begin with “X-”. This was intended to allow experiments with new feature tags before they are registered. The experience with this method, however, is not good. Some experimental feature tags started out with “X-” and became so widely accepted, that they had to keep the “X-” even when they became so common that they should have been standardized.
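The two prefix conventions described above can be recognized mechanically. The following is only a sketch under stated assumptions: the function name classify_feature_name and the three labels it returns are hypothetical, and "X-Mailer" is used merely as an example of an "X-" name.

```python
def classify_feature_name(name):
    """Classify a feature name by its prefix.

    Returns a (kind, vendor) pair. `kind` is 'vendor', 'experimental' or
    'unprefixed'; `vendor` is only set for names starting with 'vnd.'.
    """
    # For media types such as 'application/vnd.ibm.modcap', look at the subtype.
    tag = name.split("/", 1)[-1]
    lowered = tag.lower()
    if lowered.startswith("vnd."):
        vendor = tag[4:].split(".", 1)[0]
        return ("vendor", vendor or None)
    if lowered.startswith("x-"):
        return ("experimental", None)
    return ("unprefixed", None)


for example in ("application/vnd.ibm.modcap", "application/pdf", "X-Mailer"):
    print(example, classify_feature_name(example))
```

Note that an unprefixed name such as application/pdf cannot be classified this way; as the text says, some early vendor-specific formats simply do not carry the prefix.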

1.20.4 Built-in Extension Points

One method of making extensions easier is to provide built-in extension points in a protocol. A built-in extension point is a part of a protocol, where extensions can be added in such a way that old implementations, which do not understand the new extensions, can at least recognize that they are extensions. Some standards have such extension points built into the standard in specified places. But even standards without such explicitly specified built-in extension points, may still in their practical usage allow extensions at certain points. Two examples:

E-mail header fields

E-mail headers contain standardized fields like “To:”, “From:” and “Date:”. But anyone can add their own header fields. The standard for e-mail only allows such non-standard header fields if they begin with “X-”, but there are common non-standard e-mail header fields which do not begin with “X-”, such as “Mailer” or “Return-Receipt-To”. Some of these are not much liked by standards making organisations, but they do exist and can be used between agents which support them in the same way.
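To make the header field extension point concrete, here is a minimal sketch using Python's standard email package. It sorts the header field names of a message into standardized fields, "X-" fields and other non-standard fields. The set STANDARD_FIELDS is an assumed and deliberately incomplete list, and the sample message and its addresses are invented for illustration.

```python
from email import message_from_string

# Assumed, deliberately incomplete set of standardized field names.
STANDARD_FIELDS = {"to", "from", "date", "subject", "cc", "message-id"}


def split_header_fields(raw_message):
    """Group header field names into standard, 'X-' and other non-standard."""
    msg = message_from_string(raw_message)
    groups = {"standard": [], "x-prefixed": [], "other": []}
    for name in msg.keys():
        lowered = name.lower()
        if lowered in STANDARD_FIELDS:
            groups["standard"].append(name)
        elif lowered.startswith("x-"):
            groups["x-prefixed"].append(name)
        else:
            groups["other"].append(name)
    return groups


# Invented sample message.
SAMPLE = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Date: Mon, 12 Feb 2001 10:00:00 +0100\n"
    "X-Mailer: ExampleMail 1.0\n"
    "Return-Receipt-To: alice@example.com\n"
    "\n"
    "Hello Bob\n"
)

print(split_header_fields(SAMPLE))
```

An agent that does not understand a field in the last two groups can simply ignore it, which is what makes the header a workable extension point.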

HTTP request methods

The HTTP protocol is probably the Internet protocol which most often is extended to various special protocols for special needs. One way of extending HTTP is to add your own request methods. HTTP has certain built-in standard request methods, which are indicated in the first line of a HTTP request. Examples of such built-in request methods are GET, HEAD and POST.

People who extend HTTP often do this by adding their own request method names. There is, of course, always the risk that the same name you have chosen for your own request method, gets used by the standards making organisations for a different request method in the future, so this extension method is not very safe. A way to make it safer is to start the new request method with “X-” or “VND.” followed by a vendor name. This method of making new methods safer is not (in the case of HTTP) specified in any official standard, but is becoming more and more common practice, for various feature tags.
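The role of the first line of an HTTP request as an extension point can be sketched as follows. This is an illustration only: the set BUILT_IN_METHODS contains just the three methods named above, the function names are hypothetical, and VND.EXAMPLE.SYNC and SYNC are invented method names.

```python
BUILT_IN_METHODS = {"GET", "HEAD", "POST"}  # only the examples named in the text


def parse_request_line(line):
    """Split the first line of an HTTP request into (method, target, version)."""
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, target, version = parts
    return method, target, version


def describe_method(method):
    """Say how an implementation knowing only the built-in methods sees `method`."""
    if method in BUILT_IN_METHODS:
        return "built-in"
    if method.upper().startswith(("X-", "VND.")):
        return "extension (prefixed)"
    return "extension (unprefixed, may clash with a future standard method)"


method, target, version = parse_request_line("GET /pub/files/foobar.html HTTP/1.0\r\n")
print(method, target, version, describe_method(method))
print(describe_method("VND.EXAMPLE.SYNC"))
print(describe_method("SYNC"))
```

Because the method is always the first word of the request line, an old implementation can at least recognize that an unknown method is an extension, even if it cannot carry it out.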

1.21 Standards Terminology

Certain words have a %%%

1.21 The Standards Making Process

1.21.1 Who makes the standards?

Many different organisations can be involved in making standards. Standards can be developed by:


Developer: Companies
Example: Postscript and PDF were developed by Adobe, RTF was developed by Microsoft.
Advantages: Fast development, avoids some disadvantages with standards organisations.
Disadvantages: May be too specialized for the products of the developer.

Developer: Consortiums of companies
Example: Unicode consortium which develops the Unicode character set standard.
Advantages: If many leading vendors are members, such standards can get widely accepted.
Disadvantages: Sometimes two or more competing consortiums develop competing standards.

Developer: General standards organisations
Example: Internet Engineering Task Force (IETF), International Standards Organisation (ISO), International Telecommunications Union (ITU), World Wide Web Consortium (W3C).
Advantages: Many different organisations can contribute their knowledge and experience.
Disadvantages: Conflicts are sometimes resolved by ambiguity in the standard or multiple ways of doing the same thing.

1.21.2 Standards stages

General standards organisations have rules which regulate the standards making process. Examples of such rules are stages of progress of standards. For example, IETF has the following stages:

IETF Draft: An IETF Draft is a document produced during the work of developing a standard. IETF drafts are not standards and it can be dangerous to develop based on them. Many of them never become standards or are changed much before they become standards. IETF deletes all IETF drafts after usually 6 months, in order to clarify that they are not real standards.

Experimental Standard: Most standards never get into this stage. It is used when it is unclear if this should become a standard or not, or for standards of interest to very limited communities. If an Experimental Standard is successful, it can progress to a Proposed Standard.


Proposed Standard: Proposed Standard is the first stage of a new standard. The name gives the impression that this is only a proposal, but in reality, proposed standards are often widely implemented. For example, the HTTP standard was first published as Informational in 1996, then as Proposed Standard in 1997, then as Draft Standard in 1999.

Draft Standard: A Proposed Standard can progress to Draft Standard if it is widely implemented. IETF has a rule that any facility of a Proposed Standard which is not supported by two independent co-working implementations has to be removed when the Standard progresses to Draft Standard. This means that standards are often simplified when progressed from Proposed to Draft. For example, the Content-Base HTTP header was removed when the HTTP/1.1 standard progressed from Proposed to Draft in 1999. This rule is very valuable, since it helps weeding out facilities in standards which are not much used. Such facilities can cause problems in co-working between implementations using them and not using them.

Standard: This is the last stage of a widely supported standard. It usually takes a long time before a standard reaches this stage. HTTP, for example, has not yet (when this is written, February 2001) become a full Standard in spite of its wide acceptance.

Historical Standard: An old standard, which should not be used any more.

Best Current Practice: A recommendation on how to use one or more standards for a particular purpose, or any other documents which need to be developed in the same careful way as a Proposed Standard, but should not itself be a Proposed Standard.

Informational: Other documents of interest to implementors. For example, a summary of information from other standards can be published as an Informational Standard. The reason such a document does not become a Standard, is that if the same thing is specified in more than one standard, there is a risk of contradiction. Advice on how to best implement a standard can also be published as Informational documents. Some Informational documents are not reviewed as carefully as standards are.
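The effect of the two-implementations rule described under Draft Standard can be illustrated with a toy calculation. This is a simplification: the real decision is made by people judging independent, co-working implementations, not by counting. The function name, the implementation names and the feature sets below are all invented for illustration.

```python
def features_kept_at_draft(implementations, minimum=2):
    """Return the features supported by at least `minimum` implementations.

    `implementations` maps an implementation name to the set of features
    it supports.
    """
    counts = {}
    for features in implementations.values():
        for feature in features:
            counts[feature] = counts.get(feature, 0) + 1
    return sorted(f for f, n in counts.items() if n >= minimum)


# Invented survey data: only one implementation supports Content-Base.
survey = {
    "server_a": {"Content-Type", "Content-Base"},
    "server_b": {"Content-Type"},
    "client_c": {"Content-Type"},
}
print(features_kept_at_draft(survey))
```

In this invented survey the feature supported by a single implementation drops out, which is the simplifying effect the text attributes to the rule.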

IETF has face-to-face meetings three times a year. At these meetings, many different standards groups meet. Each group is given 1-6 hours, and there are many parallel sessions during a whole week. Groups are of two kinds, established IETF working groups and BOF (Birds Of a Feather). BOF meetings have the task of deciding if IETF should form a working group. Sometimes, a whole standard is developed during a few succeeding BOFs without ever forming a working group. Each working group has an e-mail mailing list, through which much of the work is done between IETF meetings. Sometimes, but not commonly, a working group can have a face-to-face meeting between the three regular IETF meetings.

The first version of a new standard is published as an IETF draft. After this, the proposal is usually discussed in a BOF. Based on the BOF, a charter is developed for the working group. The charter specifies clearly what is to be standardized, and also usually what is not to be included in the standard. The charter also contains a time plan. IETF tries to develop simple standards on well-known topics, and tries to avoid research and untested ideas. However, some standards tend to become rather large and complex, for example HTTP.

[Figure: number of characters in the text of HTTP 1.0 (Informational), HTTP 1.1 (Proposed Standard) and HTTP 1.1 (Draft Standard); axis ticks 100000, 200000, 300000, 400000, 500000, 600000.]


The figure above shows the number of characters in the text of HTTP 1.0, HTTP 1.1 Proposed and HTTP 1.1 Draft. HTTP 1.1 Draft is a large and very complex document of 177 pages. One reason why IETF standard sizes have grown is that the standards work was dominated by university people in the 1980s, but now people from large commercial companies have largely taken over. Large commercial companies have an interest in large and complex standards, because this will keep smaller competitors from the market. HTTP 1.1 also sometimes describes more than one method to do the same thing, sometimes with small differences. One reason for this is the need for compatibility with earlier versions, which means that a new better method is specified together with an older, less good way of doing the same thing.

Work in standards organisations often strives at reaching “consensus”, specifications which everyone agrees on. However, trying to reach consensus sometimes results in standards specifying more than one way to do something, in order to make everybody happy. This is of course not good. If the standard specifies two ways of doing the same thing, one implementor may choose only one of the ways, another implementor only another of the ways, and their implementations will then not co-work well. Striving to reach consensus also can result in ambiguous statements on controversial issues. These problems with consensus have caused IETF to say that it strives for “rough consensus”. By this is meant that most competent people agree, but a few deviants are accepted, especially if the deviants are known to be people who are unwilling to accept agreements or known to often have different ideas.

IETF is rather restrictive in its use of voting. Straw polls are sometimes made, but in general IETF tries to resolve issues by other ways than voting. IETF is very open, anyone is allowed to participate in the standards work, and anyone who can afford the cost is allowed to go to the IETF meetings. ISO is organised differently: it is based on national organisations in different countries (for example ANSI in the U.S. and DIN in Germany) who send representatives to the standards meetings and vote on the standards.

An important reason why IETF has been more successful than ISO in the area of networking standards is that IETF places more stress on implementation experience before finalizing a standard. This means that IETF standards tend to be simpler and have fewer bugs than ISO standards.

ISO has a different set of stages of standards:

Committee Document: Draft which has been carefully prepared, but is not ready to become a standard yet.

Draft Standard: First version of a standard.

Full Standard: Final version of a standard.

Functional Standard: An additional document, which restricts an existing standard and clarifies ambiguous issues. Functional standards are usually developed by other organisations than ISO. The functional standards are needed to counteract the complexity and ambiguity in the original standards.

Because ISO does not test implement as much before publishing a standard, bug corrections (usually named “Addendums”) are common.


1.22 OSI versus the Internet

During the 1980s, many people understood that computer networks were going to revolutionize society. Two sets of standards were developed. The International Standards Organisation (ISO) and the International Telecommunications Union (ITU) developed the Open Systems Interconnection (OSI) set of standards. And people at universities and research laboratories developed the Internet. And we all know that the Internet won over the OSI, even if some Internet standards have taken over some functions from OSI.

Some reasons for the success of Internet:

(This is a very controversial issue, which many people feel very strongly about. Some of them will strongly claim that one of the reasons listed below is the primary reason for the success of the Internet. Other people will claim that another reason is primary.)

1. Internet standards were in the beginning much simpler and easier to implement. However, during the 1990s, there has been a tendency to make Internet standards more complex.

2. The Internet process of developing standards requires, that all features not implemented by two independent implementations must be removed when a standard is progressed from “proposed” to “draft” standards status. This means that unnecessary or unimplementable features will be removed.

3. The rules for consensus forming are different in the involved standards organisations. IETF requires “rough consensus” which means that most reasonable people agree. ITU requires absolute consensus among ITU members. ISO relies on majority voting. Too strong requirements on consensus can lead to the inclusion, in the standards, of different ways of doing the same things, in order to make everybody happy. But such standards are of course not good, a good standard should avoid different ways of doing the same thing. Otherwise, there is a risk that some implementors will choose one option, other implementors another option, and their products may then not be able to interwork.

4. OSI tried to develop general and universal solutions to cope with all existing and anticipated future usage. This made the standards more logical and well structured, but also much more complex.

5. Internet was in the 1980s and the beginning of the 1990s heavily subsidized by the U.S. government.

6. Because Internet started as a university and research network, much valuable and free information was made available to give Internet a good start.

7. OSI was developed by large telecom companies, who wanted to use the standard to dominate the market. Internet, on the other side, has always been very strongly oriented towards letting many different providers develop their own solutions, and letting the market decide which of them would succeed.

When considering this issue, it might be interesting to note that in one country, France, a network with many similarities to Internet was very successful already in the 1980s. That network was named Minitel. It was somewhat OSI-oriented, but a very simple subset, and it was also subsidized in the beginning by France Telecom. Minitel was also strongly oriented towards letting many different providers put up different services. And similar networks to Minitel, but more oriented towards central control, did not succeed as well as Minitel in other countries, like Germany and the United Kingdom.

Because of this, I personally believe that the bottom-up structure of the Internet (item 7 in the list above) was the most important factor for its success.

7.1 References

[1] IANA Port numbers Registry. http://www.iana.net/numbers.html#P.

[2] Rivest, R. The MD5 Message-Digest Algorithm. Internet RFC 1321, April 1992.


A Beginner's Guide to URLs

What's a URL?

A URL is a Uniform Resource Locator. Think of it as a networked extension of the standard filename concept: not only can you point to a file in a directory, but that file and that directory can exist on any machine on the network, can be served via any of several different methods, and might not even be something as simple as a file: URLs can also point to queries, documents stored deep within databases, the results of a finger or archie command, or whatever.

Since the URL concept is really pretty simple ("if it's out there, we can point at it"), this beginner's guide is just a quick walk through some of the more common URL types and should allow you to be creating and understanding URLs in a variety of contexts very quickly.

File URLs

Suppose there is a document called "foobar.txt"; it sits on an anonymous ftp server called "ftp.yoyodyne.com" in directory "/pub/files". The URL for this file is then:

file://ftp.yoyodyne.com/pub/files/foobar.txt

The toplevel directory of this FTP server is simply:

file://ftp.yoyodyne.com/

The "pub" directory of this FTP server is then:

file://ftp.yoyodyne.com/pub

That's all there is to it.

Gopher URLs

Gopher URLs are a little more complicated than file URLs, since Gopher servers are a little trickier to deal with than FTP servers. To visit a particular gopher server (say, the gopher server on gopher.yoyodyne.com), use this URL:

gopher://gopher.yoyodyne.com/

Some gopher servers may reside on unusual network ports on their host machines. (The default gopher port number is 70.) If you know that the gopher server on the machine "gopher.banzai.edu" is on port 1234 instead of port 70, then the corresponding URL would be:

gopher://gopher.banzai.edu:1234/

(Source of this guide: A Beginner's Guide to URLs, http://tis.eh.doe.gov/NEPA/tutor/info/urlprimr.htm)

News URLs

To point to a Usenet newsgroup (say, "rec.gardening"), the URL is simply:

news:rec.gardening

Currently, network clients like NCSA Mosaic don't allow you to specify a news server like you would normally expect (e.g., news://news.yoyodyne.com/rec.gardening); this may be coming down the road but in the meantime you will have to specify your local news server via some other method. The most common method is to set the environment variable NNTPSERVER to the name of your news server before you start Mosaic.
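How a client program might pick up the news server from the environment can be sketched as follows. This is not how Mosaic itself is implemented; the function name news_server and the fallback behaviour are assumptions made for the illustration.

```python
import os


def news_server(environ=os.environ, default=None):
    """Return the news server named by NNTPSERVER, or `default` if it is unset."""
    value = environ.get("NNTPSERVER", "").strip()
    return value or default


# With the variable set as described in the text:
print(news_server({"NNTPSERVER": "news.yoyodyne.com"}))
# With the variable missing, the assumed fallback is used:
print(news_server({}, default="localhost"))
```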

HTTP URLs

HTTP stands for HyperText Transfer Protocol. HTTP servers are commonly used for serving hypertext documents, as HTTP is an extremely low-overhead protocol that capitalizes on the fact that navigation information can be embedded in such documents directly and thus the protocol itself doesn't have to support full navigation features like the FTP and Gopher protocols do.

A file called "foobar.html" on HTTP server "www.yoyodyne.com" in directory "/pub/files" corresponds to this URL:

http://www.yoyodyne.com/pub/files/foobar.html

The default HTTP network port is 80; if a HTTP server resides on a different network port (say, port 1234 on www.yoyodyne.com), then the URL becomes:

http://www.yoyodyne.com:1234/pub/files/foobar.html
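The parts of the URLs shown so far (access method, host name, optional port and path) can be picked apart with Python's standard urllib.parse module. The sketch below is only an illustration: the function name describe_url is hypothetical, and the DEFAULT_PORTS table contains just the two default ports mentioned in this guide.

```python
from urllib.parse import urlsplit

# Default ports as stated in the guide above.
DEFAULT_PORTS = {"gopher": 70, "http": 80}


def describe_url(url):
    """Return (scheme, host, port, path), filling in a default port if known."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path


for url in (
    "http://www.yoyodyne.com:1234/pub/files/foobar.html",
    "gopher://gopher.yoyodyne.com/",
    "news:rec.gardening",
):
    print(describe_url(url))
```

The news URL has no host or port at all, which matches the remark above that the news server has to be specified by some other method.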

Partial URLs

Once you are viewing a document located somewhere on the network (say, the document http://www.yoyodyne.com/pub/afile.html), you can use a partial, or relative, URL to point to another file in the same directory, on the same machine, being served by the same server software. For example, if another file exists in that same directory called "anotherfile.html", then anotherfile.html is a valid partial URL at that point.

Thi

s pr

ovid

es a

n ea

sy w

ay to

bui

ld s

ets

of h

yper

text

doc

umen

ts. I

f a

set o

f hy

pert

ext

docu

men

ts a

re s

itti

ng i

n a

com

mon

dir

ecto

ry, t

hey

can

refe

r to

one

ano

ther

(i.e

., be

hy

perl

inke

d) b

y ju

st t

heir

fil

enam

es -

- ho

wev

er a

rea

der

got t

o on

e of

the

docu

men

ts, a


Compendium 4 page 35


jump can be made to any other document in the same directory by merely using the other document's filename as the partial URL at that point. The additional information (access method, hostname, port number, directory name, etc.) will be assumed based on the URL used to reach the first document.
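How a partial URL inherits the access method, hostname, port and directory from the document that contains it can be sketched with urljoin from Python's standard urllib.parse module. The file name other.html in the second call is invented for the illustration.

```python
from urllib.parse import urljoin

# The document currently being viewed supplies the missing parts.
base = "http://www.yoyodyne.com/pub/afile.html"
resolved = urljoin(base, "anotherfile.html")
print(resolved)

# A non-default port in the base URL is inherited in the same way.
with_port = urljoin("http://www.yoyodyne.com:1234/pub/files/foobar.html", "other.html")
print(with_port)
```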

Other URLs

Many other URLs are possible, but we've covered the most common ones you might have to construct by hand. At the top of each Mosaic document viewing window is a text field called "Document URL"; if you watch the contents of that as you navigate through information on the network, you'll get to observe how URLs are put together for many different types of information.

The current IETF URL spec is here; more information on URLs can be found here.


Compendium 4 page 36


http://www.eecs.harvard.edu/~vino/web/push.cache/node7.html  Page 1 of 4

Distributed file systems

Much of the current research in autonomous replication has been heavily influenced by the caching subsystems of distributed file systems. Efficient operation of a large-scale distributed file system depends on caches to reduce the load on file servers; systems like Sun's NFS [26][27] that do not rely on long-term caches cannot support more than a few hundred clients, whereas systems like XFS [35] that distribute server load among clients are designed to scale to many thousands of clients.

The challenge of offering distributed access to a wide-area information system is almost identical to that faced by a large-scale distributed file system; some researchers have proposed making this relation explicit [33] by actually serving files from the Web over a wide-area distributed system like the Andrew File System [28]. Although this idea is still contentious, because it is not yet clear if the semantics of the World Wide Web match those of a distributed file system closely enough to simply implement the one on top of the other, we can nevertheless still learn much from the distributed file system community about building large-scale systems. In the next section we discuss the large-scale distributed file system designed by Blaze as his PhD thesis.

Large-Scale Hierarchical File Systems

The goal of Blaze's thesis [4] was to design a distributed file system that could scale to a very large size, operating across the Internet. He achieves this goal by building an explicit hierarchical system where clients can not only cache items indefinitely but can also serve them to other clients.

Blaze introduces four types of scale that a successful system must address: population, traffic, administrative, and geographic. Each type has its own set of issues. Scaling with population requires a system to cope with growth in the total number of potential clients and implies that a server should not need to store any information about its clients. Scaling with traffic requires the ability to handle the workload generated by all clients. Scaling administratively implies an ability to span autonomous entities. Finally, coping with geographic scale requires the ability to span large distances, with the inherent latency implied therein. The Web in its current form addresses these by eliminating many of the frills associated with distributed file systems such as caching and write-sharing. Our challenge is adding an optimized caching system to the Web without reducing its ability to scale along these lines.


Before designing his system, Blaze gathered traces from an NFS server to determine optimal cache strategies for his distributed file system. The most important trend that emerged is that files tend to display strong inertia, where the same files tend to be opened again in the same mode by the same user. It appears, for example, that keeping a file in a client cache for two hours results in an average hit rate of over 66%.

Blaze also found that most files are overwritten while still very young. If a file is not changed soon after creation, it becomes unlikely that it will be changed. Files therefore rapidly move toward a state of being read-only, and shared-read files are even less likely to be written to. Finally, files opened for reading by other machines are likely to be opened for reading by still others. These results imply that cache sharing will work well for a distributed file system because it is unnecessary to update shared-read files often. We will see that a similar analysis drives the Alex FTP cache as well, and in a later chapter we will see that these results also apply to the World Wide Web, since Web files that are globally popular change less often than those that are primarily used locally.

Blaze analyzed a variety of different caching schemes. Flat file systems such as NFS, where all clients connect to the same server, inevitably suffer from a bottleneck effect. Eventually the single server is unable to cope with the connected clients. Even with optimal client caching strategies the server must still deal with 8-12% of the original traffic, and as the system grows this will eventually overwhelm any server.

The only solution is to distribute load among several servers. Replicating all files at secondary servers, however, is too expensive, and caching items at secondary servers as requests pass through them creates delays. Furthermore, having an intermediate cache process all results yields surprisingly little gain. Most objects not in the client cache were not in the intermediate cache either. The answer is to look in the caches maintained by other clients: 60-80% of client cache misses were of files already in another cache.

When the load at a server becomes too high, the server stops serving the file directly and only sends out lists of other servers that it knows have cached the file. Clients attempt to first satisfy file requests from their own file caches, then from servers listed in a local name cache, and finally from the file's primary host. If a client receives a list of servers instead of the file, it caches that list in its name cache, and attempts to contact one of those servers instead. We will use a similar method in a later section to locate replicated files on the Web.
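The lookup order just described (own file cache, then servers from the name cache, then the primary host, with an overloaded server answering by referral) can be sketched as follows. This is only an illustration of the prose above, not Blaze's implementation: the class names, the overloaded flag and the in-memory dictionaries are all assumptions made for the example.

```python
class Server:
    """A host that may hold a file or refer clients elsewhere (hypothetical model)."""

    def __init__(self, name, files=None, overloaded=False, known_caches=None):
        self.name = name
        self.files = dict(files or {})
        self.overloaded = overloaded
        self.known_caches = list(known_caches or [])

    def request(self, path):
        # An overloaded server stops serving the file and only hands out
        # a list of other servers it knows have cached it.
        if self.overloaded and self.known_caches:
            return "referral", self.known_caches
        if path in self.files:
            return "file", self.files[path]
        return "miss", None


class Client:
    def __init__(self, primary):
        self.primary = primary
        self.file_cache = {}  # path -> contents
        self.name_cache = {}  # path -> servers believed to hold a copy

    def fetch(self, path):
        # 1. The client's own file cache.
        if path in self.file_cache:
            return self.file_cache[path], "own cache"
        # 2. Servers listed in the name cache, then 3. the primary host.
        for server in list(self.name_cache.get(path, [])) + [self.primary]:
            kind, payload = server.request(path)
            if kind == "referral":
                # Remember the list and try the servers it names instead.
                self.name_cache[path] = payload
                for other in payload:
                    other_kind, data = other.request(path)
                    if other_kind == "file":
                        self.file_cache[path] = data
                        return data, other.name
            elif kind == "file":
                self.file_cache[path] = payload
                return payload, server.name
        return None, "not found"


peer = Server("peer", files={"/a": "data"})
primary = Server("primary", files={"/a": "data"}, overloaded=True, known_caches=[peer])
client = Client(primary)
first = client.fetch("/a")   # served by the peer named in the referral
second = client.fetch("/a")  # served from the client's own cache
```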

There are several failure modes with this system. When two machines can both talk to a server but not to each other, the server may redirect one to the other, causing a fault. There are also some consistency issues that arise from computers that are temporarily disconnected from the server and cannot receive invalidation messages.

There are several ways in which we apply Blaze's work to our own. We already mentioned that we use his method for locating nearby replicas, and that Web files behave in a similar way to files in a distributed system. We also use his definitions of scalability in a later section to discuss our own system's scalability.

Compendium 4 page 37



Alex

One system that bridges the gap between data replication and distributed file systems is Alex [8]. Alex allows a remote FTP server to be mapped into the local file system so that its files can be accessed using traditional file operations. This is accomplished through an Alex server that communicates with remote FTP sites through FTP, caches files locally, and communicates with the local file system through NFS. When the user wants to access a remote file, Alex downloads the file using FTP and then caches the file locally. When another user needs the same file and tries to access it through the file system, it is not necessary to FTP it again. The file will simply be taken out of Alex's cache rather than from the original FTP site. This provides a savings in bandwidth and also provides a measure of reliability in the face of network failures. It also saves local disk space because users do not have to make personal copies of the same FTP files; users simply use the files directly off of the Alex server.

Alex's most novel feature is its cache-consistency mechanism. As we have already seen from Blaze's thesis, the older a file is the less likely it is to be changed. Therefore, the older a file is, the less often Alex has to poll to ensure that its local copy is up-to-date. To avoid excessive polling, Alex only guarantees that a file is never more than 10% out of date with its reported age. As an example, if a file is 1 month old, then Alex will serve the file for up to three days (10% of 30 days = 3 days) before checking to see if it is still valid.
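The 10% rule amounts to a one-line calculation. The sketch below is an illustration of the arithmetic in the example above, not code from Alex itself; the function names and the dates are invented.

```python
from datetime import datetime, timedelta

STALENESS_DIVISOR = 10  # "never more than 10% out of date" means age / 10


def serve_without_checking_for(age):
    """How long a copy of the given age may be served before polling again."""
    return age / STALENESS_DIVISOR


def needs_revalidation(last_modified, last_checked, now):
    """True once the time since the last check exceeds 10% of the file's age then."""
    age_at_check = last_checked - last_modified
    return now - last_checked > serve_without_checking_for(age_at_check)


month_old = serve_without_checking_for(timedelta(days=30))
print(month_old)  # the three days of the example in the text

modified = datetime(2005, 1, 1)
checked = datetime(2005, 1, 31)  # the file was 30 days old at this check
print(needs_revalidation(modified, checked, datetime(2005, 2, 2)))  # 2 days later
print(needs_revalidation(modified, checked, datetime(2005, 2, 4)))  # 4 days later
```

A file that was modified an hour ago is rechecked after six minutes under the same rule, which is how the scheme concentrates polling on young, volatile files.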

This is efficient, not only because FTP servers do not have to propagate file updates, but because it also avoids the necessity of modifying FTP servers to add more sophisticated invalidation techniques. In a later chapter we find that this cache-consistency scheme is applicable to the World Wide Web as well as to FTP. In the next section we examine Web proxies that extend the concepts of Alex to the World Wide Web by caching Web data just as Alex cached FTP data.

World Wide Web proxies

A Web proxy [25] acts as a gateway to the World Wide Web for a campus-sized network or corporation behind a firewall. The proxy accepts a request for a Web document from a client, retrieves and caches the document, and then makes it available to its client. When another client wants the same document the proxy can serve it from its cache without having to re-request the same document. The most popular Web browsers support proxies [29] and the HTTP protocol has provisions to help maintain cache consistency. If the proxy wishes to determine if its cached page is up-to-date, a lightweight protocol exists known as the Conditional GET. A browser may make a GET request to the server that includes a timestamp, the If-Modified-Since field. If the page has been modified since that time, the server will re-send the page. Otherwise the server will respond with a Not Modified message.
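The Conditional GET exchange can be sketched without a network by modelling the server's decision. In the example below the function origin_server, the header dictionaries and the dates are assumptions made for illustration; only the If-Modified-Since field and the Not Modified answer (status 304 in HTTP) come from the protocol itself. Date strings are produced and parsed with Python's standard email.utils helpers.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def origin_server(last_modified, body, request_headers):
    """Toy origin server (hypothetical) answering a GET that may be conditional."""
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304, {}, b""  # Not Modified: the cached copy is still good
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    return 200, headers, body  # send the full page again


page_changed = datetime(2005, 1, 11, 12, 0, 0, tzinfo=timezone.utc)
cached_at = datetime(2005, 1, 12, 8, 30, 0, tzinfo=timezone.utc)

# The proxy cached the page after its last change, so nothing is re-sent.
request = {"If-Modified-Since": format_datetime(cached_at, usegmt=True)}
status, _, _ = origin_server(page_changed, b"<html>...</html>", request)
print(status)
```

The saving is that a 304 answer carries no body, so checking a cached page costs one small round trip instead of a full transfer.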


Netscape, a company that sells a commercial web-proxy, claims that a 2 or 3 gigabyte Web proxy can support thousands of internal users and can provide a cache hit ratio as high as 65%. Results such as these indicate that web proxies can help alleviate some web scalability concerns, but as we saw above with Blaze, even optimal client caching still does not help distribute server load sufficiently. Web proxies also require human intervention to set up properly, and proxies must still satisfy cache misses from the primary host.

James Gwertzman
Wed Apr 12 00:26:11 EDT 1995

Compendium 4 page 38


http://www.eecs.harvard.edu/~vino/web/push.cache/node9.html  Page 1 of 2

Cache Consistency Mechanisms

There are several ways to maintain cache consistency of files placed in a cache: time-to-live fields, active invalidation protocols, and client polling. Time-to-live fields are efficient to set up, but are inaccurate. It is hard to tell in advance how long an object will be valid for; therefore, it is necessary to set the TTL field to a relatively short interval and reload the object frequently to avoid returning stale data. Client polling, as we saw with the Alex system [8], depends on the fact that the older a file is the less likely it is to change, and therefore the less often the client needs to check to see if its copy is stale. The Alex protocol dynamically adjusts to minimize the amount of stale data returned, but any weak-consistency protocol inevitably results in clients receiving some stale data. We study its potential for cache consistency on the Web in a later chapter.

Invalidation protocols are required when loose consistency is not sufficient; many distributed file systems rely on invalidation protocols to ensure that cached copies do not become stale. Invalidation protocols depend on the server keeping track of cached copies of its data; when a file changes the server notifies the caches that their copies are no longer valid.
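The bookkeeping that an invalidation protocol asks of the server can be sketched in a few lines. The classes below are a hypothetical in-memory model, not any particular file system's protocol: the server records which caches hold each file and notifies them on every write.

```python
class Cache:
    def __init__(self, name):
        self.name = name
        self.copies = {}  # path -> cached contents

    def invalidate(self, path):
        # Called by the server when the file has changed.
        self.copies.pop(path, None)


class OriginServer:
    def __init__(self):
        self.files = {}
        self.holders = {}  # path -> set of caches that hold a copy

    def read(self, path, cache):
        # Serving a copy obliges the server to remember who received it.
        self.holders.setdefault(path, set()).add(cache)
        cache.copies[path] = self.files[path]
        return cache.copies[path]

    def write(self, path, data):
        self.files[path] = data
        # Tell every holder that its copy is no longer valid.
        for cache in self.holders.pop(path, set()):
            cache.invalidate(path)


server = OriginServer()
server.write("/f", "v1")
a = Cache("a")
server.read("/f", a)
server.write("/f", "v2")  # cache a loses its stale copy
```

The holders table is the per-client state that grows with the number of caches, which is the scalability cost of this approach.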

Kurt Worrell, as part of his master's thesis [37], examined the use of TTL fields on the World Wide Web and determined through simulation that they are not as effective as an invalidation protocol that allows the primary host to inform caches when objects that they are caching become stale. We challenge this result in a later chapter.

One flaw with this work is its focus on invalidation to the exclusion of other loose consistency protocols like the Alex polling protocol. Invalidation protocols might be efficient in terms of network bandwidth, but they have several non-trivial costs. For one thing, servers must be rewritten to take advantage of server-directed invalidation. Client-directed protocols such as the Alex polling protocol or TTL fields do not require server modification. For another thing, servers must keep track of where their objects are currently cached. This could be a prohibitive limit to a system's scalability unless caches are organized hierarchically, such that the server only has to keep track of the highest level caches. Worrell assumes this model to make his invalidation protocols feasible, but in doing so presents another potential obstacle to scalability: he also assumes that the inclusion principle holds. The inclusion principle states that any time an object is located in a lower level cache, it must also reside in any higher level caches in the hierarchy. This makes sense for the purposes of an invalidation protocol, since invalidation messages only need to be passed on to the highest level caches, but could lead to an inefficient use of disk space. All objects cached on lower levels must also be cached on higher levels, but due to the nature of hierarchical caches, this means that one high level cache might potentially need to cache the equivalent of hundreds of low level caches. Otherwise pages may be kicked out of a low level cache simply because a high level cache is full, and it may be impossible to fully utilize space available in low level caches due to a lack of space in a high level cache.

Once again, a client-directed cache consistency mechanism removes this requirement since there is no need to organize caches hierarchically. This also means that caches can be created and taken down quite easily since there is no reason to fit them into a hierarchical tree. Client-directed cache consistency also removes one of the most serious error-modes of invalidation protocols: if a cache is temporarily unavailable then the server must continue trying to reach it; with no other mechanism for invalidation available the cache will not know to invalidate the object unless it hears from the server.

Harvest

The Harvest system [5] was designed to solve the problem of resource location, but also includes a hierarchical caching subsystem. This caching subsystem functions like a Web proxy in that clients request Web objects through the local Harvest cache. If this query results in a cache miss, the local cache in turn queries each of its neighbors, its parent, and the object's primary host. Each neighbor cache and parent returns with a "hit" or "miss", and if the home site is running an SNTP echo server it returns a "hit" as well. The object is then requested from whichever site returns a "hit" fastest. If all caches missed, and the primary host is slower than the parent cache, then the local cache will request the object through the parent cache. Otherwise the local cache will request the item from the object's primary host.
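The choice the local cache makes after a miss can be written out as a small decision function. This is a sketch of the rules as stated in the paragraph above, not Harvest's code: the Reply tuple, the host names and the timings are invented, and the primary host is assumed to have answered the echo probe.

```python
from typing import NamedTuple


class Reply(NamedTuple):
    name: str       # who answered
    hit: bool       # "hit" or "miss"
    seconds: float  # how long the answer took


def choose_source(neighbors, parent, primary):
    """Pick where the local cache fetches the object from after a local miss."""
    hits = [r for r in [*neighbors, parent] if r.hit]
    if hits:
        # The primary host's echo reply competes as a "hit" as well.
        if primary.hit:
            hits.append(primary)
        return min(hits, key=lambda r: r.seconds).name
    # All caches missed: go through the parent only if it is faster
    # than the primary host, otherwise go to the primary host directly.
    return parent.name if primary.seconds > parent.seconds else primary.name


origin = Reply("origin", True, 0.30)
print(choose_source([Reply("n1", True, 0.05)], Reply("parent", False, 0.02), origin))
print(choose_source([Reply("n1", False, 0.05)], Reply("parent", False, 0.02), origin))
```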

This system is still primitive; caches must be configured manually with information about parents and neighbors set in a configuration file. On the other hand, their results show that the Harvest cache is twice as fast at returning data as other widely available Web proxies for most requests.

James Gwertzman
Wed Apr 12 00:26:11 EDT 1995

Compendium 4 page 39


2 Message Handling

1.1 Message Handling Overview

1.1.1 Same time and different time communication

Message handling systems (MHS systems) are systems for people to send messages to each

other. (Some messages are sometimes sent to or from applications without human

intervention, but the main functionality is messaging between people.) The table below gives

an overview of some major categories of MHS systems:

|                                        | All participants are active at the same time.   | Different participants can read and write at different times. |
| Non-computer equivalent                | Face-to-face meeting, telephone call.           | Letter, newspaper.                                            |
| Two or very few participants.          | Simple chat systems, instant messaging systems. | Ordinary e-mail.                                              |
| Communication within groups of people. | More advanced chat systems.                     | E-mail with mailing lists. Usenet News, forum systems.        |

There is no absolutely sharp boundary between these categories. Ordinary e-mail with multiple recipients is often used to communicate in small groups. And some chat systems contain archiving, so that those not present can read the discussion at a later time. There are, however, certain design features which are more important for the different categories.

For same-time messaging, it is important that everything said or written is immediately shown to the other participants. Some e-mail systems have an alert facility, where the recipient is reminded by a beep whenever a new message arrives. Such e-mail systems can be used for almost same-time communication. But most textual same-time communication is done through specially designed chat or instant messaging systems. There are usually separate panes for reading and writing. Sometimes the full discussion scrolls in a combined window for all participants; sometimes (only with very few participants) there is a separate window showing what each participant has written.

For different-time messaging, the set of messages which have been read and not read at a certain time will be different for each participant. People usually want to read only unseen messages, but they also want to be able to go back and reread old messages. Because of this, such systems have databases where messages are stored, and the database knows what each user has seen and not seen and provides facilities for rapidly scanning unseen messages and searching for old messages. This can either be a personal database with copies of all the messages for each recipient, or a joint database, where the text of each message need only be stored once, even if it has multiple recipients.
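The joint-database variant can be sketched as follows. This is a minimal in-memory illustration under invented names (JointMessageStore, deliver, unseen, read), not the design of any particular mail system: each message text is stored once, and a per-user set records what that user has already seen.

```python
class JointMessageStore:
    def __init__(self):
        self.texts = {}       # message id -> text, stored once
        self.recipients = {}  # message id -> set of user names
        self.seen = {}        # user name -> set of message ids already read

    def deliver(self, msg_id, text, recipients):
        self.texts[msg_id] = text
        self.recipients[msg_id] = set(recipients)

    def unseen(self, user):
        """Ids of messages addressed to the user that the user has not read yet."""
        read = self.seen.get(user, set())
        return [m for m, users in self.recipients.items()
                if user in users and m not in read]

    def read(self, user, msg_id):
        # Reading marks the message as seen for this user only.
        self.seen.setdefault(user, set()).add(msg_id)
        return self.texts[msg_id]


store = JointMessageStore()
store.deliver(1, "hello", ["ann", "bob"])
store.read("ann", 1)
print(store.unseen("ann"), store.unseen("bob"))
```

Old messages stay in the store after being read, so rereading is simply another call to read with the same id.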

With different-time messaging, some people will get more messages than they can comfortably read. It is easier for them to handle the message load if the software gives them facilities to organize the message database. Typical organization tools sort the messages from each discussion group into a separate folder, and sort messages in a thread (messages

Compendium 4 page 64


which are direct or indirect replies to each other) together in some way. The application layer standards have special facilities to support such functions.

With same-time messaging, it is important that messages are transmitted fast. The store-and-forward methods used by ordinary e-mail are too slow, but some systems for same-time messaging use their own kind of fast store-and-forward transmission. A particular feature of same-time messaging is that, since messages are written and arrive asynchronously, there can be dead times when the recipient receives nothing. It is, therefore, useful to have a reminder system, perhaps using a sound signal, when a new message arrives. There is also a tendency, in same-time messaging, for the ongoing conversation to handle more than one topic at the same time. People unaccustomed to chat systems react against this, and sometimes require “one subject at a time”, just as at face-to-face meetings following an agenda. However, one subject at a time is not the most efficient use of the time. Because people write slower than they read, and because dead times occur when you are waiting for a reply to a message, the time is used more efficiently if you allow multiple topics in parallel.

Two main variants of the user interface are common in chat systems. One variant has a separate window for each participant, showing what that person has written. The other has a single sequential window for all participants of the same chat channel, listing all messages in chronological order. A channel is a group of people who are discussing with each other in a chat system.

1.2 Protocols Used for Message Transport

[Figure: message transport from the sender's UA through two MTAs to the mailboxes and the recipient's UA. Legend: UA = User Agent, client software used by the user; MTA = Message Transfer Agent; DNS = Domain Name System; SMTP = Simple Mail Transfer Protocol. Labelled links: SMTP, DNS lookup, POP or IMAP.]

Many application layer protocols are commonly used for message transport. Here are some of

the most commonly used:

• The DNS is used to look up the mail host for the recipient, and then, in a second step, to look up the IP address of the receiving MTA.

• SMTP is used to transport the message to each handling MTA.

• POP or IMAP is used to download the message from the last MTA to the recipient UA.

• MSGFMT (previously known as RFC822) and MIME are used to format the content of a message.

• NNTP is used for distribution of Usenet News articles.
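To make the role of the message format concrete, here is a minimal sketch that uses Python's standard email package to build a message with RFC 822-style headers and a MIME attachment. The addresses, the subject and the file name are invented for illustration, and handing the result to an MTA over SMTP is deliberately left out.

```python
from email.message import EmailMessage

msg = EmailMessage()
# Header fields in the RFC 822 style: "Name: value" lines before the body.
msg["From"] = "sender@example.org"
msg["To"] = "recipient@example.com"
msg["Subject"] = "Greetings"
msg.set_content("Hello, this is the body.")

# MIME lets the same message carry non-text parts next to the text body.
msg.add_attachment(b"\x00\x01", maintype="application",
                   subtype="octet-stream", filename="data.bin")

wire = msg.as_string()  # the text that would be transported between agents
print(wire)
```

Adding the attachment turns the message into a multipart structure, with the original text as its first part and the binary data encoded as a second part.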

In addition to this, messages may also be conveyed through usenet news, which has its own

Compendium 4 page 65


set of protocols, partially based on the email protocols, but often with their own variations,

and messages may be conveyed through instant messaging protocols.

Note that the receiving user's mailbox is usually on the last MTA, and not in the personal computer of the recipient. This is important, because personal computers are not always online. Messages can thus be delivered to mailboxes even if the actual recipients are travelling, ill, playing games or otherwise not connected to the Internet.

1.3 E-mail: Message Transport

Typical for e-mail and other non-simultaneous message systems is that the messages are

copied, often several times, between different mail servers on their way from sender to

recipient. The picture below shows a simple and common route which a message may take

from sender to recipient:

[Figure: Sender mail client -> ➀ -> E-mail router close to the sender -> ➁ -> E-mail router close to the recipient -> ➂ -> Recipient mail client]

When senders submit messages, their mail clients connect to nearby e-mail routers. The whole message is then copied ➀ from sender to mail router. When this copying is complete, this mail router takes over responsibility for delivering the message to its recipients. When the receiving computer confirms receipt of the message, the sending computer is no longer involved. In the second step, the mail router close to the sender connects to a mail router close to the recipient. Again, the whole message is copied ➁, and responsibility is then transferred to the e-mail router close to the recipient. In the final step, the message is copied ➂ to the recipient mail client.

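[Figure: Transfer of responsibility over time between a sending computer and a receiving computer. Sending computer: "Here is the message". Receiving computer: storage of incoming messages on a secure ..., then confirmation that the message is received and stored. Sending computer: removal of the message.]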

In e-mail terminology, the client of the sender and the recipient are often called User

Agents (UA), and the e-mail routers are often called Message Transfer Agents (MTA).

The transport of a message does not always pass exactly two MTAs as in the figure above.

If both sender and recipient work at the same place, only one MTA may be needed. And in a

complex networking situation, more than two MTAs can be used, like in the figure below:

[Figure: A longer route from the sender UA to the recipient UA, passing the MTA close to the sender, a firewall, a gateway, another firewall and the MTA close to the recipient.]

A firewall is a security device, and a typical function in it may be to scan for viruses in the

message. A gateway is a device which moves messages between two networks using different

e-mail protocols, for example from an Internet mail network to an X.400 mail network.

(X.400 is a mail protocol developed by the telecom companies. Today it is only used in

certain local areas.)

When a message is passed between two agents, a protocol is needed. The most common

protocol for passing messages between agents is SMTP. SMTP is almost universally used for

all mail transport except the transport from the last MTA to the recipient UA. The reason for

this is that SMTP is mainly designed for sender-controlled transmission. In every step

between two agents, it is the sending agent which controls the transmission and decides when

and where to send a message. This is not suitable for recipient UAs, since they often reside on

personal computers, which are not always connected to the network and willing to receive

messages. Thus, for the final step from the last MTA to the recipient UA, a protocol is needed

where the recipient has more control of when to receive a message. The two most common

protocols for this are POP and IMAP. In some cases, the last MTA and the recipient UA run

on the same computer or on two computers sharing a common file storage. In that case, the

protocol for the last step can simply be file retrieval. Another important difference

between SMTP and the last-step delivery protocols is that identification (usually with a

password) is needed to stop messages being retrieved by anyone other than their intended recipients.

An MTA which receives a message will usually store it in a so-called spool area. This is a

file storage area for temporarily storing files, and where there is some process, the spool

process, which goes through the spool areas and forwards messages from them. An exception

to this is the last MTA before delivery to the final recipient UA. That MTA usually stores

messages in mailboxes, separate areas for each recipient. This means that when a UA

connects to an MTA to download mail, all mail for that particular UA is stored in a special

place and can easily be located. See the figure below:

[Figure: Sender mail UA → SMTP → MTA close to the sender (storage in spool area) → SMTP → MTA close to the recipient (storage in a separate mailbox for each recipient) → POP or IMAP → recipient mail UA.]
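The difference between the spool area and the mailboxes can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class `ToyMTA`, its method names and the `example` addresses are hypothetical, everything is kept in memory, and no real SMTP, POP or IMAP traffic takes place.

```python
from collections import defaultdict, deque


class ToyMTA:
    """Hypothetical in-memory MTA; real MTAs keep both areas on disk."""

    def __init__(self, local_domain):
        self.local_domain = local_domain
        self.spool = deque()                # messages waiting to be forwarded
        self.mailboxes = defaultdict(list)  # one separate area per local recipient

    def accept(self, recipient, text):
        """Store an incoming message: local mail in a mailbox, other mail in the spool."""
        domain = recipient.rpartition("@")[2]
        if domain == self.local_domain:
            self.mailboxes[recipient].append(text)
        else:
            self.spool.append((recipient, text))

    def run_spool(self, forward):
        """The spool process: hand every queued message to `forward`."""
        while self.spool:
            recipient, text = self.spool.popleft()
            forward(recipient, text)

    def download(self, recipient):
        """What a UA does via POP or IMAP: fetch and empty its own mailbox."""
        messages, self.mailboxes[recipient] = self.mailboxes[recipient], []
        return messages


sender_mta = ToyMTA("sender.example")
recipient_mta = ToyMTA("recipient.example")

sender_mta.accept("bob@recipient.example", "Hello Bob")  # lands in the spool
sender_mta.run_spool(recipient_mta.accept)               # forwarded, lands in a mailbox
fetched = recipient_mta.download("bob@recipient.example")
```

The sender-side MTA only queues the message, while the recipient-side MTA files it under the recipient's own name, which is why a UA can find all its mail in one place.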

If a message is sent with more than one recipient, the message may be split into separate

copies by one or more MTAs, as shown in the figure below:

[Figure: The sender mail client sends a message with two recipients to the MTA close to the sender. That MTA sends one copy (message with one recipient) to the MTA close to the first recipient and one copy (message with one recipient) to the MTA close to the second recipient; each of these delivers to its recipient mail client.]

The advantage with splitting is that the same message need not be transported more than once

before the splitting. This can also be used to reduce transport costs across large distances. If a

sender in Europe sends a message to two or more recipients in North America, only one copy

might be copied across the expensive Atlantic cables as shown in the figure below:

[Figure: Sender mail UA → message with two recipients → MTA in Europe → message with two recipients → MTA in North America, which sends one message with one recipient to each of the two recipient MTAs.]

A problem with this, however, is that most MTAs are not willing to handle mail, unless

either the recipient or the sender is local to the MTA. Thus, the saving shown above

requires an agreement with the MTA which splits the message after transport across the

Atlantic. This was not always so. At the beginning of the 1990s, most MTAs were willing

to forward mail for any recipient. The reason why this was abolished in the middle of the

1990s was that spammers used this feature to get foreign MTAs to help them split mail to

millions of recipients. Some so-called experts claimed that spamming could be stopped by

forbidding splitting of mail by other than the MTA of the sender or the recipient. They

enforced their view by implementing a program which scanned all MTAs everywhere,

checking that they did not allow foreign splitting, and sending angry letters to non-

conforming MTA administrators (postmasters) threatening to stop receiving mail from

them unless they stopped splitting. This is an interesting example of how the Internet is

regulated in dubious ways by pseudo-police-authorities. Spamming could be counteracted

more efficiently using other methods than this. In fact, there is no proof that spamming has

in any way been reduced by this method.

(Margin note: Here I am flaming, giving my personal opinion on a controversial issue. Other experts have different opinions on this.)

1.4 Use of the DNS for mail transport

E-mail has its own special way of using the DNS. The DNS can actually be seen as two

different data bases, one for e-mail and one for all other usage. The special records for e-mail

are called MX (mail exchange) records.

If an MTA receives a message for a recipient with the e-mail address [email protected], it will

connect to the DNS and ask for the MX record for the domain “bar.net”. It may then be told

that the MX record for “bar.net” refers to the domain name “mail.bar.net”. It will then make a

second ordinary (not MX) DNS lookup to find the IP number for “mail.bar.net,” and it will

then forward the message to the SMTP port at this IP number.
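The two lookups can be sketched as follows. To keep the example self-contained, the DNS is replaced by two hypothetical in-memory tables (`MX_RECORDS` and `ADDRESS_RECORDS`); the mailbox name `someone` and the IP address from the 192.0.2.0/24 documentation range are assumptions for illustration, not data from a real zone.

```python
# Hypothetical stand-in for the DNS: two tables instead of real queries.
MX_RECORDS = {"bar.net": ["mail.bar.net"]}        # domain -> mail host names
ADDRESS_RECORDS = {"mail.bar.net": "192.0.2.25"}  # host name -> IP address

SMTP_PORT = 25  # the standard SMTP port


def next_hop(email_address):
    """Return (ip, port) of the MTA that should receive mail for the address."""
    domain = email_address.rpartition("@")[2]
    mail_host = MX_RECORDS[domain][0]        # step 1: MX lookup for the domain
    ip_address = ADDRESS_RECORDS[mail_host]  # step 2: ordinary address lookup
    return ip_address, SMTP_PORT


target = next_hop("someone@bar.net")
```

A real MTA would issue the same two questions to a resolver and would also have to handle domains that have no MX record at all.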

A special facility of the DNS, when used for e-mail, is that a lookup for the MX record for

a domain (like “bar.net” in the example) may return more than one result. It may return a list

of domains. The reason for this is to increase reliability, by having more than one MTA which

is willing to handle mail for a particular recipient. An organisation may have one primary

mail MTA and another secondary mail MTA to use if the primary mail MTA is not in

operation. A common usage of this is that if each department in an organisation has different

MTAs, the organisation may have a common master MTA, which is used if one of the

department MTAs is out of operation.

There is, however, a risk with this technique. Suppose that an organisation foo.bar has

three different MTAs, mail1.foo.bar, mail2.foo.bar and mail3.foo.bar. Suppose that a

message arrives at mail3.foo.bar, but that the mailbox of the recipient actually resides in

mail1.foo.bar. The MTA at mail3.foo.bar will then use the DNS to look up the MX record for

foo.bar, and may get a list of mail1.foo.bar, mail2.foo.bar and mail3.foo.bar. It might then

transfer the message to mail2.foo.bar. The MTA at mail2.foo.bar will make a new MX record

look up, get the same result, and transfer the message back to mail3.foo.bar. In this way, an

infinite loop may occur with the message being sent back and forth between mail2.foo.bar

and mail3.foo.bar.

To avoid such loops, the DNS has a special facility. It stores, with every MX record, a

priority. In the example above, a DNS lookup for foo.bar may return the three MTA names

mail1.foo.bar, mail2.foo.bar and mail3.foo.bar. But they might each have a priority, for

example:

mail1.foo.bar 0

mail2.foo.bar 10

mail3.foo.bar 20

There is then a rule, that an MTA should never transport a message from an MTA with a

lower MX record priority value to an MTA with a higher MX record priority value. Thus,

mail2.foo.bar can never transport the message to mail3.foo.bar, and no loop will occur.
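The priority rule above can be sketched as a small filter. The function name `allowed_next_hops` is hypothetical, and treating "may forward only to a strictly lower priority value" as the rule (so that a host never forwards to itself or to an equal-priority peer) is an assumption of this sketch.

```python
# MX records for foo.bar as (host, priority value) pairs, taken from the example above.
MX_FOO_BAR = [("mail1.foo.bar", 0), ("mail2.foo.bar", 10), ("mail3.foo.bar", 20)]


def allowed_next_hops(mx_records, current_host):
    """Hosts the current MTA may forward to, lowest priority value first."""
    own_value = dict(mx_records).get(current_host)
    candidates = sorted(mx_records, key=lambda record: record[1])
    if own_value is None:
        # The current MTA is not listed itself, so every listed host is allowed.
        return [host for host, _ in candidates]
    # A listed MTA may only move the message towards a lower priority value.
    return [host for host, value in candidates if value < own_value]


from_mail3 = allowed_next_hops(MX_FOO_BAR, "mail3.foo.bar")
from_mail2 = allowed_next_hops(MX_FOO_BAR, "mail2.foo.bar")
from_mail1 = allowed_next_hops(MX_FOO_BAR, "mail1.foo.bar")
```

Because every hop strictly decreases the priority value, the message can be forwarded only a finite number of times, which is what rules out the loop between mail2.foo.bar and mail3.foo.bar.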

1.5 Mailbox names

Senders and recipients of e-mail are called mailboxes. Note that the recipient mailbox is a box

on the last MTA, from which the recipient can download messages. The mailbox is not a

storage area on the recipient computer, unless the recipient computer and the last MTA are the

same computer (see the figure on page 3).

A mailbox in SMTP usually has two names, a user-friendly name and an e-mail address.

For example, my user-friendly name is “Jacob Palme” and my e-mail address is

“[email protected]”. The two names are often written in sequence in the format shown by

these examples:

Jacob Palme <[email protected]>

"Nils B. Frändén" <[email protected]>

If the user-friendly name contains certain special characters, including periods followed by

spaces and non-latin letters, it must be enclosed in double quotes as shown above. However,

this rule is often broken, so it is advisable to write a mail program so that it accepts user-

friendly names without double-quotes.

As is shown by the examples above, the user-friendly name is not globally unique – two

different people may have the same user-friendly name. The e-mail address must be globally

unique (otherwise it would not be possible to know to which mailbox to deliver the mail). It

consists of a local mailbox name, the “@” character and a globally unique domain name.

(Some servers accept mail with an incomplete domain name, if the address is a local user to

the domain served by the server, but this practice is not recommended. It is much better to

always use globally unique e-mail addresses, because then the address of a user will be the

same independently of where it is sent from.)
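As a minimal sketch of the two-name format, Python's standard `email.utils` module can build and take apart such mailbox names. The names and the `example.org` addresses below are made up for the illustration.

```python
from email.utils import formataddr, parseaddr

# A name without special characters needs no quotes.
plain = formataddr(("Mary Smith", "mary@example.org"))

# A period in the user-friendly name makes formataddr add double quotes.
quoted = formataddr(("Nils B. Smith", "nils@example.org"))

# Parsing splits the two names again.
name, address = parseaddr(quoted)
local_part, _, domain = address.partition("@")

# A tolerant parser also accepts the (formally wrong) unquoted form.
lenient_address = parseaddr("Nils B. Smith <nils@example.org>")[1]
```

Here `plain` is `Mary Smith <mary@example.org>` and `quoted` is `"Nils B. Smith" <nils@example.org>`; the last line illustrates the advice above to accept user-friendly names that lack the required quotes.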

1.6 Message format

Envelope:

    MAIL FROM:<[email protected]>
    RCPT TO:<[email protected]>

Content, heading:

    Subject: RE: Applets in MHTML
    Date: Fri, 12 Jan 2001 09:34:09 -0800
    From: "Mary Smith" <[email protected]>
    To: "James Thorn" <[email protected]>

Content, body:

    We met, we parted, we met again. We did not know, then, that what we had was the best you can have in this life.

A message being sent from sender to recipient consists of an envelope and a content. The

content consists of a message heading and a body.

The envelope contains information used or generated during the transport of the message,

such as the e-mail address of the recipient and the e-mail address of the sender. The sender's

address is needed to be able to send a non-delivery notification if the message cannot be

delivered to the recipient.

The content contains the information which is sent from the original author to the final

recipient. It is, in theory, not meant to be changed during transmission. In reality, it is

sometimes changed. Certain information which logically belongs to the envelope is in reality

put in the heading. And the content is sometimes transformed, for example between two

different encoding formats, such as 8BIT and Quoted-Printable.

The body contains any kind of information, such as text, pictures, sound, video and/or file

attachments.

Some information is specified both on the envelope and in the heading, such as the sender

and the recipient. This information is, however, not always the same. The envelope shows

information pertaining to a particular transmission or step in transmission. The heading shows

information created by the sender and intended for the final recipient.

Example 1: If a person sends a message to a mailing list, and the mailing list forwards the

message to the members of the list, then the heading contains the original sender in the

“From:” field and the name of the mailing list in the “To:” field. The envelope contains

different information when transporting the message from the original sender to the mailing

list, and when transporting the message from the mailing list to the final recipients, as is

shown by the figure below:

Original sender to mailing list:

    Envelope:  MAIL FROM: <[email protected]>
               RCPT TO: <[email protected]>
    Heading:   From: Henry Smith <[email protected]>
               To: Flower lovers <[email protected]>

Mailing list to final recipient:

    Envelope:  MAIL FROM: <[email protected]>
               RCPT TO: <[email protected]>
    Heading:   From: Henry Smith <[email protected]>
               To: Flower lovers <[email protected]>
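The mailing list case can be sketched with a small data structure that keeps the envelope and the heading apart. The class `Transmission`, the function `expand_list` and all addresses are hypothetical; in particular, using a list-owner address as the new envelope sender is an assumption of this sketch, chosen so that non-delivery notifications would go to the list rather than to the original author.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Transmission:
    mail_from: str  # envelope sender (SMTP MAIL FROM)
    rcpt_to: str    # envelope recipient (SMTP RCPT TO)
    heading: dict   # heading fields as written by the author


def expand_list(incoming, list_owner, members):
    """Re-envelope one incoming list message for every list member."""
    return [replace(incoming, mail_from=list_owner, rcpt_to=member) for member in members]


to_list = Transmission(
    mail_from="henry@sender.example",
    rcpt_to="flowers@lists.example",
    heading={"From": "Henry Smith <henry@sender.example>",
             "To": "Flower lovers <flowers@lists.example>"},
)
to_members = expand_list(
    to_list,
    list_owner="flowers-owner@lists.example",
    members=["ann@one.example", "bo@two.example"],
)
```

Every outgoing copy gets a new envelope, while the heading is passed on untouched, so the final recipients still see the original author in "From:" and the list name in "To:".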

Example 2: If a message is split into copies to be delivered to different recipients, then the envelope of each copy contains the address of the recipient for that copy. A message may, for example, first be sent with two recipients from the sender UA to the first MTA. The first MTA may then split the message into two copies for the two recipients. In the second step of transmission, the envelope then contains only the recipient for that transmission; see the picture below:

[Figure: The originator sends the message to the first MTA with two RCPT TO recipients on the envelope. The first MTA sends one copy to the MTA for the first recipient and one copy to the MTA for the second recipient, each with a single RCPT TO recipient on the envelope.]
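A minimal sketch of the splitting step, assuming that the first MTA groups the envelope recipients by the domain part of their addresses; the function name `split_by_domain` and the addresses are hypothetical.

```python
from collections import defaultdict


def split_by_domain(mail_from, recipients):
    """Return one envelope per recipient domain, so each next-hop MTA gets one copy."""
    groups = defaultdict(list)
    for recipient in recipients:
        groups[recipient.rpartition("@")[2]].append(recipient)
    return [
        {"MAIL FROM": mail_from, "RCPT TO": rcpts}
        for _, rcpts in sorted(groups.items())
    ]


envelopes = split_by_domain(
    "alice@a.example",
    ["bob@b.example", "carol@c.example", "dave@b.example"],
)
```

Recipients in the same domain share one copy, so the message for b.example travels once with two recipients on its envelope, and the copy for c.example carries only its own recipient.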

The body can consist of several body parts, for example a text and an attachment or an HTML-formatted

text and embedded pictures. Each body part can also consist of several body parts.

The figure below shows a complex message with body parts inside body parts; this particular

structure is very common:

Content-Type: Multipart/related
    Content-Type: Multipart/alternative
        Content-Type: Text/plain
        Content-Type: Text/html
    Content-Type: Image/gif

When the body consists of several parts, each part has its own content-heading. A content-

heading contains information about the content part, such as content-type and encoding

method. The set of header fields allowed in a content-heading is a subset of the set of headers

allowed in a message-heading. All the header fields allowed in a content-heading have names

beginning with the string Content- such as “Content-Type”, “Content-Language”, etc.
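A nested structure of this kind can be built with the MIME classes in Python's standard `email` package. This is a sketch only: the texts, the Content-ID `logo@example.org` and the placeholder bytes standing in for a real GIF image are assumptions.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Placeholder bytes standing in for a real GIF file (assumption for the sketch).
FAKE_GIF = b"GIF89a" + bytes(16)

# Inner part: the same text in two formats.
alternative = MIMEMultipart("alternative")
alternative.attach(MIMEText("Hello in plain text.", "plain"))
alternative.attach(MIMEText('<p>Hello <img src="cid:logo@example.org"></p>', "html"))

# The picture which the HTML version refers to through its Content-ID.
image = MIMEImage(FAKE_GIF, _subtype="gif")
image["Content-ID"] = "<logo@example.org>"

# Outer part: ties the alternatives and the picture together.
related = MIMEMultipart("related")
related.attach(alternative)
related.attach(image)

# walk() visits the message and all nested body parts, depth first.
structure = [part.get_content_type() for part in related.walk()]
```

Each attached part carries its own content-heading, and `related.as_string()` would show the boundaries that separate the parts in the transmitted message.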

1.7 E-mail: Message Heading

1.7.1 The Internet Message Format

The format of the messages themselves (corresponding to the P2/P22 protocols of X.400) is

specified in RFC 822 [17], RFC 1036 [19], RFC 1123 [20], and the multi-purpose Internet

mail extension (MIME) standards [21, 22].

The figure below shows an example of a message heading according to the RFC 822

standard. Note that what is shown is not some legible print-out of the heading, but the actual

contents, since RFC 822 uses readable IA5 characters in the heading.

From [email protected] Wed Nov 4 04:16 MET 1992
Received: from sunic.sunet.se by heron.dafa.se (16.6/SiteCap-3.0)
        id AA09605; Wed, 4 Nov 92 04:16:24 +0100
Received: from ics.uci.edu by sunic.sunet.se (5.65c8/1.28)
        id AA25221; Wed, 4 Nov 1992 04:15:03 +0100
Received: from ics.uci.edu by q2.ics.uci.edu id aa14794; 3 Nov 92 13:47 PST
Received: from USENET by q2.ics.uci.edu id aa14789; 3 Nov 92 13:46 PST
From: "Paul.Rarey" <[email protected]>
Subject: Re: X400 address
Message-ID: <[email protected]>
Encoding: 35 TEXT, 12 TEXT SIGNATURE
X-Mailer: Poste 2.0
Date: 3 Nov 92 21:46:13 GMT
To: [email protected],
        Piet Beertema <[email protected]>
cc: [email protected]

... Text of the message ...

Example of an RFC 822 message heading.

User names in RFC 822 can have both a user-friendly name and an e-mail address (see

page 8).

RFC 822 specifies a heading format, which is intended to be readable for both humans and

computers. It is used for communication between the sender and the recipient and is not used

during transmission. The most important fields in Internet mail headings are:

Received: Contains a trace of the hosts through which a message has passed. It is mainly used when investigating problems with the mail transfer. Such a trace could be used for loop control (as in X.400) but in practice usually is not. Many UAs suppress these lines when showing the header to their users.

From: The e-mail address of the author of the message.

To:, Cc: and Bcc: The recipient(s). Note that this is not the recipient(s) to be used when

delivering the mail (those are given in the SMTP protocol). This is

information to be read by the recipient(s) or their mail program. It can

even contain names which are not proper e-mail addresses.

Message-ID: A globally unique identity code for the current message, which can be referred to in the “In-Reply-To” field of replies and used to eliminate duplicates of the same message arriving via different routes.

Reply-To: Used to indicate that personal replies to a message are to be sent to someone other than the name in the “From” field. Note that the name in this field should not be used for “group replies,” that is, replies intended for the group of recipients or the mailing list that received the message being replied to.

Followup-To: Similar to “Reply-To” but used for group replies. The value must be a Usenet newsgroup name, not an e-mail address. (This is not part of RFC 822; it is defined in the Usenet News standard, RFC 1036 [19].)

In-Reply-To: Used to coordinate replies with the original message. An intelligent

mail software can use this field to help the user find the original

message of a reply.

References: Similar to “In-Reply-To.” It is used mainly in Usenet News to indicate

references between postings to newsgroups.

Date: The date, time, and time zone when the message was sent.

Subject: A title line describing the topic of the message.

The RFC 822 heading often contains additional fields not defined in the RFC 822 standard.

Some of them may be defined in other standards, while others are not standardized at all. In

the heading above, the fields “Encoding” and “X-Mailer” are not part of RFC 822.

The RFC 822 heading ends with a blank line, which marks the beginning of the text of the

message.
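How a program separates the heading from the text at the blank line can be sketched with the parser in Python's standard `email` package. The message below is a shortened, made-up variant of the heading example, with hypothetical `example` host names and addresses.

```python
from email import message_from_string

RAW = (
    "Received: from relay.example.net by mail.example.org; Wed, 4 Nov 1992 04:16:24 +0100\n"
    "Received: from origin.example.com by relay.example.net; Wed, 4 Nov 1992 04:15:03 +0100\n"
    'From: "Paul Example" <paul@example.com>\n'
    "Subject: Re: X400 address\n"
    "X-Mailer: Poste 2.0\n"
    "To: list@example.org\n"
    "\n"                                   # the blank line ends the heading
    "... Text of the message ...\n"
)

message = message_from_string(RAW)

subject = message["Subject"]           # field names are matched case-insensitively
trace = message.get_all("Received")    # a field may occur several times
extra = message["X-Mailer"]            # non-standard fields are kept as well
body = message.get_payload()           # everything after the blank line
```

The parser does not need to know a field in advance, which is why headings can carry additional fields such as “X-Mailer” without breaking older software.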

1.7.2 References between Messages and Global Message IDs

There is often a need to let a message refer to previously sent messages: for example, in

replies. It is an important facility for a recipient of a reply to be able to easily ask his

electronic mail system for the message to which the reply replies. It is also useful for users of

electronic mail systems to be able to browse through whole conversations. In order to be able

to transfer a reply link between two messages which are sent at different times, it is useful if

every message has a globally unique identifying code. This is a code which no other message

has or is going to get.

E-mail has some kinds of identifying codes in the heading and sometimes also on the

envelope. The code on the envelope, however, is shortlived, and long-lived references

between messages are handled using the identifying code in the heading. A globally unique

identifying code in the heading is usually produced by combining two subfields. One subfield

is a valid electronic mail name (usually of the originator, but some other valid name can also

be used) plus an identifying code which is unique relative to this electronic mail name. Since

electronic mail names and addresses must be unique (otherwise it would not be possible to

deliver a message to its recipient), this will then produce a globally unique identifying code

for the message itself. Often, these parts are separated with an asterisk. Example:

200107081059*[email protected]. The part before the asterisk is usually an encoding of the date

and time when the message was sent in some format. Another variant is to use a sequential

numbering of all messages sent by [email protected]. A disadvantage with a sequential number is

that the last used number must be stored, and if this stored number is lost, duplicate

Message-IDs may be created. Therefore, the date/time method is usually better.
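The date/time method can be sketched as follows. The function name `make_message_id` and the mailbox `user@example.org` are hypothetical, and the minute resolution simply mirrors the example above; a real system would need a finer resolution or an extra counter so that two messages sent within the same minute do not collide.

```python
from datetime import datetime


def make_message_id(mailbox, sent_at):
    """Combine a date/time code with a mail name, separated by an asterisk."""
    return f"{sent_at:%Y%m%d%H%M}*{mailbox}"


message_id = make_message_id("user@example.org", datetime(2001, 7, 8, 10, 59))
```

Python's standard library also offers `email.utils.make_msgid()`, which builds a Message-ID in its own format from the current time and other locally unique values.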

E-mail has three different kinds of references between messages, all indicated by use of the

Message-ID:

In-Reply-To: for ordinary replies.

References: for other kind of references.

Supersedes: when the new message is a replacement of an old message, for

example, a new version of a text under development or containing a

copy of the original message with an error corrected. Supersedes is not

yet standardized; it is commonly used in Usenet News, but less

often in e-mail.

A message cannot have an In-Reply-To reference to more than one previous message, while

the number of messages it refers to with References: or Supersedes: can be more than one. A consequence

of this is that a conversation which is created using the In-Reply-To field will always

have a tree structure, something which is not true for conversations created by References:

and Supersedes:. Another consequence is that two different conversations can be merged into

one if the References: and Supersedes: fields are used to create conversations. Such merging

is not possible if only the In-Reply-To field is used.

The figure below shows how the Message-ID can be used to connect a reply to the original

message, using a data base of Message-ID-s with pointers to the messages they refer to in the

mailbox data base of both the sender and the recipient.

[Figure: The sender and the recipient each keep a message data base and an ID file. The reply B ("Reply to A") is connected to the original message A through the ID file.]

Use of Message-ID to correlate replies with the original message.
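The ID file can be sketched as a dictionary from Message-ID to message. Messages are represented as plain dictionaries here, and the function names and the `example.org` IDs are hypothetical.

```python
def build_id_file(messages):
    """Map each Message-ID to its message, like the ID file described above."""
    return {message["Message-ID"]: message for message in messages}


def original_of(reply, id_file):
    """Return the message a reply refers to, or None if it is not stored."""
    return id_file.get(reply.get("In-Reply-To"))


def conversation(root_id, messages):
    """Message-IDs of the whole reply tree under root_id, depth first."""
    children = {}
    for message in messages:
        children.setdefault(message.get("In-Reply-To"), []).append(message["Message-ID"])
    order, stack = [], [root_id]
    while stack:
        current = stack.pop()
        order.append(current)
        stack.extend(reversed(children.get(current, [])))
    return order


mailbox = [
    {"Message-ID": "<a@example.org>", "Subject": "A"},
    {"Message-ID": "<b@example.org>", "In-Reply-To": "<a@example.org>", "Subject": "B"},
    {"Message-ID": "<c@example.org>", "In-Reply-To": "<a@example.org>", "Subject": "C"},
    {"Message-ID": "<d@example.org>", "In-Reply-To": "<b@example.org>", "Subject": "D"},
]
id_file = build_id_file(mailbox)
original = original_of(mailbox[1], id_file)
thread = conversation("<a@example.org>", mailbox)
```

Since each message names at most one message in In-Reply-To, the result is always a tree, which is why a simple depth-first walk is enough to list a whole conversation.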

When you create a reply link on a message which contains forwarded body parts, you

should carefully consider to which heading the reply link should refer. This is illustrated in

the figure below.

[Figure: Person A sends a message (To: B, ID: 12345, text body) to person B. B forwards it to person C as a body part of a new message (To: C, ID: 67890, text body). C sends a reply (To: A, B, ID: 4711, text body).]

Problem with references to the wrong Message-ID.

Person A writes a message to person B. B forwards the message to C, using the forwarding

mechanism of including the whole text of the original message as a body part of the

forwarded message. C then writes a reply to the original message and sends the reply to both

A and B. It is very important for C to think carefully of whether to use the ID of the original

message (12345, in the example) or the ID of the forwarding message (67890, in the

example). This is important because A has never seen the forwarding message. This means

that a reference in the reply, saying that it is a reply to 67890, will not be understood by A.

Good message systems should be designed to protect users from such mistakes.

Of course, the intention of C may be to reply also or mainly to the text body which B

added when the message was forwarded to C. But if that is the case, C should either send the

reply only to B or forward the whole message on to A, since a reply sent to A on the forwarded

body part in 67890 will otherwise not be understood by A.

A person will often get several copies of the same message, forwarded by different users

and distribution lists. It is an important service to the user that his electronic mail system be

able to correlate these copies, so that the person will not have to read the texts more than once

and so that the person can find all comments, appendices, and recipients of the message. Such

correlation should also be done when the recipient receives the same message both directly

and included as forwarded body parts in other messages. The globally unique Message-ID is

the code which is used to provide such correlations. Note the use of the word “correlate,” not

the word “merge.” Since all copies of the same message are not always identical, information

can be destroyed if they are merged carelessly.

1.8 E-mail: Message Body

1.8.1 Types in Internet Mail

1.8.1.1 MIME Introduction

The Internet mail standards used before 1992 could handle only text in the 7-bit ASCII

alphabet and did not allow the additional body-part type facilities available in X.400.

However, in 1992, the Internet mail standards added facilities in an extension called MIME.

MIME is defined in RFC 2045 [21] to RFC 2049 [22]. A good tutorial on MIME is given in

[25]. There is an FAQ² on MIME [44] which includes a list of known MIME

implementations.

MIME allows Internet mail to contain the following:

• Multiple objects in one message.

• Unlimited line length and message length.

• Character sets other than IA5 (7-bit ASCII). In particular, the ISO 8859-1 and the ISO 10646 (Unicode) character sets are important.

• Binary and application-specific files.

• Diagrams, pictures, voice, video, and multimedia in messages.

• References to files, which can be retrieved automatically through the net when the recipient wants to read the message. The contents of the file are thus not transported with the message.

² FAQ (frequently asked questions) is a document containing answers to questions often asked in Internet discussion groups. There are many FAQs on various topics, and they often contain a lot of useful information.

With MIME, it is possible to have several body parts representing the same information in

different formats. The recipient or his client software can then choose the version which it is

best capable of displaying. For example, a recipient with an IA5-terminal can see a message

in this format, while a recipient with an ISO 8859-1 terminal can see a better version of the

message.

1.8.1.2 MIME Encoding of Data

There was a problem when MIME was defined: it was essential that existing e-mail systems

be able to handle MIME messages in a reasonable way without modification. However, some

previous Internet e-mail systems could not handle messages with 8-bit characters in the text or

with unlimited line length, binary information, etc. in the body.

Because of this, MIME encodes the new body parts in a format which looks like 7-bit

ASCII to old mail software. This is not a very neat solution, but it was necessary since no

extension facility for body parts was included in the old Internet mail standards. MIME

encodes additional information in either of two ways so that it looks like 7-bit ASCII. These

two methods are called base64 and quoted-printable.

Base64 uses four ASCII characters to represent three 8-bit bytes. The 24 bits of the three

8-bit bytes are split into four groups of 6 bits, and each such group is encoded in a character set

which has 64 different values. These 64 values are those characters in 7-bit ASCII which are

least likely to be corrupted. (This is more fully described in the chapter about coding.)
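The grouping into 6-bit values can be sketched by hand and compared with Python's standard `base64` module. The helper `encode_three_bytes` is hypothetical and deliberately handles only complete groups of three bytes; padding of shorter final groups is left out.

```python
import base64

# The 64 characters used by base64, in value order 0..63.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_three_bytes(chunk):
    """Encode exactly three bytes as four base64 characters."""
    if len(chunk) != 3:
        raise ValueError("this sketch handles only complete 3-byte groups")
    bits = int.from_bytes(chunk, "big")  # the 24 bits as one number
    groups = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[group] for group in groups)


by_hand = encode_three_bytes(b"Tes")
by_library = base64.b64encode(b"Tes").decode("ascii")
```

Both give `VGVz`: four output characters for three input bytes, so base64 makes the data about one third larger.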

Quoted-Printable represents those bytes which have an exact correspondence in 7-bit

ASCII, with their 7-bit ASCII values. Other characters are encoded as an “=” character,

followed by two digits with the hexadecimal value of the byte. An exception is the “=”

character itself, which is coded as “=3D.”
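A simplified quoted-printable encoder can be sketched as below. The function `qp_encode` is hypothetical and leaves out parts of the real encoding, notably the breaking of long lines and the special treatment of spaces at the end of a line. The result is checked by decoding it again with Python's standard `quopri` module.

```python
import quopri


def qp_encode(data):
    """Encode bytes so that only 7-bit ASCII characters remain (simplified)."""
    out = []
    for byte in data:
        is_line_break = byte in (10, 13)
        if byte == ord("=") or byte > 126 or (byte < 32 and not is_line_break):
            out.append(f"={byte:02X}")  # "=" plus two hexadecimal digits
        else:
            out.append(chr(byte))       # printable ASCII is passed through
    return "".join(out)


original = "AE=Ä, OE=Ö.".encode("iso-8859-1")
encoded = qp_encode(original)
decoded = quopri.decodestring(encoded.encode("ascii"))
```

With ISO 8859-1 as the character set, Ä is the byte C4 and Ö is the byte D6, so `encoded` becomes `AE=3D=C4, OE=3D=D6.` and most of the line stays readable without any decoding.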

1.8.1.3 Example of a Complete MIME-Encoded Message

The figure below shows a complete example of a MIME message as it is transmitted (not as it

is shown to the user). This message contains two body parts. Each of the body parts contains

the same text in ISO 8859-1.

Test message containing 8-bit characters.

AE=Ä, OE=Ö.

The two lines above are transmitted in both body parts in the MIME message below, but

the text is encoded as quoted-printable in the first body part and as base64 in the second body

part.

Return-Path: <[email protected]>
Date: Sun, 26 Sep 1993 18:49:01 +0100 (MET)
From: Jacob Palme DSV <[email protected]>
Subject: A MIME message
To: Lars Enderin <[email protected]>
Message-Id: <3.85.9309261822.A27024-0200000@ester>
Mime-Version: 1.0
Content-Type: MULTIPART/ALTERNATIVE; BOUNDARY="1430317162"

--1430317162
Content-Type: TEXT/PLAIN; CHARSET=ISO 8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE

Test message containing 8-bit characters.
AE=3D=80, OE=3D=85.

--1430317162
Content-Type: TEXT/PLAIN; CHARSET=ISO 8859-1
Content-Transfer-Encoding: BASE64

VGVzdCBtZXNzYWdlIGNvbnRhaW5pbmcgOC1iaXQgY2hhcmFjdGVycy4KQUU9
gCwgT0U9hS4K

--1430317162--

A complete MIME message.

As can be seen from the example in the figure above, quoted-printable has an advantage over

base64 because a recipient whose mail system is not MIME compatible will still be able to

understand much of the content, especially the ordinary English text. In the example, quoted-

printable is also shorter: only 61 characters are required compared to 73 characters for base64.

Base64 is, however, more suitable for pure binary data, such as a graphic bit image or an

encoded sound.
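That the two body parts of the example carry the same bytes can be checked with Python's standard `quopri` and `base64` modules. The two constants below are copied from the example message; treating each body part as ending with a line break is an assumption of this sketch.

```python
import base64
import quopri

# Body of the first part (quoted-printable), as transmitted.
QP_PART = b"Test message containing 8-bit characters.\nAE=3D=80, OE=3D=85.\n"

# Body of the second part (base64), as transmitted on two lines.
B64_PART = (
    b"VGVzdCBtZXNzYWdlIGNvbnRhaW5pbmcgOC1iaXQgY2hhcmFjdGVycy4KQUU9"
    b"gCwgT0U9hS4K"
)

from_qp = quopri.decodestring(QP_PART)
from_b64 = base64.b64decode(B64_PART)
same_bytes = from_qp == from_b64
```

Both parts decode to the same byte sequence, in which the two 8-bit characters are the bytes 0x80 and 0x85. In ISO 8859-1 itself, Ä and Ö are 0xC4 and 0xD6, so the bytes in this particular example appear to come from a different 8-bit character set than the one named in its CHARSET parameter.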

In addition to base64 and quoted-printable, MIME also allows the encodings 7bit, 8bit and

binary. These are not really encodings, since they represent uncoded data. Because of this,

8bit and binary should not be transmitted via mailers which are not MIME-compatible. To

transport such data uncoded, extensions to the SMTP standard for message transport are also

needed.

1.8.1.4 MIME Heading Extensions

MIME defines the following extended fields to the RFC 822 heading:

Content-Type: Specifies the type and subtype of the data in the body.

Content-Type: TEXT    Specifies textual information in several different character sets.

Content-Type: MULTIPART    Specifies that the body contains more than one body part. Each body part can have additional heading fields at the beginning of the body part.

Content-Type: APPLICATION    For application-specific or binary data. Of special importance is APPLICATION/POSTSCRIPT.

Content-Type: MESSAGE    For an encapsulated mail message.

Content-Type: IMAGE    For a bitmapped picture using, for example, the graphics interchange format (GIF) or joint photographic experts group (JPEG) formats.

Content-Type: AUDIO    For sound. (Note that the word “voice” is used in X.400. However, X.400 voice is probably not intended to be restricted only to spoken sounds.)

Content-Type: VIDEO    For video using, for example, the motion picture experts group (MPEG) format. The video may, but need not, contain a sound track.

Content-Transfer-Encoding:    Specifies how the data is encoded. Encoding type may also be given in the content-type heading field.

MIME-version:    To indicate that MIME is used and which MIME version is used.

Content-ID:    ID code for the content. Can be used to allow one body part to refer to another body part in another message or for references between the body parts in a message.

Content-Description:    Contains a textual description of the content.

IANA (Internet Assigned Numbers Authority), the internet global registration authority,

maintains a register of MIME content types. IANA requires a publicly available description of

the new format, or, alternatively, that public domain viewers for the new format are available.

A consequence of this is that some commercial companies have been unwilling to register

the formats of their products. In such cases, a generic Content-Type such as APPLICATION/OCTET-STREAM has to be used

instead. The disadvantage with this is that the receiving client software cannot automatically

start a viewer for the new format when messages in that format arrive.

1.8.1.5 Multipart messages in MIME

The MULTIPART attribute in a MIME heading specifies that the body contains more than

one body part. Each body part can have additional heading fields at the beginning of the body

part, including recursive use of the multipart attribute to include MULTIPARTs as parts in

higher level MULTIPARTs.

Of special interest are

MULTIPART/MIXED for several parts of different types;

MULTIPART/ALTERNATIVE for the same information in more than one encoding;

MULTIPART/PARALLEL as mixed, but to be viewed at the same time (for example, voice in parallel with a drawing);

MULTIPART/DIGEST to indicate that this is a collection of several messages written by different authors; and

MULTIPART/RELATED for several files which are to be combined into one document or set of related documents. Used, for example, to send HTML, SGML or XML with pictures, applets, etc., as separate files.

1.8.1.6 The Multipart/Related Content Type

The Multipart/related content type is designed for sending several files which are

related by URL-links. It is used, for example, to send HTML, SGML and XML with

embedded pictures or applets as separate files.

Each file is a separate body part. Each body part is labelled by either Content-ID or Content-Location. A URL in one body part that refers to another body part is either of the URL type "cid:", which refers to a Content-ID, or any kind of URL (absolute or relative) that matches the Content-Location of the referenced body part.

Example (abbreviated):

Content-Type: Multipart/related              The compound object of the HTML text and the embedded message.

    Content-Type: Text/html                  The main text in HTML format.
    <IMG SRC="cid:1*[email protected]">      Link to an embedded image using a "cid:" type URL.
    <IMG SRC="picture.gif">                  Link to an embedded image using a relative URL.

    Content-Type: Image/gif
    Content-ID: 1*[email protected]          The first embedded image, identified by a Content-ID.

    Content-Type: Image/gif
    Content-Location: picture.gif            The second embedded image, identified by a Content-Location URL.

Since some mailers do not support this, messages are usually sent using multipart/alternative,

with plain text in the first branch and HTML in the second branch. This can be done in two

ways:

1.8.1.6.1.1 With the multipart/alternative inside the multipart/related:

Content-Type: Multipart/related
    Content-Type: Multipart/alternative
        Content-Type: Text/plain
        Content-Type: Text/html
    Content-Type: Image/gif

1.8.1.6.1.2 With the multipart/alternative outside the multipart/related:

Content-Type: Multipart/alternative
    Content-Type: Text/plain
    Content-Type: Multipart/related
        Content-Type: Text/html
        Content-Type: Image/gif

Some mailers send messages using each of these methods, so a good mailer will have to be

able to receive messages in both formats.
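As an illustration of the second layout (multipart/alternative outside multipart/related), here is a minimal sketch using the `email` package of the Python standard library. The subject text, the Content-ID `image1@example.invalid` and the placeholder image bytes are assumptions made up for this example, and `outline` is a hypothetical helper that only prints the nesting of content types.

```python
from email.message import EmailMessage


def outline(part, depth=0):
    """Return the nesting of MIME content types as indented lines."""
    lines = ["    " * depth + part.get_content_type()]
    if part.is_multipart():
        for sub in part.iter_parts():
            lines.extend(outline(sub, depth + 1))
    return lines


msg = EmailMessage()
msg["Subject"] = "HTML with an embedded image"

# First branch of the alternative: plain text for mailers without HTML support.
msg.set_content("Plain text for mailers without HTML support.")

# Second branch: HTML that refers to the image through a "cid:" URL.
msg.add_alternative(
    '<html><body><img src="cid:image1@example.invalid"></body></html>',
    subtype="html",
)

# Turn the HTML branch into multipart/related and attach the image to it.
html_part = msg.get_payload()[1]
html_part.add_related(
    b"GIF89a",  # placeholder bytes, not a real image
    maintype="image",
    subtype="gif",
    cid="<image1@example.invalid>",
)

print("\n".join(outline(msg)))
```

The printed outline shows multipart/alternative at the top, with text/plain as its first branch and a multipart/related (text/html plus image/gif) as its second.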

1.8.1.7 The Message Content Type

The MESSAGE content type is used to forward a message within another message. For

example, you can forward a message you have received, as one body part, with your own

comment as another body part. MESSAGE has the following subtypes:

• MESSAGE/RFC822 when you encapsulate a normal Internet mail message, formatted according to the RFC822 or MIME formats.

• MESSAGE/PARTIAL if this is one part of a large message, which you have split into several smaller messages during transport. This format includes information to help the receiving mail client merge the parts together into a complete message again.

• MESSAGE/EXTERNAL-BODY; ACCESS-TYPE= to send, instead of a message, just a reference to where the message can be retrieved. ACCESS-TYPE can indicate various ways of retrieving the actual content, like various variants of FTP etc.

1.8.1.8 MIME Heading Character Sets

MIME allows extended character sets also in message headings. The encoding is rather ugly

and some implementations do not support it.

1.8.2 In-line or attachments

In the user interface, a MIME message can either be displayed as a sequence of text passages

and images, or MIME can be used to send attachments to a primary textual message. The

attachments are then not viewed unless the user explicitly asks for them. In practice, the latter

usage has been most common. RFC 1806 specifies an e-mail header which can be used to

indicate, separately for each body part, whether this body part is to be shown inline or as an

attachment. The format is Content-Disposition: followed by either the word

inline or the word attachment. A file name parameter can also be included,

indicating a recommended file name if the attachment is stored in a file.

Note that if the Content-Type: Multipart/related is used to send a document

with inline objects like images, then the techniques described in the MHTML standard should

be used. Content-Disposition can be used, but should be ignored by mailers which

understand Content-Type: Multipart/related.

1.8.3 Sending HTML in e-mail

MIME was planned in order to allow messages to contain pictures, formatted text, sound, etc.

In practice, MIME has mostly been used to send such information as attachments. The

success of the World Wide Web (WWW) means that the HTML (Hypertext Markup

Language) has become a very successful format for documents with formatted text and inline

graphics. There are two ways of sending HTML in e-mail:

(1) Use the message/external-body MIME body part, which means that the e-mail

message only contains the URL (Uniform Resource Locator) of the actual message,

and the recipient mailer will then use this URL to retrieve the message from the net

just like viewing an ordinary web page.

(2) Send the HTML text within the message, using the Content-Type Text/HTML.

The HTML format often uses more than one file to convey a message. For example,

graphics, frames and applets are usually stored in separate files, which are combined by the

web browser when displaying the document. Thus, there is a need to send HTML as a set of

related body parts which together form the document. This is done using the Content-

Type: Multipart/related. The main message will then reference the other parts

either through their Content-IDs or through their URLs. If URLs are used, the other body

parts are given a heading Content-Location which indicates the URL of this body

part.

The MHTML standard is not only useful for sending HTML in e-mail. It can also be used

for archiving of web pages. By archiving web pages in the MHTML format, all information in

a web page can be archived in a single file, instead of separate archiving of the HTML text

and the in-line images and other in-line data.

1.9 SMTP – The Protocol for Message Transport

The Internet protocol for message transport is SMTP (except for the transmission from the

final MTA to the recipient UA, where other protocols are usually used). SMTP is a protocol

for the transmission of a message from the original UA to the first MTA and from one MTA

to another MTA. SMTP is thus a protocol for transmission one step on the way from

originator to recipient. However, some information, such as the e-mail address of the sender,

is usually conveyed from one SMTP step to the next SMTP step.

Since SMTP is used to control the transmission, it specifies most of the envelope

information. Some envelope information is however specified in the message heading.

SMTP is a session-oriented protocol. By that is meant that a connection is established

between two agents, and then a series of interactions takes place between the two agents. Here

is an example of a simple SMTP session:

Sending agent command                       Responding agent response

Opens a connection to port 25 of the        Listens for connections on port 25,
receiving host                              acknowledges new connections
HELO dsv.su.se                              250 nexor.co.uk
MAIL FROM: <[email protected]>              250 OK
RCPT TO:<[email protected]>                 250 OK
RCPT TO: <[email protected]>                250 OK
DATA                                        354 Start mail input
... the lines of text in the message ...
.                                           250 OK
QUIT                                        221 nexor.co.uk service closing

An SMTP session can include the transmission of more than one message. In that case, a

new MAIL FROM command takes the place of the QUIT command in the figure above.

The basic SMTP commands are:

“HELO” opens an SMTP session.

“MAIL FROM:” starts the sending of a new message by specifying the sender. Its value is

an e-mail address enclosed in angle brackets.

“RCPT TO:” is repeated once for every recipient. In the original SMTP protocol, an

acknowledgement from the server (the 250 response code) was required after every recipient,

before the next recipient address could be sent. SMTP has later on been extended with an

optional facility to send several recipients in sequence.

“QUIT” indicates the end of an SMTP session; a server receiving a QUIT command

should close the connection.

“DATA” signifies the start of the body of the message. The server response “354”

indicates that the server expects more. After DATA, the body is sent. According to the

original SMTP standard, the body must be a number of text lines, and finished by a line

containing a single period. Line breaks in SMTP must always be a carriage return character

plus a linefeed in sequence (CRLF) and the end of the body is thus signified by CRLF, a

single period, and a second CRLF. In order to allow the sending of messages containing lines

with only a single period, all lines which start with a period have an extra period sent at the

beginning of the line.

Thus, the following text:

The next sentence will start with a period:
.line starting with a period
The next sentence will contain a single period
.

Is in SMTP sent as:

The next sentence will start with a period:
..line starting with a period
The next sentence will contain a single period
..
.
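The period-doubling rule described above can be sketched in a few lines of Python. The function names `dot_stuff` and `dot_unstuff` are hypothetical helpers chosen for this illustration, and the sketch assumes the message is already available as a list of text lines without line terminators.

```python
def dot_stuff(lines):
    """Prepare message lines for transmission after the DATA command."""
    # Any line starting with a period gets one extra period in front.
    stuffed = ["." + line if line.startswith(".") else line for line in lines]
    # Every line ends with CRLF; a final lone period marks the end of the body.
    return "".join(line + "\r\n" for line in stuffed) + ".\r\n"


def dot_unstuff(data):
    """Recover the original lines on the receiving side."""
    lines = data.split("\r\n")
    if lines[-2:] != [".", ""]:
        raise ValueError("data does not end with CRLF . CRLF")
    # Remove the terminator, then drop one leading period where present.
    return [line[1:] if line.startswith(".") else line for line in lines[:-2]]


text = [
    "The next sentence will start with a period:",
    ".line starting with a period",
    "The next sentence will contain a single period",
    ".",
]
wire = dot_stuff(text)
print(wire)
print(dot_unstuff(wire) == text)
```

Because every transmitted line that begins with a period has been given an extra one, the only line consisting of exactly one period is the terminator, so the receiver can find the end of the body unambiguously.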

In addition to the HELO, MAIL FROM:, RCPT TO:, DATA and QUIT commands, the

original SMTP protocol contained some additional commands. Most of these are either not

used or not used very much today. Examples of these commands are:

VRFY to ask the server to verify that it has a mailbox with a certain e-mail address. A

server may also give a positive response if it is willing to deliver a message to this recipient,

even though the recipient mailbox is not on the same host as the server. The parameter to

VRFY can be an abbreviated name; the response should then be the full e-mail address, if the

abbreviation is non-ambiguous.

TURN is a command which within a single SMTP session switches the roles between

sender and recipient, so that the agent which started the SMTP session can receive mail from

the MTA it connected to. TURN is useful if connection times are long, such as for dial-up

modem connections. The increasing use of fixed lines has reduced the needs for the TURN

command.

The most common digits in the SMTP reply codes are:

1.9.1.1.1.1 First digit:

1 Positive preliminary (not used in SMTP)

2 Success

3 Ready but requires additional info

4 Transient failure

5 Permanent negative

1.9.1.1.1.2 Second digit:

0 Syntax (error)

1 Information requested in reply

2 Transport service problem

5 Application-specific problem

1.9.1.1.1.3 Examples of reply codes to the MAIL FROM command:

250 Originator accepted

452 Out of local storage

500 Command syntax error
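A sending agent typically only needs the first digit of a reply code to decide what to do next. The sketch below maps the first digit to the meanings listed above. The `sender_action` function and its return strings are assumptions added for illustration, derived from the words "transient" and "permanent", not part of the protocol text.

```python
FIRST_DIGIT_MEANING = {
    1: "positive preliminary",
    2: "success",
    3: "ready but requires additional info",
    4: "transient failure",
    5: "permanent negative",
}


def classify(code):
    """Return the meaning of the first digit of a three-digit reply code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an SMTP reply code: {code}")
    return FIRST_DIGIT_MEANING[code // 100]


def sender_action(code):
    """Hypothetical policy for a sending agent, based on the first digit only."""
    first = code // 100
    if first == 2:
        return "continue"
    if first == 3:
        return "send more input"
    if first == 4:
        return "retry later"
    if first == 5:
        return "give up and report failure"
    return "wait for the next reply"


for code in (250, 354, 452, 500):
    print(code, classify(code), "->", sender_action(code))
```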

A number of proposals for extensions to SMTP have been developed. These extensions allow

the sending of delivery-report requests, binary and 8-bit data, and an SMTP sender can check

if a very large message can be received before sending it. These extensions are only to be

used by extended SMTP servers, so there is also a protocol for two SMTP servers to query

each other’s capabilities at the start of an SMTP session. This protocol thus provides an

extension mechanism to SMTP and is called ESMTP (Extended Simple Mail Transfer

Protocol). The method of protocol extension used by ESMTP is that the server gives the client

a list of which ESMTP extensions it supports, and the client must then only use those

extensions which the server says that it supports. This method has many advantages compared

to the method of indicating protocol level (like HTTP version 0.9 or 1.0 or 1.1) because a

server can choose to provide certain extensions without having to support other less useful

extensions.

In plain SMTP, a session is started by the client sending the HELO command. A client

which supports ESMTP sends the EHLO command instead of the HELO command. The

EHLO command only indicates that the client understands ESMTP, not that the client is

capable of handling any particular ESMTP extension.

The table below lists the most important ESMTP extensions:

Service extension | Keyword | Parameters | Verb | RFC
Send | SEND | none | SEND | 821, 1869
Send or Mail | SOML | none | SOML | 821, 1869
Send and Mail | SAML | none | SAML | 821, 1869
Expand | EXPN | none | EXPN | 821, 1869
Help | HELP | none | HELP | 821, 1869
Turn | TURN | none | TURN | 821, 1869
Pipelining | PIPELINING | none | none | 1854
Message size declaration | SIZE | adds optional parameter size-value ::= 1*20DIGIT (number of octets) | none | 1870
Checkpoint/restart | CHECKPOINT | adds optional parameter TRANSID to MAIL FROM command | none | 1845
Large and binary MIME messages | CHUNKING | BDAT is used instead of DATA, and takes as parameters packet length and last-packet indication | BDAT | 1830
8bit-MIME transport | 8BITMIME | adds optional parameter BODY to MAIL FROM, values 7BIT and 8BITMIME | none | 1652
Delivery Status Notification Extension | DSN | adds optional parameters NOTIFY and ORCPT to the RCPT command and RET and ENVID to the MAIL command | none | 1891, 1892, 1894

Here are three different examples of ESMTP capability negotiations:

(1) Only mandatory SMTP commands provided:

S: <wait for connection on TCP port 25>
C: <open connection to server>
S: 220 dbc.mtview.ca.us SMTP service ready
C: EHLO ymir.claremont.edu
S: 250 dbc.mtview.ca.us says hello

(2) Basic optional services (EXPN, HELP), a standard service extension (8BITMIME) and unregistered services (XONE and XVRB) provided:

S: <wait for connection on TCP port 25>
C: <open connection to server>
S: 220 dbc.mtview.ca.us SMTP service ready
C: EHLO ymir.claremont.edu
S: 250-dbc.mtview.ca.us says hello
S: 250-EXPN
S: 250-HELP
S: 250-8BITMIME
S: 250-XONE
S: 250 XVRB

(3) ESMTP not supported:

S: <wait for connection on TCP port 25>
C: <open connection to server>
S: 220 dbc.mtview.ca.us SMTP service ready
C: EHLO ymir.claremont.edu
S: 500 Command not recognized: EHLO
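The client side of this negotiation amounts to reading the reply to EHLO and collecting the keywords. The sketch below is one way to do that; `parse_ehlo_response` is a hypothetical helper, and it assumes the reply lines are passed in without the "S: " prefix used in the examples.

```python
def parse_ehlo_response(lines):
    """Return the set of extension keywords offered by the server.

    Returns None if the server rejected EHLO, in which case the client
    should fall back to plain SMTP and send HELO instead.
    """
    if not lines or not lines[0].startswith("250"):
        return None
    keywords = set()
    # The first line is the greeting; each further line names one extension.
    # "250-" means more lines follow, "250 " marks the last line.
    for line in lines[1:]:
        words = line[4:].split()
        if words:
            keywords.add(words[0].upper())
    return keywords


example_1 = ["250 dbc.mtview.ca.us says hello"]
example_2 = [
    "250-dbc.mtview.ca.us says hello",
    "250-EXPN",
    "250-HELP",
    "250-8BITMIME",
    "250-XONE",
    "250 XVRB",
]
example_3 = ["500 Command not recognized: EHLO"]

for reply in (example_1, example_2, example_3):
    print(parse_ehlo_response(reply))
```

A client would then, for instance, add the BODY parameter to MAIL FROM only if "8BITMIME" is in the returned set.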

1.9.2 SMTP command pipelining

SMTP often requires many interactions back and forward between client and server. For

example, when a message is to be sent to many recipients, the client is expected to send a

RCPT TO for each recipient and wait for the response from the server before sending the next

RCPT TO. This will make the protocol slow, since interactions across large network distances

often cause a delay of one or more seconds.

In practice, many SMTP clients ignore this rule, and send everything without waiting for

reply codes, and then get all the reply codes asynchronously. This is not correct, but usually

works, since TCP will provide storage on the net of the data sent in advance. An ESMTP

extension specified in RFC 1854 allows a server to indicate that it is capable of such

pipelining.

1.9.3 Use of domain addresses for routing

[Figure: Sender, three name servers (for SE, SU.SE and DSV.SU.SE), the MTA at DSV.SU.SE, and the recipient. Legend: name server inquiries; transmission of the whole message.]

Name server look-up followed by direct transmission.

The figure above shows how a series of name servers that know host addresses in domains

successively closer to the recipient can be used to find the network address of the recipient

MTA host. The sender can then transmit directly to the recipient MTA. Of course, the name

servers for SU.SE and DSV.SU.SE might be combined into one name server. And the name

server for SE can cache names and addresses, so that on a second inquiry for the address of

DSV.SU.SE, the SE name server can answer directly without forwarding the query to SU.SE.

The technique of keeping names passing through a name server is called caching. Cached

addresses should only be kept for a limited time, since they may otherwise become out-of-

date.

The technique described in the figure above, where each successive name server connects

to another name server, is called chaining. Another also commonly used technique is called

referral. With referral, the searcher will successively connect to a series of name servers until

the right one is found.
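The difference between chaining and referral can be made concrete with a toy model. In the sketch below, the `ZONES` table, the server names and the address 192.0.2.25 are invented for illustration, and the functions simulate queries with dictionary look-ups; no real DNS traffic is involved.

```python
# Hypothetical toy data: each name server knows some addresses and,
# otherwise, which server is one step closer to the recipient.
ZONES = {
    "SE": {"answers": {}, "next": "SU.SE"},
    "SU.SE": {"answers": {}, "next": "DSV.SU.SE"},
    "DSV.SU.SE": {"answers": {"DSV.SU.SE": "192.0.2.25"}, "next": None},
}


def ask(server, name):
    """One query to one server: ('answer', address) or ('referral', next server)."""
    zone = ZONES[server]
    if name in zone["answers"]:
        return ("answer", zone["answers"][name])
    return ("referral", zone["next"])


def resolve_by_referral(name, start="SE"):
    """Referral: the searcher itself contacts each name server in turn."""
    contacted = []
    server = start
    while server is not None:
        contacted.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, contacted
        server = value
    raise LookupError(name)


def resolve_by_chaining(name, server="SE"):
    """Chaining: each server forwards the query; the searcher sees one server."""
    kind, value = ask(server, name)
    if kind == "answer":
        return value
    if value is None:
        raise LookupError(name)
    # This recursive call stands for the server querying the next server.
    return resolve_by_chaining(name, value)


print(resolve_by_referral("DSV.SU.SE"))
print(resolve_by_chaining("DSV.SU.SE"))
```

Both strategies return the same address; they differ in who does the work. With referral the searcher makes three connections, while with chaining it makes one and the servers make the rest.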

1.9.4 Routing and Use of Name Servers in Internet

The methods for routing mail messages using name servers in the Internet are described in RFC 974 [18], RFC 1101 [23], RFC 1123 [20], and RFC 1348 [24].

[Figure: Client, resolver, name server, MTA and recipient, with steps (1) to (5), a DNS query, and transmission of the whole message.]

Use of name servers for Internet mail routing.

E-mail handling in the Internet is usually done in the following stages. (The numbers refer to

numbers in the figure above.)

(1) The originator edits and submits his message for mailing.

(2) For each recipient, the name server facility is used to find the IP-address of the host

which is most suitable for delivering mail to the recipient.

(3) The mail is then forwarded to the hosts serving each recipient. The SMTP protocol is

used for this.

(4) The host described in (3) can be the host closest to the recipient or an intermediate host

which forwards the message, for example, a gateway to another network with a different

mail standard. If it is an intermediate host, this host will then use SMTP or some other

protocol (like X.400 P1) to forward the message to the next host in a chain leading to

the recipient. If the message is sent to a distribution list, the host handling the list will

expand the list and forward the message to the members of the list. Often, incoming

messages from the Internet to an Intranet are first routed to a firewall process, which

will check the messages for viruses, before they are re-routed internally.

(5) Finally, the recipient reads the mail. The mail system of the recipient can use the

machine-readable information in the message heading to aid the user in providing

facilities like:

(a) Finding the message to which the current message is a reply.

(b) Finding messages from a certain sender, or messages which arrived via a certain distribution list.

(c) Finding messages written between certain dates.

Name servers for routing in the internet are called DNS servers. A DNS server takes as

input an Internet domain address, such as EIES2.NJIT.EDU. There are many DNS name

servers, all responsible for only part of the DNS tree. This means that sometimes the first

name server contacted by a resolver may not have the information requested. The information

can then be found by using either chaining or referral. The Internet DNS allows both chaining

and referral. Every DNS server must support referral. Support for chaining is optional.

1.9.5 Delivery Status Notifications

A good electronic mail system should always (unless the sender explicitly relinquishes this

requirement) inform the sender if a message cannot be delivered. But sometimes a fault, such

as a disk crash, can cause messages to disappear without any notification to the sender (such

situations are often called black holes). However, if the sender has requested a delivery

notification, the fact that it has not arrived within a reasonable time can indicate to the sender

that the message has been lost, even when no nondelivery notification appears.

It is, of course, advantageous to the sender if his mail software automatically recognizes

incoming delivery and nondelivery notifications and handles them in suitable ways. The

sender may prefer not to be informed every time such a notification arrives, but to store them

so that later the sender can ask the system to provide a report about which recipients have

received their messages. Notifications usually indicate that a message has reached the mailbox

of the recipient, but some systems also provide notifications when a message has been read by

the recipient, or, rather, when it has been shown on the screen of the recipient.

Nondelivery notifications are often difficult to understand if you are not an expert on

electronic mail protocols. Getting a nondelivery notification from some recipient to whom

you never sent any message can be especially confusing. What happens is that you send a

message to a mailing list, and there is a delivery problem in delivering the message to one of

the members of this list. For large mailing lists, such delivery reports should, of course, be sent

to the manager of the list, not to the originator of the message. However, it is not uncommon

for mailers to mistakenly send the nondelivery notifications to the originator.

[Figure: Originator, distribution list, three recipients and the list manager, with a nondelivery notification.]

Sending error reports to mailing list manager or to the originator

The Internet mail delivery notification functionality is called Delivery Status Notifications

(DSNs), and it is specified in four RFCs:

RFC 1891: SMTP service extension, 31 pages

RFC 1892: Multipart/report content-type, 4 pages

RFC 1893: Enhanced status codes, 15 pages

RFC 1894: Delivery Status Notification format, 31 pages

Requests for delivery status notifications are sent via SMTP, using optional parameters to

the RCPT TO and MAIL FROM commands:

SMTP command | Optional parameter | Description
RCPT TO | NOTIFY | Values: NEVER = do not send any DSNs. SUCCESS = send a DSN if the message was successfully delivered to the recipient mailbox. FAILURE = send a DSN if the message could not be delivered to the recipient mailbox. DELAY = indicate willingness to accept notifications if delivery is delayed but may succeed later on.
RCPT TO | ORCPT | Original sender-specified recipient address (needed to allow the recipient of notifications to correlate notifications with original recipients, since some gateways will rewrite the recipient e-mail addresses before delivery).
MAIL FROM | RET | Values: FULL = return full text of the message with the DSN. HDRS = return only headers of the message with the DSN. If neither FULL nor HDRS is indicated, this means that no return of either headers or body is requested.
MAIL FROM | ENVID | Transaction ID to be returned with notification, so that the sender can find which message sending caused the failure. (Message-ID cannot be used for this, since sometimes the same message is sent more than once with the same Message-ID.)
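To show where these parameters end up on the wire, the following sketch builds the two SMTP command lines as strings. The helper names `mail_from` and `rcpt_to`, the addresses and the envelope ID are hypothetical, and the sketch is simplified: it always uses the address type rfc822 for ORCPT and does not apply the special character encoding that the DSN specification defines for parameter values.

```python
NOTIFY_VALUES = ("NEVER", "SUCCESS", "FAILURE", "DELAY")


def mail_from(sender, ret=None, envid=None):
    """Build a MAIL FROM command with the optional RET and ENVID parameters."""
    command = f"MAIL FROM:<{sender}>"
    if ret is not None:
        if ret not in ("FULL", "HDRS"):
            raise ValueError("RET must be FULL or HDRS")
        command += f" RET={ret}"
    if envid is not None:
        command += f" ENVID={envid}"
    return command


def rcpt_to(recipient, notify=(), orcpt=None):
    """Build a RCPT TO command with the optional NOTIFY and ORCPT parameters."""
    command = f"RCPT TO:<{recipient}>"
    if notify:
        if any(value not in NOTIFY_VALUES for value in notify):
            raise ValueError("unknown NOTIFY value")
        if "NEVER" in notify and len(notify) > 1:
            raise ValueError("NEVER cannot be combined with other values")
        command += " NOTIFY=" + ",".join(notify)
    if orcpt is not None:
        # Simplification: plain address, always of address type rfc822.
        command += f" ORCPT=rfc822;{orcpt}"
    return command


print(mail_from("alice@example.org", ret="HDRS", envid="QQ314159"))
print(rcpt_to("bob@example.com", notify=("SUCCESS", "FAILURE"),
              orcpt="bob@example.com"))
```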

While the requests for DSNs are sent via SMTP, the DSNs themselves are specially

formatted MIME messages. A DSN is sent as a MIME message with the Content-Type

Multipart/Report. A Multipart/Report consists of two mandatory and one optional part:

Part 1 (mandatory) is a human readable message explaining in natural language the cause

of the error. This part is mainly intended for recipients whose e-mail clients do not recognize

the special format of Part 2 of a DSN.

Part 2 (mandatory) contains a machine parsable account of the reported event, with a

special MIME type called Message/Delivery-status.

Part 3 (optional) contains the original message either full or only its headers.

Here is an example of a complete delivery status report:

Date: Thu, 7 Jul 1994 17:16:05 -0400
From: Mail Delivery Subsystem <[email protected]>
Message-Id: <[email protected]>
Subject: Returned mail: Cannot send message for 5 days
To: <[email protected]>
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status;
      boundary="RAA14128.773615765/CS.UTK.EDU"

[Part 1, in “human-readable” (?) format:]

--RAA14128.773615765/CS.UTK.EDU

The original message was received at Sat, 2 Jul 1994 17:10:28 -0400
from root@localhost

   ----- The following addresses had delivery problems -----
<[email protected]> (unrecoverable error)

   ----- Transcript of session follows -----
<[email protected]>... Deferred: Connection timed out
      with larry.slip.umd.edu.
Message could not be delivered for 5 days
Message will be deleted from queue

[Part 2, in machine-parsable format:]

--RAA14128.773615765/CS.UTK.EDU
content-type: message/delivery-status

Reporting-MTA: dns; cs.utk.edu

Original-Recipient: rfc822;[email protected]
Final-Recipient: rfc822;[email protected]
Action: failed
Status: 4.0.0
Diagnostic-Code: smtp; 426 connection timed out
Last-Attempt-Date: Thu, 7 Jul 1994 17:15:49 -0400

[Part 3, return of original message:]

--RAA14128.773615765/CS.UTK.EDU
content-type: message/rfc822

[original message goes here]

--RAA14128.773615765/CS.UTK.EDU--

Here are some of the fields which can occur in a Delivery Status Report:

per-message-fields =                       Fields which apply to the whole message.
    [ original-envelope-id-field CRLF ]    Envelope identifier from request.
    reporting-mta-field CRLF               MTA which attempted to perform the delivery or relay.
    [ dsn-gateway-field CRLF ]             Name of gateway which transformed foreign delivery report.
    [ received-from-mta-field CRLF ]       MTA from which the message was received.
    [ arrival-date-field CRLF ]            Arrival date to reporting MTA.
    *( extension-field CRLF )

per-recipient-fields =                     Fields which apply to only one recipient at a time.
    [ original-recipient-field CRLF ]      Original recipient when sent.
    final-recipient-field CRLF             Final recipient to whom delivery status is reported.
    action-field CRLF                      failed, delayed, delivered, relayed (to non-DSN environment), expanded.
    status-field CRLF                      Status code (RFC 1893) (DIGIT "." 1*DIGIT "." 1*3DIGIT)
    [ remote-mta-field CRLF ]              Name of MTA which reported to reporting MTA.
    [ diagnostic-code-field CRLF ]         Sometimes less precise diagnostic code from remote MTA.
    [ last-attempt-date-field CRLF ]       Time of last delivery attempt.
    [ final-log-id-field CRLF ]            Log entry in final MTA logs.
    [ will-retry-until-field CRLF ]        Time when delivery attempts will stop.
    *( extension-field CRLF )
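Because Part 2 is machine-parsable, a mail client can extract the per-recipient fields automatically. The sketch below does this with the `email` package of the Python standard library, which represents each block of fields in a message/delivery-status part as a nested message object. The sample report is a shortened variant of the example above with invented addresses, and `per_recipient_reports` is a hypothetical helper.

```python
import email
from email import policy

RAW_DSN = """\
From: Mail Delivery Subsystem <mailer-daemon@mta.example.org>
To: <sender@example.org>
Subject: Returned mail: Cannot send message for 5 days
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status; boundary="BOUNDARY"

--BOUNDARY

The message could not be delivered for 5 days.

--BOUNDARY
Content-Type: message/delivery-status

Reporting-MTA: dns; mta.example.org

Final-Recipient: rfc822;user@example.net
Action: failed
Status: 4.0.0
Diagnostic-Code: smtp; 426 connection timed out

--BOUNDARY--
"""


def per_recipient_reports(raw):
    """Return action and status for every recipient named in a DSN."""
    msg = email.message_from_string(raw, policy=policy.default)
    if msg.get_content_type() != "multipart/report":
        return []
    reports = []
    for part in msg.iter_parts():
        if part.get_content_type() != "message/delivery-status":
            continue
        # Each block of fields is parsed into its own header-only message.
        for block in part.get_payload():
            if block["Final-Recipient"] is None:
                continue  # this is the per-message block
            reports.append({
                "recipient": str(block["Final-Recipient"]).strip(),
                "action": str(block["Action"]).strip(),
                "status": str(block["Status"]).strip(),
            })
    return reports


print(per_recipient_reports(RAW_DSN))
```

A client could use the result to mark the original message as failed for that recipient instead of showing the raw report to the user.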

1.10 Message delivery

When a user runs an e-mail client package on his personal computer, this client needs a

protocol to talk to a server, corresponding to the P3 and P7 protocols in X.400. The model

behind these protocols, like in X.400, is that mail messages are stored in a server, to be

downloaded to the e-mail client at request of the user, as is shown by this picture:

[Figure: User Agent --SMTP--> MTA --SMTP--> MTA --SMTP--> Mailbox --POP or IMAP--> User Agent]

Model behind the POP and IMAP protocols.

In the Internet, the two most important such protocols are:

• Post Office Protocol (POP) [9, 12], a protocol for fast downloading of mail to client

software, where the client stores and handles the mail in the personal computer,

corresponding to P3 in X.400.

• Internet Message Access Protocol (IMAP, originally Interactive Mail Access Protocol) [8], a protocol for cases where the user wants to

store his messages in the server, and wants to be able to manipulate this storage from

client software on his personal computer. IMAP is a more complex protocol than POP.

Commands in POP and IMAP are textual strings, just like commands in SMTP. Here is a list

of the most important commands in POP:

USER Client identifies mailbox to be downloaded

PASS Password

STAT Get number of messages and size of mailbox

LIST N Return size of message N

LAST Get highest message number accessed

RETR N Retrieve a full message

TOP N M Retrieve only the headers and the first M lines of message N

DELE N Delete message

QUIT Release service

NOOP See if POP server is functioning

RPOP Insecure authentication
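
As a rough illustration of how textual the commands above are, the following Python sketch formats command lines and parses the reply to STAT. It is not a POP client: no network connection is made, the helper names `pop_command` and `parse_stat` are invented for this example, and the mailbox name and password are made up. The only protocol facts relied upon are that commands end with CRLF and that a successful reply begins with `+OK`.

```python
def pop_command(verb, *args):
    """Format one POP command line, terminated by CRLF."""
    return " ".join([verb.upper(), *map(str, args)]) + "\r\n"


def parse_stat(reply):
    """Parse a STAT reply such as '+OK 2 320' into (message_count, mailbox_size)."""
    status, *fields = reply.split()
    if status != "+OK":
        raise ValueError(f"server reported an error: {reply!r}")
    return int(fields[0]), int(fields[1])


# A typical download session: log in, inspect the mailbox,
# then fetch and delete message 1 (hypothetical user and password).
session = [
    pop_command("USER", "alice"),
    pop_command("PASS", "secret"),
    pop_command("STAT"),
    pop_command("RETR", 1),
    pop_command("DELE", 1),
    pop_command("QUIT"),
]
```

The list `session` simply shows the order in which a downloading client would typically issue the commands.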

IMAP is a more sophisticated protocol than POP. In IMAP, a server can send messages to the

client without a request from the client, and several transactions between server and client can

go on in parallel. This can be used to reduce the wait time for the users. Each message in an

IMAP mailbox has a set of properties, which can be retrieved one or more than one at a time.

Examples of properties are a seen flag and a deleted flag on the message. In IMAP, to delete a

message you first set the delete flag on the message, and then perform the expunge command.

IMAP also has a capability for searching for messages in the mailbox stored in the server.
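
The two-step deletion described above (set a flag, then expunge) can be modelled in a few lines of Python. This is only a toy in-memory model under stated assumptions: the class `Mailbox` and its methods are invented for illustration, flags are plain strings, and the renumbering of messages that a real IMAP server performs after an expunge is left out.

```python
class Mailbox:
    """Toy model of an IMAP mailbox with per-message flags."""

    def __init__(self):
        self.messages = {}  # message number -> {"text": str, "flags": set}

    def add(self, number, text):
        self.messages[number] = {"text": text, "flags": set()}

    def set_flag(self, number, flag):
        # Setting the "deleted" flag does not remove anything yet.
        self.messages[number]["flags"].add(flag)

    def expunge(self):
        """Physically remove every message flagged as deleted."""
        doomed = [n for n, m in self.messages.items() if "deleted" in m["flags"]]
        for n in doomed:
            del self.messages[n]
        return doomed
```

Until `expunge` is called, a message flagged as deleted is still present and the flag could in principle be cleared again.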

1.11 Mailing lists

1.11.1 Expansion of nested mailing lists

Nested mailing lists occur when one list is a member of another list. If, for example, list B is a

member of list A, then a message sent to list A will be distributed to all members of both list

A and B. A message sent directly to list B, however, will not reach the members of list A,

unless A also is a member of B.

There are two techniques for handling mailing lists:

• Sender UA expansion. The mailbox software (user agent software) of the sender finds

the list of the members of the list and sends the message directly to them. If the lists are nested, the

sender UA will successively and recursively find the lists of members of the sublists, all the

way to the final recipient. With this method, the message will thus be converted to an ordinary

multirecipient message. The message will also have ordinary users, and not lists, as recipients,

before the message leaves the sender UA.

• Expansion at the list location. The sender’s UA sends the message to the list-expansion

agent or to the domain which is responsible for the list. The list is then expanded at this

location. Expansion means the replacement, on the envelope (not in the message heading) of

the name of the list with the names of the members of the list. With nested lists, the message

is then forwarded to the domains of the sublists, which perform the secondary expansion, and

so on if there are sublists to the sublists.

Consider a person who is a member of two lists. With the second method, this person will

probably receive two copies of the same message. With the first method, such duplicates can

be eliminated during the expansion. In spite of this, expansion at the list location (the second

method) is most common. With two nested lists, one for the European and one for the

American members, the message need only be sent to one single recipient when crossing the

Atlantic. If the whole list is expanded by the sending UA, at least 100 recipients have to be

listed when crossing the Atlantic, which will make the transfer more expensive. See the figure

below.

Advantage of using nested mailing lists (bottom) vs. nonnested (top).
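
Sender UA expansion with duplicate elimination, as described in the text above, can be sketched as a small recursive function. The function name `expand`, the dictionary layout and the member names are assumptions made for this example only.

```python
def expand(recipients, lists, seen_lists=None):
    """Recursively replace list names by their members, dropping duplicates.

    `lists` maps a list name to its members; a member may itself be a list.
    """
    if seen_lists is None:
        seen_lists = set()
    users = []
    for name in recipients:
        if name in lists:
            if name in seen_lists:
                continue  # already expanded; also guards against mutually nested lists
            seen_lists.add(name)
            members = expand(lists[name], lists, seen_lists)
        else:
            members = [name]
        for user in members:
            if user not in users:  # a person on two lists is kept only once
                users.append(user)
    return users


# List B is a member of list A; "anna" is on both lists.
lists = {"A": ["anna", "carl", "B"], "B": ["bert", "anna"]}
```

Here `expand(["A"], lists)` yields anna, carl and bert once each, while `expand(["B"], lists)` does not reach carl, which matches the asymmetry of nested lists described in section 1.11.1.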

Another advantage of expansion at the list location is that it is easier to support distributed control of

the membership of a group—each sublist can have its own management. This can, of course,

be a disadvantage when central control of the membership is required.

1.11.2 Loop control for nested mailing lists

If two lists are directly or indirectly members of each other, there is a risk that the same

message will be looped back and forward indefinitely between the lists. There are several

different techniques for avoiding this:

(1) Full expansion by the originating UA.

(2a) A trace list of all the mailing lists passed is put on the envelope of the message. A

mailing list can then refuse to accept incoming messages that have the name of the

mailing list itself as part of the trace list on the envelope of the message.

(2b) Using a variant of method (2a), each mailing list will instead refuse to send a message

to another mailing list which is included in the trace list of the message.

(3) The registration system for mailing lists is designed in such a way that no list will ever

be a member, directly or indirectly, of itself.

(4a) Each mailing list stores the message-IDs of all messages passing through the list. When

the same message returns once more to the list, the list checks the message-ID of the

message and stops the loop if a message with the same ID has already passed through

the list.

(4b) This is a variant of method (4a), where a checksum of the content of the message is used instead of the message-ID.
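
Methods (2a), (4a) and (4b) above can be combined in one list expander, as the following Python sketch suggests. It is a simplified model, not the algorithm of any particular list software: the class `ListExpander`, the dictionary-shaped message and the use of SHA-256 as the checksum are all assumptions made for illustration.

```python
import hashlib


class ListExpander:
    """Toy list expander that refuses messages which appear to be looping."""

    def __init__(self, name):
        self.name = name
        self.seen_ids = set()        # method (4a): Message-IDs already passed
        self.seen_checksums = set()  # method (4b): content checksums already passed

    def accept(self, message):
        """Return True if the message should be distributed, False if it is refused."""
        # (2a): refuse a message whose envelope trace list already names this list.
        if self.name in message["trace"]:
            return False
        # (4a): refuse a Message-ID that has passed through before.
        if message["message_id"] in self.seen_ids:
            return False
        # (4b): refuse content whose checksum has passed through before.
        checksum = hashlib.sha256(message["body"].encode("utf-8")).hexdigest()
        if checksum in self.seen_checksums:
            return False
        self.seen_ids.add(message["message_id"])
        self.seen_checksums.add(checksum)
        message["trace"].append(self.name)  # record this list on the envelope
        return True
```

The checksum test also refuses a second, genuinely different message that happens to have identical content, which is a side effect of method (4b) that this sketch does not try to avoid.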

An advantage of method (2b) is that it saves some unnecessary transmission. However,

method (2b) is not as reliable as method (2a), because the same electronic mail address can

have different forms, and therefore the comparison used in method (2b) may not work. It is

easier for a list expander to recognize a name which it itself puts on an envelope than a name

which some other list expander puts on an envelope.

Comparing methods (4a) and (4b), method (4b), use of a checksum, has the advantage that

it caters for systems where the message-ID is corrupted (something which is not unusual), but

has as a disadvantage that two different messages may accidentally get the same checksum. A

further problem with the message-ID is that two messages with the same message-ID may not

always be identical. Finally, some mail systems do not generate globally unique message-IDs

on the messages they produce.

Method (4a) is often used by computer conferencing systems, sometimes combined with

the other methods. For example, Usenet News mainly uses method (4a).

Programs that send messages through gateways from Internet e-mail to Usenet News will

assign Message-IDs to messages lacking such IDs. This can cause the same message to get

different Message-IDs by different gateways.

The Listserv software has very powerful methods for avoiding loops, primarily based on

method (4b).

In practice, many existing mailing lists have no loop control mechanism except that the

people who manage the list manually try to avoid producing loops.

Usenet News uses Message-IDs as its main loop control mechanism. This is, in Usenet

News, implemented in a way which will sometimes cause multi-group messages to appear

in only some of the groups to which they were sent. The figure below explains how this can

happen:

[Figure: two gateways from e-mail to Usenet News. One gateway feeds Usenet News server A, the other feeds Usenet News server B; each server carries both Newsgroup 1 and Newsgroup 2.]

The original message is sent by e-mail to two e-mail mailing lists, which are both gatewayed

from e-mail to Usenet News through two gateways. One of the gateways enters one of the

mailing lists to Newsgroup 1 through Usenet News server A. The other gateway enters the

other mailing list to Newsgroup 2 through Usenet News server B.

When the Newsgroup 1 version of this message is to be moved from news server A to

news server B, the loop control in news server B will refuse to accept the message, since it

already has the message in Newsgroup 2. This means that subscribers to Newsgroup 1, but

not Newsgroup 2, in Usenet News server B, will incorrectly never see this message. In the

same way, subscribers to only Newsgroup 2 in News server A will never see the message.
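
The scenario just described can be replayed with a very small simulation. The class `NewsServer`, the newsgroup names and the Message-ID are invented for this sketch; the only behaviour modelled is that a server refuses any article whose Message-ID it has already seen, regardless of newsgroup.

```python
class NewsServer:
    """Toy news server whose loop control is a set of seen Message-IDs."""

    def __init__(self, name):
        self.name = name
        self.seen_ids = set()
        self.groups = {}  # newsgroup name -> list of Message-IDs stored there

    def offer(self, message_id, newsgroup):
        """Store the article unless its Message-ID has been seen before."""
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        self.groups.setdefault(newsgroup, []).append(message_id)
        return True


server_a = NewsServer("A")
server_b = NewsServer("B")
mid = "<42@example.org>"  # both gateways keep the Message-ID of the original e-mail

server_a.offer(mid, "newsgroup-1")  # first gateway enters the message at server A
server_b.offer(mid, "newsgroup-2")  # second gateway enters the message at server B

# The servers now try to exchange their versions of the article.
a_to_b = server_b.offer(mid, "newsgroup-1")
b_to_a = server_a.offer(mid, "newsgroup-2")
```

Both exchanges are refused, so server B never gets the article in Newsgroup 1 and server A never gets it in Newsgroup 2.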

I have suggested that the Usenet News standards should be modified to remove this

problem. But the experts in IETF say that this is not possible; we will have to live with this

problem. This is not the first time that the technical experts in IETF have refused user-

friendly changes to standards.

1.11.3 Management of large mailing lists

Management of large mailing lists is not easy. The manager will every day receive

nondelivery notifications for messages from the list which cannot be delivered, and requests

for addition and removal of members from the list. It is sometimes difficult to identify the

item in the list of members which such a notification or request refers to. Sometimes, they are

actually members of a sub-mailing list. The problems of managing large mailing lists are more

fully discussed in [3].

1.11.4 How You Become a Member of a Mailing list

1.11.4.1 The LISTSERV Way

Suppose you want to become a member of a mailing list, whose e-mail address is

[email protected]. You then write an e-mail message addressed to [email protected]

and write in this message a single text line with the text:

SUB listname Your Own Name

For information about other commands you can give, such as how to unsubscribe from a

list, download archived messages from the list, etc., write a message to the same recipient,

[email protected], with the single word HELP in the text of your message. For more

information about Listserv.

1.11.4.2 The -Request Way

Suppose you want to become a member of a mailing list, whose e-mail address is

[email protected]. You then write a message to [email protected] and write

in the text of the message something like “Please add me as a member of this mailing list.”

The e-mail address with the name of a mailing list with “-request” added to it can also be used

for other communication with the list manager.

1.11.4.3 Subscription through web pages

A third method is to subscribe to mailing lists through web pages. However, there is usually

no very secure identification of a person who accesses a web page. Because of this, a

convention has developed that if a person subscribes to a mailing list through a web page,

then the list server will first send an e-mail to the user, asking him/her to confirm the

subscription. The subscription is not entered until this confirmation is received.
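
One possible way to implement such a confirmation step is sketched below in Python. The text does not say how list servers do this; the random token, the class `ListServer` and its method names are assumptions of this example, and the actual sending of the confirmation e-mail is left out.

```python
import secrets


class ListServer:
    """Toy list server with confirmed subscription."""

    def __init__(self):
        self.pending = {}    # token -> address awaiting confirmation
        self.members = set()

    def request_subscription(self, address):
        """Called when a web form is submitted; nothing is subscribed yet."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = address
        return token  # in a real system this token would be e-mailed to `address`

    def confirm(self, token):
        """Called when the user replies with the token received by e-mail."""
        address = self.pending.pop(token, None)
        if address is None:
            return False  # unknown or already used token
        self.members.add(address)
        return True
```

Because only the owner of the mailbox sees the token, a third party who types someone else's address into the web page cannot complete the subscription.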

1.12 Usenet News

Usenet News is a distributed asynchronous forum system. Forums in Usenet News are called

newsgroups, and messages are called articles.

1.12.1 Distribution of news

The basic principle of Usenet News is that a local server handles most of the functionality.

Usenet News standardizes two variants of the NNTP protocol: one for communication

between adjacent servers, one for communication between a client and a server. Each server

can download as much as it wants of what is available on any of the adjacent servers. Loop

control is handled both by a trace list and a list of the Message-IDs of received messages

stored by each server, so that the server can reject the same message coming back again. The

procedure for distribution of news can be compared to pouring water onto a flat surface; the

water flows out in all directions as shown in the figure below.

[Figure: five interconnected servers, with articles spreading outward from server to server.]

“Pouring water” principle of Usenet News distribution.

The figure below shows how new articles are forwarded from server to server in Usenet

News. A server tells its adjacent servers which items it offers; each adjacent server then requests those it has

not already got via another route.

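[Figure: one server announces "I have 17, 18, 19 new" and the adjacent server answers "Send me 17, 18, 19". Another server then announces "I have 18, 19, 20 new" and the answer is "Send me 20".]

This figure shows how new articles are forwarded from server to server in Usenet News.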
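
The exchange in the figure amounts to a set difference: a server requests only the offered articles it does not already hold. A minimal Python sketch, with the function name `articles_to_request` chosen for this example and the article numbers taken from the figure:

```python
def articles_to_request(offered, already_held):
    """Return the offered articles that the receiving server still lacks."""
    return [article for article in offered if article not in already_held]


held = set()

# First neighbour offers 17, 18 and 19; none are held yet.
first_request = articles_to_request([17, 18, 19], held)
held.update(first_request)

# Second neighbour offers 18, 19 and 20; only 20 is new.
second_request = articles_to_request([18, 19, 20], held)
held.update(second_request)
```

In real Usenet News the comparison is made on Message-IDs rather than on small integers; the integers are used here only because the figure uses them.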

Information about a user, such as how much this user has seen, is stored in the client. The

server need not even know which users are using it. There are many different user-interface

programs for Usenet News. Some of them, of course, do not provide all the available

functions.

1.12.2 Cancel and Supersedes

In addition, Usenet News provides an interesting functionality which restricts communication

to only those members of a newsgroup who work in the same organization or live in the same

area or country. This functionality, however, is not used very much, and its existence is

controversial, since it means that different users will get different views of the same

newsgroup.

Usenet news has a cancel command, which can delete messages already sent out. Only the

author of the cancelled message and the local news server administrator are allowed to cancel a

message. Since, however, it is very easy to fake your identity, this command poses an obvious

security risk, and the command is known to have been used to cancel messages for political

reasons. The command is also used (not quite appropriately) by cancelbots, robots (= automatic

programs) which cancel obvious spams by identifying messages with the same content sent to

many disparate newsgroups. Usenet news also often has a Supersedes header field, which

refers from a new message to an old message. This header usually cancels the old message.

There is also a supersedes header, which is used when a new message is sent to replace the

previous version of this article. Supersedes is similar to cancel in that it causes a real deletion

of the message being replaced. This is different from, for example, the obsoletes header of

X.400, which only marks a new message as a replacement, but is not meant to cause the

previous version to be physically deleted. Obsoletes thus is similar to the In-Reply-To and

References headers in e-mail.

The most important restriction of Usenet News is that closed groups are not well

supported. The only kind of closed groups which are common in Usenet News are groups

which are restricted to one or a selected set of news servers. Such groups will then be open to

anyone with an account in these news servers, but closed to everyone else.

1.12.3 The Network News Transfer Protocol (NNTP)

The Usenet News protocol is called network news transfer protocol (NNTP) and is specified

in RFC 977 [46]. The standard for the format of Usenet News articles is specified in RFC

1036 [19].

The table below lists the most common NNTP commands:

article [<Message-ID> | <Number>]
    Return text of designated article. If no parameter is given, the next article is returned. The current article pointer is put at the fetched article.

body [<Message-ID> | <Number>]
    As article, but only returns body.

group <newsgroup>
    Go to the designated newsgroup.

head [<Message-ID> | <Number>]
    As article, but only returns head.

help
    Lists available commands.

ihave <messageID>
    Informs the server of an available article. The server can then ask for the article or refuse it.

last
    Sets current article pointer to the previous article in the current newsgroup; returns its number and Message-ID.

list [active | newsgroups | distributions | schema]
    Returns a list of valid newsgroups in the format: group last first

newgroups <yymmdd hhmmss> ["GMT"] [<distributions>]
    List newsgroups created since a certain date and time. "distributions" can be e.g. alt to only get newsgroups in the alt category.

newnews <newsgroups> <yymmdd hhmmss> ["GMT"] [<distributions>]
    List Message-IDs of articles posted to one or more newsgroups after a specific time. newsgroups can be e.g. net.*.unix to match more than one newsgroup. distributions checks for articles which also have this other newsgroup as recipient.

next
    Current article pointer is advanced. Returns number and Message-ID of current article.

post
    Submit a new article from a client.

slave
    Tells the server that this is not a user client; it is a slave server. (May give priority treatment.)

stat [<Message-ID> | <Number>]
    As article, but only returns Message-ID. Used to set the current article pointer.

Note that the same protocol, NNTP, is used for communication both between a client and a

server, and between two servers, but that sometimes different commands are used. Thus,

when a user client submits a new message, the post command is used, but when a server

sends a new message to another server, it usually uses ihave to give this information, and

the receiving server then requests the message, if it has not already received it from another

server.
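
Since NNTP replies are plain text, a client can process them with ordinary string handling. The sketch below parses one line of a list reply in the "group last first" format quoted in the command table. The function name `parse_list_line` and the sample line are made up for illustration, and any further fields that a server may append to the line are simply ignored.

```python
def parse_list_line(line):
    """Split one 'group last first' line of a list reply into its parts."""
    group, last, first = line.split()[:3]
    return {"group": group, "last": int(last), "first": int(first)}


# Hypothetical reply line: newsgroup name, last and first article number.
entry = parse_list_line("swnet.risk 00456 00401")
```

A client could use the difference between `last` and `first` as a rough upper bound on how many articles the group currently holds.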

Most of the message headers are the same in Usenet News and in e-mail, but there are

some differences as shown by this table:

Newsgroups: Comma-separated list of newsgroups to which this article belongs. Example of newsgroup format: alt.sex.fetisches.feet. Should never occur in e-mail. Use “Posted-To:” instead!

Subject: Add the four characters “Re: ” for replies. Do not change the subject in replies.

Message-ID: Mandatory in Network News, and must be globally unique.

Path: Path to reach the current system, e.g. abc.foo.net!xyz!localhost. E-mail path format also permitted. Compare to Received: and Return-Path in e-mail.

Reply-To: In news: Where replies to the author should be sent. In e-mail: Ambiguous.

Followup-To: Where replies to newsgroup(s) should be sent.

Expires: Suggested expiration date.

References: Message-IDs of previous articles in the same thread. Should always contain first and last article in thread. Compare to e-mail: Usually only immediately preceding messages.

Control: Not used in e-mail. Communication with servers. Body or subject contains command. Subject begins with "cmsg". The commands are:

    cancel Delete physically a previously sent article.

    ihave Host telling another host of available new articles.

    sendme Host asking for articles from another host.

    newgroup Name of new group, plus optional word moderated.

    rmgroup Remove a newsgroup. Requires approved.

    sendsys Send the sys file, listing neighbours and newsgroups to be sent to each neighbour.

    version Version of software wanted in reply.

    checkgroups List of newsgroups and descriptions, used to check if list is correct.

Distribution: Not used in e-mail. Limits distribution to certain geographical/organizational area. Example: Distribution: se, no.

Organization: Of sender.

Keywords: For filtering.

Summary: Brief summary.

Approved: Required for message to moderated group. Added by the moderator, contains his e-mail address. Also required for certain control messages.

Lines: Count of lines of the message body.

Xref: Numbers of this message in other newsgroups. Only for local usage in one server. Example: Xref: swnet.risk:456 swnet.sunet:897

There is a problem with the newsgroups header in a message sent via both Usenet News

and e-mail. Different systems use this header in two different ways, in the mail version of

such messages:

(a) To indicate that this message has also been sent via Usenet news to the indicated newsgroups.

(b) To indicate that this is a personal reply, sent only via e-mail, to a message posted on the indicated newsgroups.

Because of this problem, it is better to use the Posted-To header in e-mail to indicate that a

message has also been sent to certain newsgroups, and e-mail recipients should ignore any

newsgroups heading in an e-mail message.

MIME is not used as much in Usenet News as in e-mail. Instead of BASE64, an older

method named UUENCODING is often used in Usenet News to include binary attachments.

Also, because of message size restrictions, large attachments are very often split into several

messages in Usenet News. This also occurs in e-mail, but is more frequent in Usenet News,

since some Usenet News servers try to save space by not accepting articles above a certain

size limit. Both MIME and Usenet News have methods of indicating how a client can

automatically combine parts into a complete message or attachments.
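
To make the difference between the two encodings concrete, the following Python sketch encodes the same three bytes with both methods, using the standard-library modules `binascii` and `base64`. The sample data is arbitrary, and only a single uuencoded line is produced; the `begin` and `end` lines that surround a complete uuencoded attachment are left out.

```python
import base64
import binascii

data = b"Cat"

# One uuencoded line: a length character followed by the encoded bytes.
# b2a_uu accepts at most 45 input bytes per line.
uu_line = binascii.b2a_uu(data)

# The same bytes in BASE64, the encoding used by MIME.
b64_text = base64.b64encode(data)

# Both encodings are reversible.
restored = binascii.a2b_uu(uu_line)
```

Both methods turn three input bytes into four printable characters, so the size overhead is similar; they differ in the alphabet used and in how the encoded data is framed.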

Protocols other than NNTP are also sometimes used for communication between

Usenet news servers.

1.12.4 News Control in Usenet News

An important functionality in all asynchronous (not same time) message systems is the news

control functionality. This functionality will help the user find which messages are new and

not yet read by this user. Most message systems handle this functionality together with the

storage of messages. This means that if messages are stored in a server, the server also knows

for each user, what that user has read and not read. And if messages are stored in the user's

personal computer, then news information is also stored there.

Usenet News, however, does this in a different way. Messages are usually stored in the

server, but the server does not know which messages each individual user has read and not

read. The information on what a user has read is stored in a very compact format in the user

agent software. The most common format for this file is the one shown by this example:

254-290, 300-312, 350-

The example above says that this user has unseen articles number 254-290, 300-312 and all

articles numbered higher than 350. This format is very compact, because there are usually

long sequences of articles which are read and unread. Storing a single bit for each message is

then not optimal.
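
A client can read this range notation with a few lines of code. The Python sketch below parses the example and answers whether a given article number falls inside one of the ranges. The function names are invented for this example, an open-ended range such as `350-` is represented with `None` as its upper bound, and both end points of a range are assumed to be included.

```python
def parse_ranges(text):
    """Parse '254-290, 300-312, 350-' into a list of (low, high) pairs.

    `high` is None for an open-ended range such as '350-'.
    """
    ranges = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        low, dash, high = part.partition("-")
        if not dash:
            ranges.append((int(low), int(low)))  # a single article number
        else:
            ranges.append((int(low), int(high) if high else None))
    return ranges


def contains(ranges, number):
    """Return True if the article number lies inside any of the ranges."""
    return any(
        low <= number and (high is None or number <= high)
        for low, high in ranges
    )


unseen = parse_ranges("254-290, 300-312, 350-")
```

Three pairs of numbers here describe an unbounded set of articles, which is why the format is more compact than one bit per article.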

1.13 Relative addresses

Domain addresses are absolute addresses. An absolute address is the same address for a

certain recipient, irrespective of where the message is sent from. Another type of address is

called a relative address. A relative address indicates one or more relay stations on the route

to the recipients. This would be roughly equivalent to indicating the postal mail address of a

person as: “First to London, then from London by train to Nottingham, then by truck to the

University, then by mailman from the University to the Computer Science department.”

There are three commonly used formats for relative electronic mail addresses. All three

addresses in the figure below indicate the same relative route, but in different printed formats.

This format is sometimes used by gateways to the Internet, although it is not officially part of the Internet mail standards.

This format is sometimes used in the Internet mail standards.

This format has been used much, and is still sometimes used, in Usenet News.

Combinations of several of these formats in one address may occur sometimes. For example,

an address of the format:

MCVAX!FK.ABC.SE!Per_Persson@WUI

may according to some practices be interpreted as:

but may according to Usenet News practice be interpreted as:

The format using percent (%) signs is not part of the official Internet mail standards. This

format often occurs when a message passes a gateway between two nets. The

figure below shows how gateways can create relative addresses.

[Figure: Sender MTA (SUNIC.SE), Gateway MTA (SEARN.SUNET.SE) and Recipient MTA (CUNYVM.BITNET). The sender address is shown as [email protected] before the gateway and as John%[email protected] after it.]

A message from a sender [email protected] passes a gateway from Internet to Bitnet. The

address of the sender is given as [email protected] before the gateway, but the gateway

translates this into John%[email protected] when transmitting the message to Bitnet.

Thus, the gateway takes the original sender address, replaces the @ with a %, and appends @

followed by the name of the gateway.

In Bitnet, when a reply is sent from the recipient to the sender, the reply is first sent to

John%[email protected]. Bitnet views this address as a user named John%sunic.se

at a host named searn.sunet.se. Thus, from a Bitnet viewpoint, the whole Internet is in this

case seen as local names in searn.sunet.se. When the reply reaches searn.sunet.se, this MTA

will strip out @searn.sunet.se and look at the rest, John%sunic.se. It will translate this back

to [email protected] and forward the reply to the original sender.
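
The two translations performed by the gateway can be written as a pair of small Python functions. This is a sketch of the rule as stated in the text (replace @ with %, append @ and the gateway name, and the reverse), not the code of any real gateway. The function names are invented, and the example address is assembled from the parts the text spells out: user John%sunic.se at host searn.sunet.se.

```python
def to_gateway_form(address, gateway):
    """Rewrite user@domain as user%domain@gateway."""
    local, _, domain = address.rpartition("@")
    return f"{local}%{domain}@{gateway}"


def from_gateway_form(address, gateway):
    """Undo to_gateway_form: strip @gateway and turn the last % back into @."""
    local, _, domain = address.rpartition("@")
    if domain.lower() != gateway.lower():
        raise ValueError("address is not routed through this gateway")
    if "%" not in local:
        raise ValueError("local part carries no embedded address")
    user, _, original_domain = local.rpartition("%")
    return f"{user}@{original_domain}"


outbound = to_gateway_form("John@sunic.se", "searn.sunet.se")
inbound = from_gateway_form(outbound, "searn.sunet.se")
```

Only the gateway needs these functions; every other MTA on either side treats the rewritten address as an ordinary absolute address.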

This example shows that neither the Internet nor the Bitnet will actually view this address as

relative. In Bitnet, the address is an absolute address of a user named John%sunic.se in the

host searn.sunet.se. Only the gateway itself recognizes the translation between % and @.

Thus, fundamentalists can proudly claim that “our messaging standard does not use any

relative addressing” even though relative addresses are sometimes transported in their net.

There are many disadvantages of relative addresses. You cannot print your address on your

business card in the same way for all recipients. You have to indicate a different address

depending on to whom you are giving your address. There may also be problems with sending

a return message, as shown by the figure below:

[Figure: two panels, each showing persons A, B and C.]

Suppose that a person A in America sends a message to two recipients B and C in Europe and

that this message uses relative addressing via some American gateway. This means that when

B gets the message, the relative address to C is given via this American gateway, so that if B

sends a reply to C, this reply must be routed unnecessarily twice across the Atlantic. There are

two disadvantages of this. Firstly, it means unnecessary costs and delays, as was shown in the

example above. Secondly, the American net may refuse to transmit messages from a

European sender to another European recipient, since the American net may have to pay for

part of this unnecessary transmission.

A problem similar to that in the figure above can occur when

distribution lists are used, as in the figure below.

[Figure: two panels, each showing a distribution list and persons A and B.]

In this example, a person A in Europe sends a message to a distribution list in America. The

letter reaches a recipient B in Europe via the list. If relative addressing is used, a reply from B

to A may have to be transferred via a gateway in America.

Everyone agrees that relative addressing is bad. Why then does it crop up again and again?

The reason is that electronic mail is handled by different electronic mail networks, connected

via gateways. Since each net has limited knowledge of the internal structure of another net, it

can often only address a recipient in another net via a gateway between the nets, and the

inclusion of such gateways into addresses makes them relative. The increasing dominance of

the Internet as a single universal network has, however, led to relative addressing occurring less and

less often.

Offprint from Dilip C. Naik: Internet Standards and Protocols , Microsoft Press 1998, ISBN 1-57231-692-6 Page 1

Internet Message Access Protocol Version 4

Figure 9-3 The IMAP state machine as described in RFC 2060.

Internet Message Access Protocol (IMAP) is more suitable when the e-mail client software is running on a laptop computer. With IMAP a user can selectively download messages or even just parts of messages. This feature is highly useful when accessing e-mail over a slow telephone line from a laptop computer. RFC 2060 describes IMAP in terms of a state machine. For each state, the client can issue a limited number of IMAP commands to the e-mail server. Some of these commands can cause a transition into another state, wherein a different set of commands would be applicable. Figure 9-3 shows the IMAP state machine as described in the IMAP RFC.

With IMAP, the client can download selected messages from a mail server and then disconnect. In the disconnected state, the client can modify any or all of the messages. In order to allow synchronization, IMAP assigns each message a unique identifier that is valid across any IMAP sessions that the client might establish.

IMAP commands consist of an identifier (the tag) and a command name, followed by parameters, if any exist. The server returns the tag in its response. The tag allows the client to associate the response with a command. This is necessary because IMAP allows the client to issue multiple commands without waiting for a response. Obviously, the client needs to ensure that each tag is either unique or, if the client uses a pool of tags, that the pool is large enough so that a particular tag is not reused before the server response containing that tag is received. The server can generate responses that are not associated with any particular outstanding command. These are called untagged responses, and they use the special asterisk (*) tag.

IMAP results include the following:

• OK indicates successful completion of a previously sent command. OK can also indicate that additional information has been included when sent as an untagged response.

• NO indicates unsuccessful completion of a previous command when tagged; it is used to convey warnings when untagged.

• BAD indicates a bad command or argument when tagged; it indicates a serious protocol problem when untagged.

• PREAUTH is always untagged; it indicates that there is no need for a LOGIN command.

• BYE is always untagged; it indicates that the server is ending the session.
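
The matching of responses to commands by tag, described above, can be sketched in Python as follows. The class `TagTracker`, the tag format `A001`, `A002`, ... and the sample response lines are assumptions made for this example; the sketch looks only at the first two words of a response line and does not parse the rest.

```python
import itertools


class TagTracker:
    """Toy bookkeeping of outstanding IMAP commands by tag."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.outstanding = {}  # tag -> command text still awaiting a response

    def send(self, command):
        """Assign a fresh tag and return the line to transmit."""
        tag = f"A{next(self._counter):03d}"
        self.outstanding[tag] = command
        return f"{tag} {command}\r\n"

    def receive(self, line):
        """Return (command, word) for a tagged response, (None, word) if untagged.

        `word` is the word following the tag; for tagged responses this is the
        result (OK, NO or BAD).
        """
        tag, word = line.split()[:2]
        if tag == "*":
            return None, word  # untagged: not tied to any outstanding command
        return self.outstanding.pop(tag), word
```

Because each tag is removed from `outstanding` only when its response arrives, responses may come back in a different order from the one in which the commands were sent.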

IMAP commands are summarized in the following table. Parameters in square brackets are optional.

Table 9-3 Summary of IMAP Commands

Command | State | Parameters | Description | Result

NOOP | All | None | Used as a "keep alive" to reset timers on server | OK, BAD

CAPABILITY | All | None | Server responds with an untagged response indicating its capabilities; the only capability defined for now is IMAP 4 | OK, BAD

LOGOUT | All | None | Client indicates end of session; server responds with an untagged BYE response | OK, BAD

AUTHENTICATE | Nonauthenticated | Authentication mechanism name | Client indicates authentication mechanism; defined authentication mechanisms are Kerberos v4, S/KEY, and GSSAPI (described in Chapter 6) | OK, NO, BAD

LOGIN | Nonauthenticated | User ID, password | Provides clear text user ID and password; not as secure as the AUTHENTICATE command | OK, NO, BAD

CREATE | Authenticated | Mailbox name | Creates a mailbox | OK, NO, BAD

DELETE | Authenticated | Mailbox name | Deletes a mailbox | OK, NO, BAD

SELECT | Authenticated | Mailbox name | Selects a mailbox for which the client will issue further commands; server responds with untagged responses detailing information about the mailbox as well as the response to the SELECT command | OK, NO, BAD

EXAMINE | Authenticated | Mailbox name | Same as the SELECT command except that the mailbox is opened in read-only mode | OK, NO, BAD

RENAME | Authenticated | Old mailbox name, new mailbox name | Renames a designated mailbox | OK, NO, BAD

SUBSCRIBE | Authenticated | Mailbox name | Adds the given mailbox to the list of subscribed mailboxes | OK, NO, BAD

UNSUBSCRIBE | Authenticated | Mailbox name | "Undoes" the SUBSCRIBE command; removes a given mailbox from the subscribed list | OK, NO, BAD

APPEND | Authenticated | Mailbox [flags] [date-time] message | Appends a designated message to the specified mailbox; other parameters (if present) are appended to the message | OK, NO, BAD

Compendium 4 page 122

Page 79: *:96 Internet application layer protocols and standardsjpalme/internet-course/compendium-4.pdf · Message Handling Introduction and Basic Concepts 1.1 A Simple Example Internet application

Offprint from Dilip C. Naik: Internet Standards and Protocols , Microsoft Press 1998, ISBN 1-57231-692-6 Page 3

LIST Authenticated Context, mailbox Allows the user to list a OK, NO,subset of available names; BADthe context argument provides the server with additional context

LSUB Authenticated Context, mailbox Same as LIST command OK, NO,

Command State Parameters Description Resultexcept that the server limits BADits response to mailboxesthat subscribed via theSUBSCRIBE command

STATUS Authenticated Mailbox Requests the status of the OK, NO,named mailbox BAD

FETCHSelected Message set, Retrieves data pertinent to OK, NO,

message item a message; data could be BADname parts of a message or the

whole messageSTORE Selected Message set, Changes data associated OK, NO,

message item with specified message; by BADname, message default the changed valueitem data is returned in an untagged

responseCHECK Selected None Requests implementation- OK, BAD

dependent checkpointingof the currently selectedmailbox; the resulting mailbox state is stored to disk

EXPUNGE Selected None Deletes all messages OK, NO,marked for deletion BAD

SEARCH Selected [character set], Searches mailbox for mes- OK, NO,fsearch criteria] sages that match given cri- BAD

teriaCOPY Selected Message set, Copies specified messages OK, NO,

mailbox name into specified mailbox BADUID Selected Command name, Indicates specified com- OK, NO,

command mand (COPY, FETCH, BADarguments STORE, or SEARCH) that

should be executed withunique identifiers, ratherthan message sequencenumbers, as input

X Selected Implementation Experimental OK, NO,defined implementation- BAD

dependent commandCLOSE Selected None Deletes messages marked OK, NO,

for deletion; causes a tran- BADsition to authenticated state
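The State column of Table 9-3 can be read as a simple permission check. The Python sketch below is one way to express it; the function name `is_allowed` and the state names are illustrative. It assumes, following RFC 2060 rather than the table itself, that commands listed for the authenticated state remain valid once a mailbox is selected, and it leaves out the experimental X commands.

```python
# Session states, ordered by how far the client has progressed.
STATE_RANK = {"nonauthenticated": 0, "authenticated": 1, "selected": 2}

ANY_STATE = {"NOOP", "CAPABILITY", "LOGOUT"}
LOGIN_ONLY = {"AUTHENTICATE", "LOGIN"}
NEEDS_AUTH = {
    "CREATE", "DELETE", "SELECT", "EXAMINE", "RENAME", "SUBSCRIBE",
    "UNSUBSCRIBE", "APPEND", "LIST", "LSUB", "STATUS",
}
NEEDS_SELECTED = {
    "FETCH", "STORE", "CHECK", "EXPUNGE", "SEARCH", "COPY", "UID", "CLOSE",
}


def is_allowed(command, state):
    """Return True if `command` may be issued in the given session state."""
    name = command.upper()
    rank = STATE_RANK[state]
    if name in ANY_STATE:
        return True
    if name in LOGIN_ONLY:
        return rank == 0
    if name in NEEDS_AUTH:
        # Assumption: authenticated-state commands also work in selected state.
        return rank >= 1
    if name in NEEDS_SELECTED:
        return rank == 2
    return False  # unknown or experimental command
```

A client could call `is_allowed("FETCH", "authenticated")` before sending a command and avoid a round trip that the server would answer with BAD.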

IMAP is more complex than POP3 and is harder to implement. IMAP also puts more demands on storage resources at the e-mail server, since old messages might accumulate on the server. On the other hand, it is more likely that a decent backup strategy is in place at the e-mail server, thus protecting data integrity. Also, IMAP allows the possibility of implementing groupware applications, since server-side mail processing as well as shared mailboxes are supported features. (A shared mailbox is a mailbox that can be accessed by multiple recipients.) IMAP also allows for searches to be implemented on e-mail still residing on the server, without downloading the mail to the client.

When TCP is used, the IMAP server listens on port 143.

At this writing, both Microsoft and Netscape have committed to shipping IMAP e-mail client and server software. SunSoft, ICL, and NetManage also have IMAP e-mail software implementations.

IMAP is described in RFCs 2060, 1731, and 1730.


Pretty Good Privacy

Pretty Good Privacy (PGP) leverages private key cryptography, public key cryptography, and message digests (all described in Chapter 5, "Cryptography and Security Basics"). PGP ensures message confidentiality (only the intended recipient can decrypt and read the message) and message authentication (the recipient can be sure of the identity of the sender). But keep in mind that, strictly speaking, PGP is used to ensure privacy of files, not just privacy of e-mail.

To ensure confidentiality, PGP encrypts a message with a randomly generated 128-bit key using the International Data Encryption Algorithm (IDEA) symmetric block cipher. This key is then placed in a message header, which is encrypted using the RSA public key cipher. The e-mail recipient's public key is used as the key for the RSA cipher. The recipient decrypts the header using the recipient's private key to recover the randomly generated key that's embedded in the message.

To ensure authentication, a technique called digital signatures (see Chapter 5) is used. The message is processed using MD5, which produces a 128-bit hash. The hash is then encrypted using RSA, with the sender's private key being the input key for RSA. This encrypted hash is prefixed to the message. The recipient extracts this encrypted hash and decrypts it using the sender's public key. The recipient also decrypts the message and independently computes the MD5 hash of the message. The computed MD5 hash should be identical to the decrypted hash.
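The hash-then-sign flow above can be illustrated with a deliberately tiny Python sketch. It is not how PGP is implemented: the RSA key is a textbook toy (primes 61 and 53), the MD5 hash is reduced modulo the toy modulus so that it fits, and no padding is applied. The names `digest`, `sign`, and `verify` are chosen for this example only.

```python
import hashlib

# Toy textbook RSA key built from p=61 and q=53; far too small for real use.
N, E, D = 3233, 17, 2753


def digest(message: bytes) -> int:
    """MD5 hash of the message, reduced mod N so the toy key can handle it."""
    return int.from_bytes(hashlib.md5(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Encrypt the hash with the sender's private exponent."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Decrypt the signature with the public exponent and compare hashes."""
    return pow(signature, E, N) == digest(message)
```

The sender would transmit `sign(message)` along with the message; the recipient recomputes the hash and calls `verify` with the sender's public values `N` and `E`.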

PGP provides for three different key lengths:

• A "casual" option that has 384-bit keys

• A "commercial" option that has 512-bit keys

• A "military" option that has 1024-bit keys

PGP stores a user's public and private keys in files called keyrings on a computer disk. These keys are stored in an encrypted form. Each user has to provide a password to access the keys. The MD5 message digest algorithm is used to produce a digest of the password phrase. This message digest is used as the key to encrypt the keys with the IDEA cipher. PGP has a crucial weakness that is common to public key cryptography systems: how to ensure that a public key is correct and belongs to a particular user.
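The step from password phrase to IDEA key is short enough to show directly. In this Python sketch the function name `passphrase_to_key` and the UTF-8 encoding of the phrase are assumptions; the point is only that an MD5 digest is 16 bytes long, which is exactly the 128 bits that IDEA expects as a key.

```python
import hashlib


def passphrase_to_key(passphrase: str) -> bytes:
    """Derive a 128-bit key from a password phrase using MD5."""
    # Encoding the phrase as UTF-8 is an assumption made for this sketch.
    return hashlib.md5(passphrase.encode("utf-8")).digest()
```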

The digital signature produced by PGP and the encrypted message can have the most significant bit of a byte set. These messages may or may not be successfully transmitted through e-mail servers because some servers assume that the most significant bit is always reset. PGP provides a scheme of mapping every three characters into four characters, with the most significant bit clear.
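The three-to-four mapping is the same idea as base64 encoding, so Python's standard `base64` module can illustrate it. This is only a sketch of the character mapping: PGP's own armor format adds header lines and a checksum that are not reproduced here, and the function name `armor` is chosen for this example.

```python
import base64


def armor(data: bytes) -> bytes:
    """Map every 3 input bytes to 4 printable ASCII characters."""
    return base64.b64encode(data)


def dearmor(text: bytes) -> bytes:
    """Reverse the mapping and recover the original bytes."""
    return base64.b64decode(text)
```

For example, the three bytes `b"\xff\xfe\xfd"`, all of which have the most significant bit set, become the four characters `//79`, none of which do.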

PGP is available on a wide variety of platforms, including Microsoft Windows, DOS, UNIX, Macintosh, and VMS. Both freeware and commercial versions of PGP are available. Some versions do not meet U.S. export requirements. An international version of PGP, called PGP5I, was created by printing PGP5 code in a book and then exporting the book (but not the code). The code was then scanned into a computer. This export version meets U.S. government requirements.

PGP is described in RFCs 2015 and 1991. See http://www.pgp.com for more details.


News and Usenet

Usenet allows users to have discussions in public forums called newsgroups. A message that's sent to the newsgroup is forwarded to every user who subscribes to the newsgroup. Usenet is the name given to a loosely coupled network of computers that exchange e-mail messages. The messages are tagged with particular subjects or headers. These headers identify the newsgroup to which the message belongs. The message is called an article. This exchange of articles among servers as well as between clients and servers is accomplished using the Network News Transfer Protocol (NNTP), described in the next section.

The major difference between Usenet and a list server is in the way a user accesses and receives information. With Usenet, the user has to fire up a news reader and download messages. The subscriber can look at article headers and then retrieve the full text for only selected articles. With a list server, however, the subscriber receives e-mail directly in his or her mailbox.

Newsgroups are not centrally administered. They are essentially self-policing mechanisms. Newsgroups are named according to a particular convention: names consist of multiple components, each separated by a dot. The first component indicates the category of the newsgroup (for example, soc, as used in soc.culture.india). The major newsgroup categories (identified by the leading component) are shown in Table 9-5.

Table 9-5 Major Newsgroup Categories

| Group Name | Description |
|---|---|
| biz | Dedicated to business-related issues |
| comp | Related to computers, computer software, networking, and other computer-related issues |
| sci | Dedicated to science-related subjects |
| soc | Dedicated to social and cultural issues |
| talk | Provided for lengthy discussions and debates |
| news | Dedicated to discussion of news and information |
| rec | Dedicated to recreational activities such as the arts, hobbies, and so on |
| misc | Associated with miscellaneous issues that do not fit any of the other major categories |
| alt | Limited distribution category whose content has a wider latitude than other groups |

For more details, see RFC 1036.
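The naming convention lends itself to a small lookup. The Python sketch below splits a newsgroup name on its first dot and looks up the leading component among the categories of Table 9-5; the function name `category_of` and the shortened descriptions are illustrative.

```python
CATEGORIES = {
    "biz": "business-related issues",
    "comp": "computers and computer-related issues",
    "sci": "science-related subjects",
    "soc": "social and cultural issues",
    "talk": "lengthy discussions and debates",
    "news": "news and information",
    "rec": "recreational activities",
    "misc": "miscellaneous issues",
    "alt": "limited distribution, wider latitude",
}


def category_of(newsgroup: str):
    """Return (leading component, description) for a newsgroup name."""
    top = newsgroup.split(".", 1)[0].lower()
    return top, CATEGORIES.get(top, "unknown category")
```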

Network News Transfer Protocol

Network News Transfer Protocol (NNTP) defines a standard for

• Clients to post news articles to servers

• Clients to retrieve and read news articles from servers

• Articles to be exchanged between news servers

NNTP is similar to SMTP. For example, NNTP messages must use the ASCII character set, and NNTP commands are in the form of simple ASCII text. Commands are ASCII lines with the command name followed by optional parameters that are specific to the command involved. Each command is terminated by ASCII CR and LF characters. Like SMTP, the NNTP responses consist of a three-digit code followed by ASCII text. The three-digit code can be used to drive a protocol state machine, whereas the text can be displayed for human consumption. Responses are terminated by the same CR/LF sequence. When a command results in the server sending back more data, such as an article body, each line is terminated by the CR/LF sequence. The end of the response is indicated by a line with a single dot (.) followed by a CR/LF. Just as in SMTP, dot stuffing is employed; any line that begins with a dot will have the dot doubled, then a CR/LF appended at the end for sending, and the extra dot stripped by the receiving server.
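Dot stuffing is easiest to understand by seeing both directions side by side. The following Python sketch encodes a list of text lines into a multiline response and decodes it again; the function names `encode_multiline` and `decode_multiline` are chosen for this example, and the sketch works on strings rather than on a live connection.

```python
def encode_multiline(lines):
    """Dot-stuff the lines, end each with CR/LF, and add the final lone dot."""
    out = []
    for line in lines:
        if line.startswith("."):
            line = "." + line  # double the leading dot
        out.append(line + "\r\n")
    out.append(".\r\n")  # terminator: a line holding a single dot
    return "".join(out)


def decode_multiline(data):
    """Read lines up to the lone dot and strip the extra leading dots."""
    lines = []
    for line in data.split("\r\n"):
        if line == ".":
            break  # end of the response
        if line.startswith("."):
            line = line[1:]  # remove the dot added by the sender
        lines.append(line)
    return lines
```

Without the doubling, a body line consisting of a single dot would be mistaken for the end of the response.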

NNTP responses consist of three digits. The first digit conveys information about the success or failure of the message. Table 9-6 describes the semantics of the first digit of the NNTP response code.

Table 9-6 First-Digit NNTP Response Code Semantics

| Code | Meaning |
|---|---|
| 1xx | Informative message |
| 2xx | Command OK |
| 3xx | Command OK so far; send the rest |
| 4xx | Command was correct but could not be performed |
| 5xx | Command incorrect or unimplemented; serious error |

The second digit in the response code indicates the function response category. Table 9-7 summarizes the semantics for the second digit of the NNTP response code.

Table 9-7 Second-Digit NNTP Response Code Semantics

| Code | Meaning |
|---|---|
| x0x | Connection, setup, and miscellaneous messages |
| x1x | Newsgroup selection |
| x2x | Article selection |
| x3x | Distribution functions |
| x4x | Posting |
| x8x | Nonstandard implementation-specific extensions |
| x9x | Debugging output |
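A client can use the two digit tables directly when deciding how to react to a response. The Python sketch below splits a response line into its code and text and looks up the first and second digits; the function name `parse_response` and the sample response texts are assumptions made for the example.

```python
FIRST_DIGIT = {
    "1": "informative message",
    "2": "command OK",
    "3": "command OK so far; send the rest",
    "4": "command was correct but could not be performed",
    "5": "command incorrect or unimplemented; serious error",
}

SECOND_DIGIT = {
    "0": "connection, setup, and miscellaneous messages",
    "1": "newsgroup selection",
    "2": "article selection",
    "3": "distribution functions",
    "4": "posting",
    "8": "nonstandard implementation-specific extensions",
    "9": "debugging output",
}


def parse_response(line):
    """Split an NNTP response line into code, digit semantics, and text."""
    code, _, text = line.rstrip("\r\n").partition(" ")
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an NNTP response line: {line!r}")
    return {
        "code": int(code),
        "status": FIRST_DIGIT.get(code[0], "unknown"),
        "category": SECOND_DIGIT.get(code[1], "unknown"),
        "text": text,
    }
```

The `status` field is what would drive a protocol state machine, while `text` is intended for display to the user.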

NNTP provides 15 commands that a client can issue to the server.


Note that an NNTP server might occasionally act as a client when NNTP messages are being relayed. Table 9-8 summarizes the NNTP commands. Square brackets identify optional parameters. Angle brackets, where shown, are syntactically required.

Table 9-8 Summary of NNTP Commands

| Command | Arguments | Description |
|---|---|---|
| ARTICLE | <msg-id> | Retrieves the text of the message specified by msg-id |
| ARTICLE | [nnnn] | Retrieves the text of the message specified as article index number nnnn |
| BODY | <msg-id> | Retrieves the body associated with the message specified by msg-id |
| GROUP | GroupName | Selects GroupName as the current group and returns the numbers of the first and last articles, as well as an estimate of the number of articles in the group |
| HEAD | <msg-id> | Retrieves the header associated with the message specified by msg-id |
| HEAD | [nnnn] | Retrieves the header associated with the message specified as article index number nnnn |
| HELP | none | Provides an ASCII text response detailing commands implemented by the server |
| IHAVE | <msg-id> | Client informs the server that it has the message specified by msg-id; if the server sends a positive response, the client transfers the entire message; if the server sends a negative response, the client must not send the message |
| LAST | none | Moves the current article pointer back to the previous article within the same newsgroup |
| LIST | none | Returns a list of group names; for each group the numbers of the last and first articles are returned, with an indication of whether posting to that group is permitted |
| NEWGROUPS | date, time, [GMT], [<dist>] | Returns a list of newsgroups created since specified date and time; time is in server's time zone but can be optionally specified as GMT; dist is optional and used to restrict the list of newsgroups returned |
| NEWNEWS | newsgroups, date, time | Returns message IDs of new messages in specified group since specified date and time |
| NEXT | none | Moves current article pointer to next article in current newsgroup |
| POST | none | Used to post new messages to server; server can respond positively or negatively |
| QUIT | none | Terminates the session |
| SLAVE | none | Informs the server that the client is another slave server |
| STAT | msg-id | Similar to ARTICLE except that no text is returned; typically used to position current article pointer |
| XOVER | none | Not described in RFC 977; de facto standard command that retrieves header information for a group of articles using a single command |

NNTP is documented in RFC 977.
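To make the command syntax concrete, here is a small Python sketch that builds command lines for the 15 commands (names as spelled in RFC 977). The helper names `build_command` and `message_id` and the sample message ID are hypothetical; the sketch only formats text and does not open a connection.

```python
COMMANDS = {
    "ARTICLE", "BODY", "GROUP", "HEAD", "HELP", "IHAVE", "LAST", "LIST",
    "NEWGROUPS", "NEWNEWS", "NEXT", "POST", "QUIT", "SLAVE", "STAT",
}


def message_id(raw: str) -> str:
    """Wrap a message ID in the syntactically required angle brackets."""
    if raw.startswith("<") and raw.endswith(">"):
        return raw
    return f"<{raw}>"


def build_command(name: str, *args) -> str:
    """Return one NNTP command line terminated by CR/LF."""
    name = name.upper()
    if name not in COMMANDS:
        raise ValueError(f"unknown NNTP command: {name}")
    return " ".join([name, *map(str, args)]) + "\r\n"
```

For instance, `build_command("ARTICLE", message_id("4105@example.com"))` selects an article by message ID, whereas `build_command("ARTICLE", 10110)` selects it by article index number within the current group.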
