PostKEK: A new mail system using DCE/DFS
Akihiro Shibata ([email protected])
Computing Research Center,
High Energy Accelerator Research Organization (KEK)
HEPiX99, April 1999
Contents
PostKEK system
– Requirements
– System Design
– Status
– Summary and Discussion
High Availability File Service using DFS
System requirements
More than 1,000 users
Non-stop service throughout the year
Security
Services
– POP (IMAP in the future)
– Mail exchanger (out-going mail gateway)
– Remote login
– Home directory
– Mailing-List
System Design
Design based on a distributed system using
DCE (Distributed Computing Environment)
DFS (Distributed File System)
High availability
– Duplication of servers (SMTP, POP, telnetd, ...)
– Highly available file service via DFS
– Application fail-over (sendmail: mail spooling)
System components
4 workstations
– HITACHI 3500 (160 MB memory)
– OS: HI-UX/WE2
RAID disk
– HITACHI A-6531
• 2-port controller
• Duplicated power-supply units
• 32 GB (2 arrays)
– For spools and home directories (file service by DFS)
DCE
– HI-DCE Executive (OSF DCE ver 1.1 base)
PostKEK system overview
[Figure: user computers on the Internet reach the SMTP, POP, and Telnet servers; the user-service workstations form the DCE cell /.../post.kek.jp and serve the mail spool and home directories from the RAID disk through DFS.]
Why DCE/DFS?
Security
– Integrated login
• No encrypted password data is kept on the DCE client
• Long passwords are possible (< 128 characters)
– Access Control Lists (ACLs)
• More flexible access control than plain UNIX permissions (a sketch follows this list)
– Even root on a DCE/DFS client has no privilege for cell administration
– No plaintext password is sent within the DCE cell
Availability
– DCE server replication
– Load balancing among DFS replicas
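For illustration, a DFS ACL can grant a single colleague access to one directory without opening it to the whole group, something plain UNIX mode bits cannot express. A minimal sketch with the DCE acl_edit tool (the path, the principal "tanaka", and the permission letters are only examples; check the local acl_edit documentation for the exact option syntax):

  $ acl_edit /:/user/shibata/public -l                   # list the current ACL entries
  $ acl_edit /:/user/shibata/public -m user:tanaka:rx    # add read/execute for one extra user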
Why DCE/DFS? (2)
Scalability
– The resources are distributed and shared among many hosts
– Scales up to hundreds of hosts
Multiple platforms are supported
– AIX, Solaris, HP-UX, Digital UNIX, IRIX
Uniform file access
DFS backup (local file system)
– A snapshot of the home directories is kept (see the sketch below)
– Files and directories can be recovered to their state at the time the snapshot was taken
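The snapshot mentioned above is based on DFS backup filesets. A minimal sketch of how such a snapshot could be refreshed and made visible to users (the fileset name, mount point, and exact fts syntax are assumptions to be checked against the local DFS manuals):

  # fts clone -fileset user.shibata                                         # refresh the read-only backup version
  # fts crmount -dir /:/user/shibata/.backup -fileset user.shibata.backup   # let the user restore files by hand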
DCE/DFS servers
Mlserva : dced (M), cdsd (M), secd (M), fl server <FLDB>, fxd (M), file exporter (primary)
Mlservb : dced (S), cdsd (S), secd (S), fl server <FLDB>, fxd (S), file exporter (backup)
mail1   : fl server <FLDB>
mail2   : (none listed)
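A quick way to verify this layout is simply to look for the DCE daemons on each host. A rough sketch (remote-shell access and the exact process names, e.g. "flserver" for the fl server, are assumptions):

  for h in mlserva mlservb mail1 mail2 ; do
      echo "=== $h ==="
      rsh $h ps -e | egrep 'dced|cdsd|secd|flserver|fxd'
  done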
Mail servers
Mlserva : DNS (M), SMTP (M)
Mlservb : DNS (S), SMTP (S)
mail1   : SMTP gateway, POP3
mail2   : SMTP gateway, POP3
MX for post.kek.jp : mail1, mail2 (mail spooling)
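Listing both mail1 and mail2 as MX hosts for post.kek.jp lets outside senders deliver, or spool, mail even while one gateway is down. The records can be checked from any client; a small sketch (the preference values and full host names are assumptions):

  $ nslookup -query=mx post.kek.jp
  #  post.kek.jp   preference = 10, mail exchanger = mail1.post.kek.jp
  #  post.kek.jp   preference = 10, mail exchanger = mail2.post.kek.jp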
Modification from the UNIX system
Authentication
– To access DFS, a user must be authenticated to DCE
– Integrated login with DCE and UNIX
• About 20-30 lines of modification for each source
– POP server
– ftp server
– login
qpopper (POP server)
– File locking added to the source code
– DCE login added
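From the user's point of view the modified programs simply obtain DCE credentials at login, after which the DFS home directory becomes readable. The same steps can be done by hand, roughly as follows (the principal name is only an example):

  $ dce_login shibata        # prompts for the DCE password
  $ klist                    # show the tickets just obtained
  $ ls /:/user/shibata       # the DFS home directory is now accessible with full rights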
High availability services
DCE server replication (security server, CDS server)
Highly available file service by DFS
– Home directories and mail spools
– (explained in detail later)
Mail spooling
– Swapping the SMTP servers (application fail-over); a sketch of the manual swap follows this list
– Synchronized with swapping the DFS server
POP server, telnet server, SMTP server
– Two or more servers for each service
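Because the mail spool lives on the shared RAID area exported through DFS, swapping the SMTP service is mostly a matter of starting sendmail on the stand-by host and letting it drain the queue the former master left behind. A minimal sketch of the manual swap (the sendmail path and options are common defaults, not the exact KEK configuration):

  # on the stand-by host, once the DFS swap has made the spool visible
  # /usr/lib/sendmail -bd -q30m      # start the listening daemon with periodic queue runs
  # /usr/lib/sendmail -q -v          # flush the spooled messages immediately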
Highly available service: server fail-over
[Figure: in normal operation Mlserva runs the DFS file exporter and the master SMTP server with the mail spool, while Mlservb keeps DFS and SMTP on stand-by; after the swap Mlservb runs the DFS file exporter and the master SMTP server. Both the DFS and SMTP servers are swapped together.]
Status of service
In service since May 5th, 1998
Users
– 360 users as of April 1st, 1999
Mailing lists
– Since December 1998
– 20 lists
POP (2 servers)
– 10,000 accesses per day
– 500 accesses per hour (peak)
Login (2 servers)
– 200 accesses per day
– 30 users at the same time
– 60 users per day
Mail
– 2,000 mails per day
Status of system running
Server swapping
– Takes at most 15 minutes
• 32 GB disk (15 partitions)
• About 800 filesets (RW, backup)
– Plus a further few minutes for the change to propagate to the clients
• Depending on the DFS cache parameters, etc.
Running status
– Clients: 200 days without stopping
– Servers: 100 days without stopping
Summary and discussion
Stability and continuous service are required.
The mail system using DCE meets these requirements:
– Security
– High availability
– Secure file sharing
– Many clients can be added
High Availability File Service using DFS
– Continuous service is possible even during maintenance
– Currently operated by hand
– It would be helpful if "hot" (automatic) fail-over were possible
Discussion (2): IMAP
– Japanese-language support came late
• Only about 3 clients were available at the beginning of 1998
– Netscape 4.0, Airmail, Pine
• Now several mailers support Japanese
– IMAP will be supported in the near future
DFS fail-over limitation
Read-only replicas
– One Read/Write (R/W) server and many Read-Only (RO) servers
– Load sharing among the RO servers
– Very useful for application service or read-mostly file service such as web pages, but useless for home-directory service
– If the R/W server fails, one of the backup RO replicas could become the new R/W server, but data consistency between the old and new R/W servers is not assured
R/W replica functionality is not implemented in DFS yet.
HA File Service
Make the down time of the DFS file server shorter during trouble or maintenance
– Total fault tolerance or dynamic fail-over
– Prevent program failures or file damage on an unexpected system crash
HA products using 2-port disks exist for NFS, but not for DFS.
HA file service - how
Adopt a two-port RAID
– Multi-port RAID disks have recently become common
– No data copy is needed between the active and stand-by servers (no data-consistency or synchronization problem)
Relation between the fx server and the filesets it serves
– It appears to be defined by the 'position' of the server entry in the FLDB
– To change the file-server assignment for a fileset, only the server name in the entry needs to be replaced (see the command sequence below)
DFS file access mechanism
Two kinds of DFS servers are involved:
– File server
• Stores and exports LFS and non-LFS data as filesets
– Fileset location server
• Stores the FLDB (fileset location database), which holds the location of each fileset
• An FLDB entry describes a fileset (name, ID number, physical location)
Fileset
– A sub-tree of related files and directories
– Mount point
• The place where a fileset is attached to the DFS global filesystem
• It names the fileset in which the data resides, not its physical location
DFS file access mechanism (example: ls /:/user/shibata)
root.dfs is mounted at /:/ and shibata.home is mounted at /:/user/shibata (# marks a mount point).
1. The DFS client asks the CDS server where the FLDB is.
2. It asks the FLDB where root.dfs is.
3. It accesses /:/ on the DFS server.
4. It interprets the path and reaches the mount point for shibata.home.
5. It asks the FLDB where shibata.home is.
6. It accesses /:/user/shibata on the DFS server.
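The same lookup can be followed by hand with the DFS administration commands. A small sketch (cm whereis, fts lsfldb, and fts lsmount are the commands expected here, but the option syntax should be checked against the local DFS manuals):

  $ cm whereis /:/user/shibata          # which file server currently serves this path
  $ fts lsfldb -fileset shibata.home    # show the FLDB entry: sites, IDs, status
  $ fts lsmount /:/user/shibata         # confirm that the directory is a mount point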
Commands
Server A side (original file server), as cell_admin:
# dfsexport -agg <agg1_A> -detach -force
Server B side (new file server), as cell_admin:
# fts eds <hostA> -ch <hostDummy>
# fts eds <hostB> -ch <hostA>
# fts eds <hostDummy> -ch <hostB>
# fts eds <hostA> -prin hosts/<hostB>
# fts eds <hostB> -prin hosts/<hostA>
# dfsexport -agg <agg1_B>
[Figure: fileset 1 resides on the two-port disk, visible as aggregate agg1_A from hostA and agg1_B from hostB. In the FLDB, each fileset entry points at a server entry; swapping the positions of the server entries for hostA and hostB makes the fileset entries resolve to the other host.]
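Put together, the whole manual swap can be kept as one short script run by cell_admin. This is only a sketch assembled from the commands above, with mlserva, mlservb, agg1_A, and agg1_B standing in for the real host and aggregate names (hostDummy is a spare server entry used for the three-way exchange):

  # --- on mlserva (old file server): stop exporting the shared aggregate ---
  # dfsexport -agg agg1_A -detach -force

  # --- on mlservb (new file server): exchange the server entries in the FLDB ---
  # fts eds mlserva   -ch hostDummy         # move A's entry aside
  # fts eds mlservb   -ch mlserva           # B's entry takes A's place
  # fts eds hostDummy -ch mlservb           # the old A entry becomes B's
  # fts eds mlserva   -prin hosts/mlservb   # repoint the principals at the
  # fts eds mlservb   -prin hosts/mlserva   #   opposite machines

  # --- export the same two-port disk from mlservb ---
  # dfsexport -agg agg1_B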
Usual status
[Figure: DFS server A exports the two-port disk, DFS server B is on stand-by, and the FLDB server maps user.shibata to A; clients issuing 'ls ~shibata' over the network are served by A.]
Server A maintenance
[Figure: server A is taken down and its aggregate is unexported; server B exports the two-port disk instead, the FLDB entry is changed so that user.shibata maps to B, and the clients are served by B.]
HA - result (1)
It works well.
– Better than NFS HA
• The usual NFS HA uses an IP-swap trick; caching of the ARP table (IP-to-MAC) in clients, bridges, or routers sometimes causes problems
Without 'dfsexport -detach' when the servers are swapped:
– File inconsistencies between the two servers occurred
– Files may be damaged when a server fails
HA - discussion
This mechanism cannot offer load balancing among the servers, but it is useful for high availability
– Moderately useful in its present status
– It would be very helpful if dynamic fail-over became possible (even though it cannot prevent application aborts or damage to files open at the moment the server goes down)
– Dynamic fail-over is our dream (then we would not be called early in the morning when a server goes down)