

Emulab and its lessons and value for

A Distributed Testbed

Jay Lepreau

University of Utah

March 18, 2002

What?

A configurable Internet emulator in a room
– Today: 168+160 nodes, 1646 cables, 4x BFS (switch)
– Virtualizable topology, links, software

Bare hardware with lots of tools: Management Software
An instrument for experimental CS research
Universally available to any remote experimenter
Simple to use

Points

Programmable, automated mgmt, complete virtualization:
– Qualitatively new environment

Most of it will work in wide area

New Stuff

Integrated event system
– Underlying pub/sub system
– Integrated into ‘ns’ (statically scheduled)
– Start/stop programs
– Replayable
– Dynamic events
– User-accessible

Traffic generation
– Automatic, from ns script
– New generators:
  • TG (tcp, udp)
  • ‘nse’ with udp, tcp, ftp, telnet
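The event system bullets above (pub/sub underneath, statically scheduled events, start/stop of programs, replay, dynamic events) can be illustrated with a toy model. This is a minimal sketch, not Emulab's event system: the `EventBus` class, the `"program"` topic, and the `tg0` program name are all hypothetical.

```python
import heapq
from collections import defaultdict


class EventBus:
    """Toy publish/subscribe bus with a time-ordered schedule (illustrative only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks
        self.queue = []                        # heap of (time, seq, topic, payload)
        self.seq = 0                           # tie-breaker, keeps insertion order
        self.log = []                          # delivered events, kept for replay

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def schedule(self, time, topic, payload):
        """Queue an event; callable before a run (static) or during one (dynamic)."""
        heapq.heappush(self.queue, (time, self.seq, topic, payload))
        self.seq += 1

    def run(self):
        """Deliver queued events in time order and record each delivery."""
        while self.queue:
            time, _, topic, payload = heapq.heappop(self.queue)
            self.log.append((time, topic, payload))
            for callback in self.subscribers[topic]:
                callback(time, payload)

    def replay(self):
        """Re-queue every logged event and run again.

        Simplification: callbacks that schedule further events would be
        replayed twice here; a real system would need to handle that.
        """
        events, self.log = self.log, []
        for time, topic, payload in events:
            self.schedule(time, topic, payload)
        self.run()


bus = EventBus()
trace = []
bus.subscribe("program", lambda t, p: trace.append((t, p)))
bus.schedule(5.0, "program", "stop tg0")    # hypothetical program name
bus.schedule(1.0, "program", "start tg0")
bus.run()
first_run = list(trace)

trace.clear()
bus.replay()  # delivers the same events in the same order
```

The heap keeps delivery in timestamp order regardless of the order in which events were scheduled, which is the property a statically scheduled script relies on.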

New Stuff (cont’d)

4 node types:
– Real, running in the local rack, controlled env.
– Real, running ‘nse’
– [Simulated]
– [Real, in wide-area]

Link configuration and monitoring
– Latency, bw, plr, RED, queue size
– Link monitoring and capture

GUI network config applet
Full-day SIGCOMM tutorial Aug ’02

“Programmable Patch Panel”

[Figure: architecture diagram. Recoverable labels: PC, PC; Web/DB/SNMP; Switch Mgmt; Users; Internet; Control Switch/Router; Serial; Sharks, Sharks; 160; 168; Power Cntl]

Fundamental Leverage:

Extremely Configurable
Easy to Use
– Power
– Performance
– Virtualization

Key Design Aspects

Allow experimenter complete control
– Configurable link bandwidth, latency, and loss rates, via transparently interposed “traffic shaping” nodes that provide WAN emulation
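To make the three shaping parameters concrete, here is a minimal sketch of how bandwidth, latency, and loss rate combine for a single packet. It is a simplified model and not the shaping nodes' implementation: the `LinkShape` class and the example values are assumptions, and queueing is ignored.

```python
import random
from dataclasses import dataclass


@dataclass
class LinkShape:
    bandwidth_bps: float  # link bandwidth in bits per second
    latency_s: float      # one-way propagation delay in seconds
    loss_rate: float      # packet loss probability in [0, 1]

    def delivery_time(self, packet_bytes, rng):
        """Return the one-way delay in seconds, or None if the packet is dropped."""
        if rng.random() < self.loss_rate:
            return None
        transmission = packet_bytes * 8 / self.bandwidth_bps
        return self.latency_s + transmission  # no queueing delay modeled


# Hypothetical emulated WAN link: 1.5 Mb/s, 40 ms, lossless.
wan = LinkShape(bandwidth_bps=1_500_000, latency_s=0.040, loss_rate=0.0)
delay = wan.delivery_time(1500, random.Random(0))

# A link with loss_rate=1.0 drops every packet.
dead = LinkShape(bandwidth_bps=1_500_000, latency_s=0.040, loss_rate=1.0)
dropped = dead.delivery_time(1500, random.Random(0))
```

For the 1500-byte packet, transmission takes 12000 bits / 1.5 Mb/s = 8 ms, so the modeled one-way delay is about 48 ms.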

… but provide fast tools for common cases
– OSs, state mgmt tools, IP, batch, ...
– Disk loading: 6GB disk image, FreeBSD+Linux
  • Unicast tool: 88 seconds to load
  • Multicast tool: 40 nodes simultaneously in < 5 minutes
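A quick calculation puts the two disk-loading figures side by side. The only numbers used are the ones on the slide. The comparison assumes unicast loads run strictly one after another, which the slide does not state, so the result is a worst-case bound rather than a measured speedup.

```python
UNICAST_SECONDS_PER_NODE = 88      # from the slide
MULTICAST_NODES = 40               # from the slide
MULTICAST_SECONDS_BOUND = 5 * 60   # "< 5 minutes", treated as an upper bound

# Assumption: unicast loads are performed serially, one node at a time.
serial_unicast_seconds = UNICAST_SECONDS_PER_NODE * MULTICAST_NODES
min_speedup = serial_unicast_seconds / MULTICAST_SECONDS_BOUND

print(f"serial unicast: {serial_unicast_seconds} s "
      f"({serial_unicast_seconds / 60:.1f} min)")
print(f"multicast bound: {MULTICAST_SECONDS_BOUND} s")
print(f"speedup under this assumption: more than {min_speedup:.1f}x")
```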

Virtualization
– of all experimenter-visible resources
– node names, network interface names, network addrs
– Allows swapin/swapout, easily scriptable
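The point of virtualizing names is that an experiment refers only to its own stable names, while the physical nodes behind them can change between swapins. A minimal sketch of that idea follows. The `Experiment` class and all node names (`client`, `pc12`, and so on) are hypothetical and do not reflect Emulab's actual interfaces.

```python
class Experiment:
    """Toy model: stable virtual names, physical nodes bound only while swapped in."""

    def __init__(self, virtual_nodes):
        self.virtual_nodes = list(virtual_nodes)
        self.binding = {}  # virtual name -> physical name; empty when swapped out

    def swap_in(self, free_physical):
        """Bind each virtual node to one of the currently free physical nodes."""
        if len(free_physical) < len(self.virtual_nodes):
            raise RuntimeError("not enough free physical nodes")
        self.binding = dict(zip(self.virtual_nodes, free_physical))

    def swap_out(self):
        """Release all physical nodes; the virtual description is kept."""
        released = list(self.binding.values())
        self.binding = {}
        return released

    def resolve(self, virtual_name):
        return self.binding[virtual_name]


exp = Experiment(["client", "router", "server"])
exp.swap_in(["pc12", "pc40", "pc07"])
first = exp.resolve("router")    # bound to pc40 on this swapin

exp.swap_out()
exp.swap_in(["pc03", "pc19", "pc22", "pc31"])
second = exp.resolve("router")   # same virtual name, different hardware
```

Scripts written against `router` keep working across swapins because only the binding changes.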

Key Design Aspects (cont’d)

Flexible, extensible, powerful allocation algorithm
– Matches desired “virtual” topology to currently available physical resources

Persistent state maintenance:
– none on nodes, all in database
– work from known state at boot time

Familiar, powerful, extensible configuration language: ns

Separate, isolated control network
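The slides do not describe how the allocation algorithm works, so the following is only a toy illustration of what "matching a virtual topology to available physical resources" means. It considers a single constraint, the number of network interfaces per node, and ignores link bandwidth and switch capacity. The function name and node names are hypothetical.

```python
def map_topology(virtual_degree, physical_ifaces):
    """Assign each virtual node to a distinct physical node with enough interfaces.

    virtual_degree:  dict of virtual node -> number of links it needs
    physical_ifaces: dict of free physical node -> number of interfaces
    Returns a dict virtual -> physical, or None if no assignment fits.
    """
    # Free physical nodes, smallest interface count first (best fit).
    free = sorted(physical_ifaces.items(), key=lambda item: item[1])
    mapping = {}
    # Place the most demanding virtual nodes first.
    for vnode, degree in sorted(virtual_degree.items(), key=lambda item: -item[1]):
        for index, (pnode, ifaces) in enumerate(free):
            if ifaces >= degree:
                mapping[vnode] = pnode
                del free[index]  # each physical node is used at most once
                break
        else:
            return None  # nothing left that can host this virtual node
    return mapping


virtual = {"router": 3, "a": 1, "b": 1, "c": 1}
physical = {"pc1": 4, "pc2": 2, "pc3": 2, "pc4": 1, "pc5": 4}
mapping = map_topology(virtual, physical)
too_big = map_topology({"hub": 5}, physical)  # no node has 5 interfaces
```

With only this one constraint a greedy best-fit pass is enough. The full problem, which must also respect link and switch constraints, is much harder.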

Lessons for wide area testbed

Central control: at this scale (1000s) it’s easy
Database!
Control node for each site: great benefits, cheap marginal cost
– Trusted, firewall, local disk cache, power control, console line

Ease of use is dominant driver

Lessons…

Generalized resource alloc/mapping algorithm is great (e.g., vs Grid)
Get it going quickly, keep it going while adding new stuff
– Like a startup
– Use feedback and demand
– 2.5 years in
Simple authorization model
Most of our model and code will work in wide-area

Lessons…

Freedom for users is freedom for the management software and people

“You’ve got root, use it.”

Over-provision

FreeBSD Jail, or Eclipse/BSD, or VMware, or ….

Testing is tricky

Have real hardware that can’t virtualize

Test suite part of build
Clone DB works some…
8-node minibed
Nightly regression testing
Schema evolution script/diff/check
Developers use/test 3 diff. browsers

Code Base Today

24,100  Web front end
23,900  Back end
2000  ns front end
4200  Resource mapping
4900  Diskimg compression/casting/load
8400  Scripts/daemons from nodes to DB
5000  Event system
6200  Remote console interaction/logging
3300  Regression testing harness and tests
700  Node health monitoring
3700  Documentation of internals

More stats

21 “programs”
318 “scripts” (including 90 php scripts, 71 small boot-time scripts)
35% Perl
32% C
19% php
12% html, Java, tcl, other

The Database Today

Started with ~18 tables
54 tables, 413 columns
General categories
– Physical world: 11 tables, 65 cols
– Virtual world: 7 tables, 83 cols
– Operational state: 22 tables, 180 cols
– Admin data: 14 tables, 85 cols

Note how much operational state there is: it shows how much work needs to be done
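The per-category counts on this slide add up to the stated totals of 54 tables and 413 columns. A short calculation, using only the slide's numbers, shows each category's share: operational state accounts for roughly 41% of the tables and 44% of the columns.

```python
# (tables, columns) per category, copied from the slide.
categories = {
    "Physical world": (11, 65),
    "Virtual world": (7, 83),
    "Operational state": (22, 180),
    "Admin data": (14, 85),
}

total_tables = sum(tables for tables, _ in categories.values())
total_cols = sum(cols for _, cols in categories.values())

for name, (tables, cols) in categories.items():
    print(f"{name}: {tables / total_tables:.0%} of tables, "
          f"{cols / total_cols:.0%} of columns")
```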

Testbed Users

30 active projects
– more registered
– 25 External
– About 40/30/30% dist sys/activenets/traditional networking
~110 users
990 “experiments” in last 8 months
  • 7.5/day recently
  • 40% testbed development

More Sites

More emulabs under construction:
– Kentucky
– UMass
– Duke, CMU, Cornell, Stuttgart
– Others stated intent: MIT, WUSTL, Princeton, HPLabs, Intel/UCB, Mt. Holyoke, …

Ongoing and Future Work

Federation
– heterogeneous sites
– resource allocation
Wireless nodes, mobile nodes
IXP1200 nodes, tools, code fragments
– Routers, high-capacity shapers
Simulation/emulation transparency
Event system
Scheduling system
Topology generation tools and GUI
Data capture, logging, visualization tools
Microsoft OSs, high speed links, more nodes!

A Global-scale Testbed

Federation key
Bottom-up “organic” growth
– Local autonomy and priority
– Existing hardware resources
– Provides diverse hardware
  • PCs
  • Wireless, mobile
  • Real routers, switches (Wisconsin, …)
  • Network processors (IXPs)
  • Research switches (WUSTL)
But top-down is much easier: a good start

NSF ITR Proposal (Nov 01)

Global-scale testbed
Utah primary
– Research emphasis: software component for heterogeneity; resource allocation/mapping
Collaborators:
– Brown, co-PI (resource allocation)
– MIT (RON overlay, wireless)
– Duke (ModelNet muxing, early adopter)
– Mt. Holyoke (education)

Types of Sites

High-end facilities
Generic clusters
Generic labs
“Virtual machines”

Internet2 links between some sites

Result…

Loosely coupled distributed system

Controlled isolation

“Internet Petri Dish”

New Stuff: Extending to Wireless and Mobile

Problems with existing approaches
Same problems as wired domain
But worse (simulation scaling, ...)
And more (no models for new technologies, ...)

Wireless Virtual to Physical Mapping

Available for universities, labs, and companies, for research and teaching, at:

www.emulab.net

A Few Research Issues and Challenges

Network management of unknown and untrusted entities
Security (root!)
Scheduling of experiments
Calibration, validation, and scaling
Artifact detection and control
NP-hard virtual --> physical mapping problem
Providing a reasonable user interface
….

How To Use It ...

Submit ns script or GUI via web form
Behind the scenes:
– Generates config from script & stores in DB
– Maps specified virtual topology to physical nodes
– Allocates resources
– Provides user accounts for node access
– Assigns IP addresses and host names
– Configures VLANs
– Loads disks, reboots nodes, configures OSs
– Starts event system, traffic generators, link monitoring/control
– Yet more odds and ends ...
– User does his/her experiment
– [Reports results if batch]

Takes ~3 min to set up 25 nodes, 5 secs/node

An “Experiment”

emulab’s central operational entity
Directly generated by an ns script, …
… then represented entirely by database state

Steps: Web, compile ns script, map, allocate, provide access, assign IP addrs, host names, configure VLANs, load disks, reboot, configure OS’s, run, report
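The step list above reads naturally as a pipeline whose progress lives entirely in a database record, in line with the slide's point that an experiment is represented by database state. The sketch below is illustrative only: the in-memory `database` dict stands in for the real database, the step names are taken from the slide, and `run_experiment`, the handler signature, and the node names are hypothetical.

```python
# Step names as listed on the slide.
STEPS = [
    "Web",
    "compile ns script",
    "map",
    "allocate",
    "provide access",
    "assign IP addrs, host names",
    "configure VLANs",
    "load disks",
    "reboot",
    "configure OS's",
    "run",
    "report",
]


def run_experiment(name, handlers, database):
    """Run the steps in order, recording progress in `database` after each one."""
    record = database.setdefault(name, {"completed": [], "state": "new"})
    for step in STEPS:
        # Steps without a handler are treated as no-ops in this sketch.
        handler = handlers.get(step, lambda rec: None)
        handler(record)
        record["completed"].append(step)
        record["state"] = step
    record["state"] = "done"
    return record


database = {}
handlers = {
    # Hypothetical handler: store a virtual-to-physical mapping in the record.
    "map": lambda rec: rec.update(mapping={"node0": "pc1"}),
}
result = run_experiment("demo", handlers, database)
```

Because every step writes its outcome into the record, anything that needs to know where an experiment stands can read the database rather than query the nodes.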