PlanetLab: Catalyzing Network Innovation

October 2, 2007

Larry Peterson
Princeton University

Timothy Roscoe
Intel Research at Berkeley

Challenges

• Security
  – known vulnerabilities lurking in the Internet
      DDoS, worms, malware
  – addressing security comes at a significant cost
      federal government spent $5.4B in 2004
      estimated $50-100B spent worldwide on security in 2004
• Reliability
  – e-Commerce increasingly depends on fragile Internet
      much less reliable than the phone network (three vs five 9’s)
      risks in using the Internet for mission-critical operations
      barrier to ubiquitous VoIP
  – an issue of ease-of-use for everyday users
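
The "three vs five 9's" comparison can be made concrete with a little arithmetic. A minimal sketch, assuming availability is averaged over a 365-day year (the function name is illustrative, not from the slides):

```python
# Convert an availability given as a number of nines into expected downtime per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(nines: int) -> float:
    """Expected yearly downtime for an availability of 1 - 10**-nines."""
    unavailability = 10.0 ** -nines
    return MINUTES_PER_YEAR * unavailability


for nines in (3, 5):
    minutes = downtime_minutes_per_year(nines)
    print(f"{nines} nines: {minutes:.1f} minutes/year")
```

Under these assumptions, three 9's allows roughly 8.8 hours of downtime a year, while five 9's allows a little over 5 minutes.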

Challenges (cont)

• Scale & Diversity
  – the whole world is becoming networked
      sensors, consumer electronic devices, embedded processors
  – assumptions about edge devices (hosts) no longer hold
      connectivity, power, capacity, mobility,…
• Performance
  – scientists have significant bandwidth requirements
      each e-science community covets its own wavelength(s)
  – purpose-built solutions are not cost-effective
      being on the “commodity path” makes an effort sustainable

Two Paths

• Incremental
  – apply point-solutions to the current architecture
• Clean-Slate
  – replace the Internet with a new network architecture
• We can’t be sure the first path will fail, but…
  – point-solutions result in increased complexity
      making the network harder to manage
      making the network more vulnerable to attacks
      making the network more hostile to new applications
  – architectural limits may lead to a dead-end

Architectural Limits

• Minimize trust assumptions
  – the Internet originally viewed network traffic as fundamentally cooperative, but should view it as adversarial
• Enable competition
  – the Internet was originally developed independent of any commercial considerations, but today the network architecture must take competition and economic incentives into account
• Allow for edge diversity
  – the Internet originally assumed host computers were connected to the edges of the network, but host-centric assumptions are not appropriate in a world with an increasing number of sensors and mobile devices

Limits (cont)

• Design for network transparency
  – the Internet originally did not expose information about its internal configuration, but there is value to both users and network administrators in making the network more transparent
• Enable new network services
  – the Internet originally provided only a best-effort packet delivery service, but there is value in making processing capability and storage capacity available in the middle of the network
• Integrate with optical transport
  – the Internet originally drew a sharp line between the network and the underlying transport facility, but allowing bandwidth aggregation and traffic engineering to be first-class abstractions has the potential to improve efficiency and performance

Barriers to Second Path

• Internet has become ossified
  – no competitive advantage to architectural change
  – no obvious deployment path
• Inadequate validation of potential solutions
  – simulation models too simplistic
  – little or no real-world experimental evaluation
• Testbed dilemma
  – production testbeds: real users but incremental change
  – research testbeds: radical change but no real users

Recommendation

It is time for the research community, federal government, and commercial sector to jointly pursue the second path. This involves experimentally validating new network architecture(s), and doing so in a sustainable way that fosters widespread deployment.

Approaches

• Revisiting definition & placement of function
  – naming, addressing, and location
  – routing, forwarding, and addressing
  – management, control, and data planes
  – end hosts, routers, and operators
• Designing with new constraints in mind
  – selfish and adversarial participants
  – mobile hosts and disconnected operation
  – large number of small, low-power devices
  – ease of network management

Deployment Story

• Old model
  – global up-take of new technology
  – does not work due to ossification
• New model
  – incremental deployment via user opt-in
  – lowering the barrier-to-entry makes deployment plausible
• Process by which we define the new architecture
  – purists: settle on a single common architecture
      virtualization is a means
  – pluralists: multiplicity of continually evolving elements
      virtualization is an end
• What architecture do we deploy?
  – research happens…

Validation Gap

[Figure: stages Analysis, Simulation / Emulation, Experiment At Scale With Real Users, Deployment; arrow labels (models), (code), (results), (measurements)]

PlanetLab

What is PlanetLab?

• An open, shared testbed for
  – Developing
  – Deploying
  – Accessing
  planetary-scale services.

What would you do if you had Akamai’s infrastructure?

Motivation

• New class of applications emerging that spread over sizable fraction of the web
• Architectural components starting to emerge
• The next Internet will be created as an overlay on the current one
• It will be defined by services, not transport
• There is NO vehicle to try out the next n great ideas in this area

Guidelines (1)

• Thousand viewpoints on “the cloud” is what matters
  – not the thousand servers
  – not the routers, per se
  – not the pipes

Guidelines (2)

• and you must have the vantage points of the crossroads
  – co-location centers, peering points, etc.

Guidelines (3)

• Each service needs an overlay covering many points
  – logically isolated
• Many concurrent services and applications
  – must be able to slice nodes => VM per service
  – service has a slice across large subset
• Must be able to run each service / app over long period to build meaningful workload
  – traffic capture/generator must be part of facility
• Consensus on “a node” more important than “which node”

Guidelines (4)

• Test-lab as a whole must be up a lot
  – global remote administration and management
  – redundancy within
• Each service will require own management capability
• Testlab nodes cannot “bring down” their site
  – not on forwarding path
• Relationship to firewalls and proxies is key

Guidelines (5)

• Storage has to be a part of it
  – edge nodes have significant capacity
• Needs a basic well-managed capability

Initial core team:

Intel Research:
  David Culler
  Timothy Roscoe
  Brent Chun
  Mic Bowman

Princeton:
  Larry Peterson
  Mike Wawrzoniak

University of Washington:
  Tom Anderson
  Steven Gribble

PlanetLab

• 1000+ machines spanning 500 sites and 40 countries

• Supports distributed virtualization
  – each of 600+ network services running in their own slice

Requirements

1) It must provide a global platform that supports both short-term experiments and long-running services.
  – services must be isolated from each other
  – multiple services must run concurrently
  – must support real client workloads

Requirements

2) It must be available now, even though no one knows for sure what “it” is.
  – deploy what we have today, and evolve over time
  – make the system as familiar as possible (e.g., Linux)
  – accommodate third-party management services

Requirements

3) We must convince sites to host nodes running code written by unknown researchers from other organizations.
  – protect the Internet from PlanetLab traffic
  – must get the trust relationships right

Requirements

4) Sustaining growth depends on support for site autonomy and decentralized control.
  – sites have final say over the nodes they host
  – must minimize (eliminate) centralized control

Requirements

5) It must scale to support many users with minimal resources available.
  – expect under-provisioned state to be the norm
  – shortage of logical resources too (e.g., IP addresses)

Design Challenges

• Minimize centralized control without violating trust assumptions.
• Balance the need for isolation with the reality of scarce resources.
• Maintain a stable and usable system while continuously evolving it.

Key Architectural Ideas

• Distributed virtualization
  – slice = set of virtual machines
• Unbundled management
  – infrastructure services run in their own slice
• Chain of responsibility
  – account for behavior of third-party software
  – manage trust relationships

Implementation Research Issues

• Sliceability: distributed virtualization
• Isolation and resource control
• Security and integrity: exposed machines
• Management of a very large, widely dispersed system
• Instrumentation and measurement
• Building blocks and primitives

Slice-ability

• Each service runs in a slice of PlanetLab
  – distributed set of resources (network of virtual machines)
  – allows services to run continuously
• VM monitor on each node enforces slices
  – limits fraction of node resources consumed
  – limits portion of name spaces consumed
• Issue: global resource discovery
  – how do applications specify their requirements?
  – how do we map these requirements onto a set of nodes?
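
One way to picture the resource discovery question is a simple matcher that takes a requirement and returns candidate nodes. The sketch below is illustrative only: the `Node` and `Requirement` fields, the greedy policy, and the hostnames are assumptions made for the example, not PlanetLab's actual discovery service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    hostname: str
    site: str
    free_cpu: float    # fraction of CPU currently unreserved, 0.0 to 1.0
    free_mem_mb: int


@dataclass(frozen=True)
class Requirement:
    min_free_cpu: float
    min_free_mem_mb: int
    max_nodes_per_site: int = 1   # spread the slice across sites


def discover(nodes, req, wanted):
    """Greedy match: pick up to `wanted` nodes that satisfy `req`, least-loaded first."""
    chosen, per_site = [], {}
    for node in sorted(nodes, key=lambda n: -n.free_cpu):
        if node.free_cpu < req.min_free_cpu or node.free_mem_mb < req.min_free_mem_mb:
            continue  # node cannot satisfy the requirement
        if per_site.get(node.site, 0) >= req.max_nodes_per_site:
            continue  # this site already contributes enough nodes
        chosen.append(node)
        per_site[node.site] = per_site.get(node.site, 0) + 1
        if len(chosen) == wanted:
            break
    return chosen


# Hypothetical inventory, for illustration only.
inventory = [
    Node("a.site1.example", "site1", 0.9, 512),
    Node("b.site1.example", "site1", 0.8, 512),
    Node("c.site2.example", "site2", 0.5, 256),
    Node("d.site3.example", "site3", 0.1, 1024),
]
picked = discover(inventory, Requirement(0.25, 128), wanted=3)
print([n.hostname for n in picked])
```

A real service would also have to cope with stale node state and competing requests, which is why the slide lists discovery as an open issue.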

Slices


User Opt-in

[Figure: Client and Server; example URL http://coblitz.org/www.princeton.edu/podcast.mp4]

Per-Node View

[Figure: Virtual Machine Monitor (VMM) hosting NodeMgr, LocalAdmin, VM1, VM2, …, VMn]

Global View

PLC

Exploit Layer 2 Circuits

Deployed in NLR & Internet2 (aka VINI)

Circuits (cont)

Supports arbitrary virtual topologies

Circuits (cont)

Exposes (can inject) network failures

Circuits (cont)

[Figure: BGP sessions]

Participate in Internet routing

Distributed Control of Resources

• At least two interested parties
  – service producers (researchers) decide how their services are deployed over available nodes
  – service consumers (users) decide what services run on their nodes
• At least two contributing factors
  – fair slice allocation policy
      both local and global components (see above)
  – knowledge about node state freshest at the node itself

Unbundled Management

• Partition management into orthogonal services
  – resource discovery
  – monitoring node health
  – topology management
  – manage user accounts and credentials
  – software distribution
• Issues
  – management services run in their own slice
  – allow competing alternatives
  – engineer for innovation (define minimal interfaces)

Application-Centric Interfaces

• Inherent problems
  – stable platform versus research into platforms
  – writing applications for temporary testbeds
  – integrating testbeds with desktop machines
• Approach
  – adopt popular API (Linux) and evolve implementation
  – eventually separate isolation and application interfaces
  – provide generic “shim” library for desktops

Virtual Machines

• Security
  – prevent unauthorized access to state
• Familiar API
  – forcing users to accept a new API is death
• Isolation
  – contain resource consumption
• Performance
  – don’t want to be apologetic

Virtualization

[Figure: per-node software stack]
• Virtual Machine Monitor (VMM)
  – Linux kernel (Fedora Core)
  – + Vservers (namespace isolation)
  – + Schedulers (performance isolation)
  – + VNET (network virtualization)
• NodeMgr, OwnerVM, VM1, VM2, …, VMn
• Auditing service, Monitoring services, Brokerage services, Provisioning services

Resource Allocation

• Decouple slice creation and resource allocation
  – given a “fair share” (1/Nth) by default when created
  – acquire/release additional resources over time
      including resource guarantees
• Protect against thrashing and over-use
  – link bandwidth
      upper bound on sustained rate (protect campus bandwidth)
  – memory
      kill largest user of physical memory when swap at 85%
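
The two policies on this slide, a default 1/Nth share and a memory reaper that fires at 85% swap, are easy to state in code. This is a minimal sketch of the policy only; the function names and the per-slice memory table are assumptions for illustration, not the node's real implementation.

```python
SWAP_THRESHOLD = 0.85  # from the slide: act when swap is 85% full


def fair_share(capacity: float, active_slices: int) -> float:
    """Default 1/Nth share of a node resource for each of N active slices."""
    if active_slices <= 0:
        raise ValueError("need at least one active slice")
    return capacity / active_slices


def pick_victim(swap_used: float, swap_total: float, rss_by_slice: dict):
    """Return the slice using the most physical memory once swap passes the
    threshold, or None if there is no memory pressure (or nothing to kill)."""
    if swap_total <= 0 or swap_used / swap_total < SWAP_THRESHOLD:
        return None
    if not rss_by_slice:
        return None
    return max(rss_by_slice, key=rss_by_slice.get)


# Illustrative numbers: resident memory in MB per slice.
usage = {"princeton_codeen": 310, "ucb_bamboo": 120, "mit_dht": 95}
print(fair_share(100.0, len(usage)))    # percent of CPU per slice
print(pick_victim(900, 1000, usage))    # swap 90% full: largest user is chosen
print(pick_victim(100, 1000, usage))    # swap 10% full: nobody is killed
```

Keeping the share computation separate from the reaper mirrors the slide's point that slice creation and resource allocation are decoupled.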

Confluence of Technologies

• Cluster-based management
• Overlay and P2P networks
• Virtual machines and sandboxing
• Service composition frameworks
• Internet measurement
• Packet processors
• Colo services
• Web services

The time is now.

Usage Stats

• Users: 2500+
• Slices: 600+
• Long-running services: ~20
  – content distribution, scalable large file transfer, multicast, pub-sub, routing overlays, anycast,…
• Bytes-per-day: 4 TB
  – 1 Gbps peak rates not uncommon
• Unique IP-addrs-per-day: 1M
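
To put the traffic figures in proportion, the daily volume can be converted to an average rate and compared with the quoted peak. A quick calculation, assuming 1 TB = 10^12 bytes and that the 4 TB is spread over a 24-hour day:

```python
BYTES_PER_DAY = 4e12              # 4 TB/day from the slide (decimal terabytes assumed)
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400

avg_bps = BYTES_PER_DAY * 8 / SECONDS_PER_DAY
peak_bps = 1e9                    # the 1 Gbps peak quoted on the slide

print(f"average rate: {avg_bps / 1e6:.0f} Mbps")
print(f"peak-to-average ratio: {peak_bps / avg_bps:.1f}")
```

Under these assumptions the average works out to about 370 Mbps, so a 1 Gbps peak is under three times the mean.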

Validation Gap

[Figure: stages Analysis, Simulation / Emulation, Experiment At Scale With Real Users, Deployment; arrow labels (models), (code), (results), (measurements)]

Deployment Gap

[Figure: Maturity (vertical axis) versus Time (horizontal axis)]
• Stages: Analysis (MatLab), Controlled Experiment (EmuLab), Deployment Study (PlanetLab), Pilot Demonstration (PL Gold), Commercial Adoption
• Labels: Ideas, Implementation Reality, User & Network Reality, Economic Reality

Emerging applications

• Content distribution
• Peer-to-Peer networks
• Global storage
• Mobility services
• Etc. etc.

Vibrant research community embarking on new direction and none can try out their ideas.

Trust Relationships

[Figure: sites on one side, slices on the other]
• Sites: Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, Purdue, UCSD, SICS, Cambridge, Cornell, …
• Slices: princeton_codeen, nyu_d, cornell_beehive, att_mcash, cmu_esm, harvard_ice, hplabs_donutlab, idsl_psepr, irb_phi, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute, …
• N x N
• Trusted Intermediary (PLC)
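
The case for a trusted intermediary is largely a counting argument. The sketch below compares the number of trust relationships with and without PLC, plugging in the deck's own figures of 500 sites and 600+ slices. Treating each site as one owner and each slice as one provider is a simplifying assumption, and the function names are illustrative.

```python
def pairwise_relationships(owners: int, providers: int) -> int:
    """Every owner must establish trust with every provider directly."""
    return owners * providers


def intermediary_relationships(owners: int, providers: int) -> int:
    """Every owner and every provider trusts only the intermediary."""
    return owners + providers


sites, slices = 500, 600
print(pairwise_relationships(sites, slices))       # direct, N x N style
print(intermediary_relationships(sites, slices))   # via a trusted intermediary
```

With these numbers the direct approach needs 300,000 relationships, versus 1,100 when both sides trust the intermediary.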

Principals

• Node Owners
  – host one or more nodes (retain ultimate control)
  – select an MA and approve one or more SAs
• Service Providers (Developers)
  – implement and deploy network services
  – responsible for the service’s behavior
• Management Authority (MA)
  – installs and maintains software on nodes
  – creates VMs and monitors their behavior
• Slice Authority (SA)
  – registers service providers
  – creates slices and binds them to responsible provider

Trust Relationships

[Figure: MA, Owner, Provider, and SA connected by edges numbered 1 to 6]

(1) Owner trusts MA to map network activity to responsible slice
(2) Owner trusts SA to map slice to responsible providers
(3) Provider trusts SA to create VMs on its behalf
(4) Provider trusts MA to provide working VMs & not falsely accuse it
(5) SA trusts provider to deploy responsible services
(6) MA trusts owner to keep nodes physically secure
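
Relationships (1) and (2) together form a lookup chain: observed network activity maps to a slice, and the slice maps to a responsible provider. A minimal sketch, assuming the MA keeps a flow table and the SA keeps a slice registry; the table contents, key layout, and contact address are invented for illustration.

```python
# MA-side record: which slice generated traffic from (node, source port).
# Hypothetical data; a real audit trail would also carry timestamps.
flow_to_slice = {
    ("node1.example.org", 33001): "princeton_codeen",
    ("node1.example.org", 33002): "ucb_bamboo",
}

# SA-side registry: which provider is responsible for each slice (hypothetical).
slice_to_provider = {
    "princeton_codeen": "codeen-admin@example.org",
}


def responsible_party(node: str, src_port: int):
    """Follow the chain activity -> slice -> provider.

    Returns (slice, provider); either element is None where the chain breaks.
    """
    slice_name = flow_to_slice.get((node, src_port))
    if slice_name is None:
        return None, None
    return slice_name, slice_to_provider.get(slice_name)


print(responsible_party("node1.example.org", 33001))
print(responsible_party("node1.example.org", 33002))  # slice known, provider missing
print(responsible_party("node1.example.org", 40000))  # unknown activity
```

The second and third calls show why both mappings matter: a break at either step leaves a complaint with nobody to answer it.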

Architectural Elements

[Figure: MA, NM + VMM, node database, Node Owner, Owner VM, SCS, SA, slice database, VM, Service Provider]

Slice Creation

[Figure: PLC (SA) and a node running VMM, NM, and VMs]
• PI: SliceCreate( ), SliceUsersAdd( )
• User/Agent: GetTicket( )
• (redeem ticket with plc.scs)
• plc.scs
• CreateVM(slice)
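
The labels in the slice creation figure suggest a sequence: a PI registers the slice and its users at PLC, a user obtains a ticket, and the ticket is redeemed on the node, which then creates the VM. The toy model below follows that reading. The class layout, the HMAC-signed JSON ticket, and the key shared between PLC and node are all stand-in assumptions, since the slides do not show the real ticket format or signing scheme.

```python
import hashlib
import hmac
import json

PLC_KEY = b"demo-key"  # stand-in secret; not how PlanetLab actually signs tickets


class PLC:
    """Toy slice authority: tracks slices and issues signed tickets."""

    def __init__(self):
        self.slices = {}  # slice name -> set of authorized users

    def SliceCreate(self, name):
        self.slices[name] = set()

    def SliceUsersAdd(self, name, user):
        self.slices[name].add(user)

    def GetTicket(self, name, user):
        if user not in self.slices.get(name, ()):
            raise PermissionError(f"{user} is not a member of {name}")
        body = json.dumps({"slice": name}, sort_keys=True).encode()
        sig = hmac.new(PLC_KEY, body, hashlib.sha256).hexdigest()
        return body, sig


class NodeManager:
    """Toy node side: verifies a ticket, then creates the VM."""

    def __init__(self):
        self.vms = set()

    def redeem(self, ticket):
        body, sig = ticket
        expected = hmac.new(PLC_KEY, body, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            raise PermissionError("ticket signature does not verify")
        self.CreateVM(json.loads(body)["slice"])

    def CreateVM(self, slice_name):
        self.vms.add(slice_name)


plc, node = PLC(), NodeManager()
plc.SliceCreate("princeton_codeen")
plc.SliceUsersAdd("princeton_codeen", "alice")
node.redeem(plc.GetTicket("princeton_codeen", "alice"))
print(node.vms)
```

The point of the ticket is that the node can act on a user's request without contacting PLC at that moment; it only needs to check that PLC vouched for the slice.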

Brokerage Service

[Figure: PLC (SA), a Broker, and a node running VMM, NM, and VMs]
• User: BuyResources( )
• Broker
• (broker contacts relevant nodes)
• Bind(slice, pool)

PlanetLab: Two Perspectives

• Useful research platform
• Prototype of a new network architecture

What are people doing in/on/with/around PlanetLab?

1. Network measurement
2. Application-level multicast
3. Distributed Hash Tables
4. Storage
5. Resource Allocation
6. Distributed Query Processing
7. Content Distribution Networks
8. Management and Monitoring
9. Overlay Networks
10. Virtualisation and Isolation
11. Router Design
12. Testbed Federation
13. …

Lessons Learned

• Trust relationships
  – owners, operators, developers
• Virtualization
  – scalability is critical
  – control plane and node OS are orthogonal
  – least privilege in support of management functionality
• Decentralized control
  – owner autonomy
  – delegation
• Resource allocation
  – decouple slice creation and resource allocation
  – best effort + overload protection
• Evolve based on experience
  – support users quickly

Conclusions• Innovation can come from anywhere

• Much of the Internet’s success can be traced to its support for innovation “at the edges”

• There is currently a high barrier-to-entry for innovating “throughout the net”

• One answer is a network substrate that supports “on demand, customizable networks”
  – enables research
  – supports continual innovation and evolution

PlanetLab Software Overview

Mark Huang
mlhuang@cs.princeton.edu

Node Software

• Boot
  – Boot CD
  – Boot Manager
• Virtualization
  – Linux kernel
  – VServer
  – VNET
• Node Management
  – Node Manager
  – NodeUpdate
  – PlanetLabConf
• Slice Management
  – Slice Creation Service
  – Proper
• Monitoring
  – PlanetFlow
  – pl_mom

PLC Software

• Database server
  – pl_db
• PLCAPI server
  – plc_api
• Web server
  – Website PHP
  – Scripts
• Boot server
  – PlanetLabConf scripts
• PlanetFlow archive
• Mail, Support (RT), DNS, Monitor, Build, CVS, QA

Boot Manager

• Boot Manager
  – bootmanager/source/
      Main BootManager class, authentication, utility functions, configuration, etc.
  – bootmanager/source/steps/
      Individual “steps” of the install/boot process
  – bootmanager/support-files/
      Bootstrap tarball generation
      Legacy support for old Boot CDs

Virtualization

• Linux kernel
  – Fedora Core 8 kernel
      VServer patch
• VServer
  – util-vserver/
      Userspace VServer management utilities and libraries
• VNET
  – Linux kernel module
  – Intercepts bind(), other socket calls
  – Intercepts and marks all IP packets
  – Implements TUN/TAP, proxy socket extensions
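
Why intercept bind() and mark packets? When many slices share one node, something has to record which slice owns each port so that traffic can be attributed to a slice. The toy model below illustrates that bookkeeping. It is a user-space sketch under assumed semantics (first slice to bind owns the port), not VNET's kernel implementation, and the class and method names are invented.

```python
class PortTable:
    """Toy model of per-slice port ownership on a node shared by many slices."""

    def __init__(self):
        self._owner = {}  # (protocol, port) -> slice name

    def bind(self, slice_name: str, protocol: str, port: int) -> None:
        """Record ownership on bind(); refuse a port another slice already holds."""
        owner = self._owner.setdefault((protocol, port), slice_name)
        if owner != slice_name:
            raise OSError(f"{protocol}/{port} already bound by {owner}")

    def mark(self, protocol: str, src_port: int):
        """Return the slice tag for an outgoing packet, or None if unowned."""
        return self._owner.get((protocol, src_port))


table = PortTable()
table.bind("princeton_codeen", "tcp", 3128)
print(table.mark("tcp", 3128))   # traffic from this port is attributed to the slice
try:
    table.bind("ucb_bamboo", "tcp", 3128)
except OSError as err:
    print(err)
```

A table like this is also what makes auditing possible: a complaint about traffic from a given port can be traced back to one slice.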

Node Management

• Node Manager (pl_nm)
  – sidewinder/
      Thin XML-RPC shim around VServer (or other VMM) syscalls, and other knobs
  – util-python/
      Miscellaneous Python utility functions
  – util-vserver/python/
      Python bindings for VServer syscalls
• Node Update
  – NodeUpdate/
      Wrapper around yum for keeping node RPMs up-to-date
• PlanetLabConf
  – PlanetLabConf/
      Pull-based configuration file distribution service
      Most files dynamically generated on a per-node or per-node group basis

Slice Management

• Slice Creation Service (pl_conf)
  – sidewinder/
      Runs in a slice
      Periodically downloads slices.xml from boot server
      Local XML-RPC API for delegated slice creation, query
• Proper
  – proper/
      Simple local interface for executing privileged operations
      Bind mount(), privileged port bind(), root read()

Administration and Monitoring

• PlanetFlow (pl_netflow)
  – netflow/
      MySQL schema and initialization/maintenance scripts
  – netflow/html/
      PHP frontend
  – netflow/pfgrep/
      Console frontend
  – ulogd/
      Packet header collection, aggregation, and insertion
• PlanetLab Monitor (pl_mom)
  – pl_mom/swapmon.py
      Swap space monitor and slice reaper
  – pl_mom/bwmon.py
      Average daily bandwidth monitor

Database and API

• Database
  – pl_db/
      PostgreSQL schema generated from XML
• PLCAPI
  – plc_api/specification/
      XML specification of API functions
  – plc_api/PLC/
      mod_python implementation

Web Server

• PHP, Static, Generated
  – plc_www/includes/new_plc_api.php
      Auto-generated PHP binding to PLCAPI
  – plc_www/db/
      Secure portion of website
  – plc_www/generated/
      Generated include files
  – plc/scripts/
      Miscellaneous scripts

Boot Server

• Secure Software Distribution
  – Authenticated, encrypted with SSL
  – /var/www/html/boot/
      Default location for Boot Manager
  – /var/www/html/install-rpms/
      Default /etc/yum.conf location for RPM updates
  – /var/www/html/PlanetLabConf/
      Server-side component
      Mostly PHP