Sponsored by the National Science Foundation
GENI Campus Ops Workflow
Chaos Golubitsky
San Juan, Puerto Rico
Mar 16 2011
www.geni.net
Outline
• Introduction
• Experimenter support
• Resources
• Monitoring
Towards a more “production-like” GENI
• Some Spiral 3 ops goals:
  – Resources are easier for experimenters to find and use
  – Provisioning an experiment doesn’t require picking up the phone (as often)
  – Resources are more reliably available
  – Problems with resources are easier to detect and resolve
• Here are some steps we think will be useful
Campus ops workflow?
• A workflow is a set of steps to achieve a goal:
  – Become a production GENI campus!
• This process will change as more campuses try it
• Proposed workflow steps we think will be useful
• Three categories:
  – Experimenter support
  – Resource deployment
  – Monitoring
• There’s more than one way to do this; input is welcome!
GPO as reference campus
• We try things out, test, and provide guidance and support to campuses deploying similar things
  – And pass along ideas for other reference campuses
• We hope to help:
  – Small testbeds with diverse resources (OpenFlow, MyPLC, ProtoGENI, L2 backbone connectivity)
  – Campuses who want to create testbeds
  – Bigger testbeds (where we can)
• We’re working on:
  – Experimenter support
  – More (and more GENI-like) resources
  – Useful monitoring
  – Templates for transitioning to GENI operations
Workflow Steps for Experimenter Support
• Subscribe to [email protected]: http://lists.geni.net/mailman/listinfo/response-team
  – Report your outages
  – Answer questions from experimenters
• Tell GPO ([email protected]) you’re willing to support some experimenters: http://groups.geni.net/geni/wiki/ProductionResources
• Create a page advertising each of your aggregates: http://groups.geni.net/geni/wiki/GeniAggregate/YourSiteAggregate
  – What resources do you have?
  – Who can use them?
  – How do they use them?
  – Resources don’t need to be fully open to the public to be advertised here
  – Template: http://groups.geni.net/geni/wiki/TemplateAggregatePage
Experimenter Support at GPO
http://groups.geni.net/geni/wiki/GeniAggregate/GpoLabProtoGeni
Workflow Steps for Adding Resources
• Connectivity
• Aggregates:
  – Give local users access to your resources
  – Run software that supports the GENI AM API
  – Give remote users access to your resources (consistent with your site policy)
• Configuration management:
  – Know what you’re running
  – Especially if it’s GENI software (things change fast)
  – Allows you to help experimenters better
  – Allows us (and other campuses) to help you better
Resources at GPO
• GPO can provide templates and help for aggregates we have experience with
• Things we have:
  – Connections to NLR and I2 backbones
  – OpenFlow switches (HP/NEC/Quanta), FlowVisors, controllers, GENI AM API support
  – Reference installation of WiMAX software
  – ProtoGENI cluster
• A simple resource you can deploy:
  – MyPLC plus SFA to support the GENI AM API: http://groups.geni.net/geni/wiki/GpoLab/MyplcReferenceImplementation
Workflow Steps for Monitoring (1)
• Two consumers of monitoring data:
  – Operators and experimenters
• Operators:
  – Goals:
    • Detect and resolve outages quickly
    • Plan for the future
  – Monitoring steps:
    • Polling and trending of local resources
    • Alerting on local resource outages
    • Visibility into status of connected remote resources
    • Visibility into many remote resources in a consistent format
Workflow Steps for Monitoring (2)
• Experimenters:
  – Goals:
    • Identify problems affecting the slice
    • Collect measurement data for their slice
  – Monitoring steps:
    • Status of available resources (how many nodes?)
    • Status of resources I’m using (is my node up?)
    • External characteristics of slice (CPU usage? Network bandwidth?)
    • Internal characteristics of slice (I&M working session Thursday)
Monitoring at GPO
• Strategy:
  – Collect as much data as possible from our site now: http://monitor.gpolab.bbn.com
  – Integrate our data with collectors (GMOC, aggregates)
• Tactics:
  – Trending is more important than alerting:
    • Remote operators and experimenters are casual consumers
    • Don’t want alerts for resources which may not be relevant
    • Do want historical availability information on request
  – Collect numeric trending data in a consistent format:
    • Using ganglia to collect data in rrdtool format for now
  – Generate webpages that format ganglia’s data more meaningfully
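The "historical availability information on request" tactic can be illustrated with a small sketch. This is not GPO's code: the sample format (timestamp/value pairs, with None standing in for a missing reading) and the function name `availability` are assumptions made for illustration. In practice the samples would be read out of the RRD files that ganglia maintains.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical sample format: (unix timestamp, value), where value is
# None when no data was recorded for that interval.
Sample = Tuple[int, Optional[float]]


def availability(samples: Sequence[Sample], start: int, end: int) -> Optional[float]:
    """Return the fraction of samples in [start, end] that show the resource up.

    A sample counts as "up" when it has a value greater than zero.
    Missing readings (None) count as "not up", since a resource that
    stopped reporting is not known to be available.
    Returns None when the window contains no samples at all.
    """
    window = [value for (ts, value) in samples if start <= ts <= end]
    if not window:
        return None
    up = sum(1 for value in window if value is not None and value > 0)
    return up / len(window)
```

A casual consumer can then ask for one number per resource and time window instead of receiving alerts for resources that may not be relevant to them.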
Monitoring at GPO: Ganglia’s native UI
Monitoring at GPO: Collecting GENI Data
• Active testing:
  – Use simple scripts to run tests and report results to ganglia
  – Test recent values for freshness and sanity
  – GPO uses this to monitor reachability across the NLR and Internet2 OpenFlow backbone
• Collecting external slice data:
  – Run locally on aggregate manager
  – Query aggregate data: slice names, node counts
  – Query operational data: packet counters, node state, CPU usage
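The active-testing steps above might look roughly like the following sketch. It is an illustration under stated assumptions, not the scripts GPO runs: the function names, the metric name, the 300-second freshness limit, and the 0-to-1 sanity range are all hypothetical. It assumes a Linux-style `ping` (flags `-c` and `-W`) and ganglia's `gmetric` command-line tool on the PATH.

```python
import subprocess
from typing import List, Optional


def ping_ok(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo with the system ping (Linux flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def gmetric_command(name: str, value: float, units: str = "up") -> List[str]:
    """Build the gmetric invocation that publishes one float metric."""
    return [
        "gmetric",
        f"--name={name}",
        f"--value={value}",
        "--type=float",
        f"--units={units}",
    ]


def report_reachability(host: str, metric_name: str) -> None:
    """Run the test and report 1.0 (reachable) or 0.0 (unreachable) to ganglia."""
    value = 1.0 if ping_ok(host) else 0.0
    subprocess.run(gmetric_command(metric_name, value), check=True)


def is_fresh_and_sane(ts: int, value: Optional[float], now: int,
                      max_age_s: int = 300,
                      lo: float = 0.0, hi: float = 1.0) -> bool:
    """Check that a stored reading is recent enough and within the expected range."""
    if value is None:
        return False
    age = now - ts
    return 0 <= age <= max_age_s and lo <= value <= hi
```

A cron job could call something like `report_reachability("198.51.100.10", "backbone_reachable")` every few minutes (the address and metric name are placeholders), and a separate check could flag any metric for which `is_fresh_and_sane` returns False.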
Monitoring at GPO: Status of core VLANs
Monitoring at GPO: FlowVisor slice status
Summary
• Spiral 3 ops goals:
  – Test operations across several unaffiliated campuses
  – Ramp up GENI-wide experiment support
• GPO is trying to be an example campus, but there are many others
• If you do only two things, please:
  – Join [email protected]
  – Make sure we ([email protected]) know what you would like to support this year, and what we can do to help