Cover your PaaS with the Cloud Foundry dashboard for operational metrics

Post on 08-May-2015


description

This talk introduces the open source Admin UI that IBM announced at the PlatformCF conference in Santa Clara, CA, in September 2013 (http://www.slideshare.net/platformcf/the-ibm-dashboard-for-operational-metrics). The project is available in the Cloud Foundry Incubator repository on GitHub (https://github.com/cloudfoundry-incubator/admin-ui).

About the speaker: Daniel Krook is a New York area Senior Certified IT Specialist (IBM L3 Thought Leader/The Open Group L3 Distinguished) and Master Inventor. He currently delivers IBM BlueMix innovation based on the OpenStack and Cloud Foundry open source projects.

transcript


Cover your PaaS with the Cloud Foundry dashboard for operational metrics

Daniel Krook

Senior Certified IT Specialist, IBM

@danielkrook - krook.info

Your presenter

▪ Built the DevOps infrastructure to deploy and manage the first large scale Cloud Foundry clusters on OpenStack inside of IBM

▪ Helps customers understand the value of the Platform-as-a-Service cloud delivery model and adopt systems of engagement

▪ Enjoys meeting and sharing knowledge with the community that builds open cloud architectures (and founded NYC CF)

IBM runs Cloud Foundry on hundreds of SoftLayer VMs

BlueMix

In the past year, we’ve learned how to

• Manage hundreds of DEAs, service nodes, fabric nodes in the beta
• Several other development and staging environments before that
• Deployed first with Chef, then with BOSH over 18 months
• All environments have benefited from the Admin UI

• Keep Cloud Foundry running smoothly
• Discover and prevent impending problems
• Resolve unexpected issues quickly

1. Show the type and volume of data and why we want to monitor it

2. Show how we monitor that data with the Admin UI (the dashboard for operational metrics)

3. Show you a demo of the Admin UI and how to install it either standalone or via BOSH

4. Show you how to get involved with the GitHub incubator project and improve it

Goals for this talk

We are looking to get better at this, and help the community get better as well.

What’s the important data and how do we find it?

What metrics matter?

Data that can be tracked over time to see trends and behaviors

Data that can help us predict problems before they happen

DEAs and apps health

▪ Memory reserved as a proportion of the memory available

General health of all components

▪ Health of the virtual machines
▪ Status of the processes running on them

Database nodes and services

▪ Number of provisioned services against capacity available
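The two ratios named above (memory reserved against memory available, and provisioned services against capacity) can be sketched in a few lines. This is only an illustration: the `Dea` record, its field names, and the 0.8 threshold are assumptions made for the example, not values taken from the Admin UI or from Cloud Foundry.

```python
from dataclasses import dataclass


@dataclass
class Dea:
    """Hypothetical snapshot of one DEA; the field names are illustrative."""
    name: str
    reserved_mb: int   # memory already reserved by app instances
    available_mb: int  # total memory the DEA can hand out


def reserved_ratio(dea: Dea) -> float:
    """Memory reserved as a proportion of the memory available."""
    if dea.available_mb <= 0:
        return 1.0  # treat a DEA with no memory to offer as full
    return dea.reserved_mb / dea.available_mb


def service_usage(provisioned: int, capacity: int) -> float:
    """Provisioned service instances as a proportion of node capacity."""
    if capacity <= 0:
        return 1.0  # treat a node with no capacity as full
    return provisioned / capacity


def nearing_capacity(deas, threshold=0.8):
    """Names of the DEAs whose reserved ratio meets the assumed threshold."""
    return [dea.name for dea in deas if reserved_ratio(dea) >= threshold]


if __name__ == "__main__":
    fleet = [Dea("dea-0", 4096, 8192), Dea("dea-1", 7168, 8192)]
    print(nearing_capacity(fleet))
    print(service_usage(45, 50))
```

Tracking these ratios over time, rather than as single readings, is what lets an operator see a trend before a DEA or service node actually fills up.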

At the PaaS layer, that means:

▪ Deliver continuous availability in the cloud

▪ Proactively solve problems rather than react to them

▪ Understand the behavior of the system to automate it

Why do we need this data?

▪ NATS message bus
  • Discover the components to interrogate
  • Query their varz endpoints

Where can we find it?

▪ Cloud Controller REST API
▪ UAA REST API
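The discover-then-query flow for varz data can be sketched as follows. The sketch assumes that a component's discovery announcement is a JSON document with a `host` field ("ip:port") and a `credentials` pair, and that the varz endpoint is served at `/varz` behind HTTP basic authentication. Those details, and the function names, are assumptions for illustration rather than something stated in the talk. Subscribing to NATS itself is left out.

```python
import base64
import json
import urllib.request


def varz_request(announcement: str) -> urllib.request.Request:
    """Build an authenticated varz request from one discovery announcement.

    Assumes the announcement carries "host" and "credentials" fields.
    """
    info = json.loads(announcement)
    user, password = info["credentials"]
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(f"http://{info['host']}/varz")
    request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_varz(announcement: str, timeout: float = 5.0) -> dict:
    """Query the component and return its varz data as a dictionary."""
    request = varz_request(announcement)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Made-up announcement; a real one would arrive over the message bus.
    sample = json.dumps({
        "type": "DEA",
        "host": "10.0.0.5:34567",
        "credentials": ["user", "secret"],
    })
    print(varz_request(sample).full_url)
```

Keeping request construction separate from the network call makes the discovery parsing easy to check without a running Cloud Foundry environment.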


Enter the Admin UI

1. Views of component health

2. Resource usage details

3. Ongoing growth trends

4. Access to logs and raw varz

5. Email notifications

The Admin UI provides…

▪ Components nearing capacity or failure
▪ Already failed components
▪ Out of control apps and noisy users

▪ Active/inactive users and apps
▪ Growth trends and runtime/service adoption

It helps us find (and fix) problems

It helps us see patterns

Link to the spaces, users, and apps tabs above, with search filter enabled

Organizations

Clicking the icon will dump raw JSON data

Link to the orgs, users, and apps tabs above, with search filter enabled

Spaces

Clicking the icon will dump raw JSON data

Apps

Link to the spaces, orgs, and DEAs tabs above, with search filter

App URL is linked and the bound services are listed

Clicking the icon will dump raw JSON data

Link to the spaces and orgs tabs above, with search filter enabled

Users

Clicking the icon will dump raw JSON data

Link to the apps tabs above, with search filter enabled

DEAs

Clicking the varz link will dump raw JSON data

Cloud controllers

Clicking the varz link will dump raw JSON data

Health manager

Clicking the varz link will dump raw JSON data

Service gateways

Individual service node capacities are listed

Clicking the varz link will dump raw JSON data

Routers

Clicking the varz link will dump raw JSON data

Components

Clicking the varz link will dump raw JSON data

Logs

This only shows logs on the VM where the app is installed

Logs are vertically and horizontally scrollable

Nightly stats

This summary page is publicly viewable

Running the Admin UI

Run the Admin UI as a standalone service

$ git clone https://github.com/cloudfoundry-incubator/admin-ui.git
$ cd admin-ui
$ ruby bin/admin

Then open http://localhost:8070

Run the Admin UI as a BOSH job

- name: admin_ui
  release: admin-ui
  template:
  - admin_ui_v2
  instances: 1
  resource_pool: logger
  persistent_disk: 10240
  networks:
  - name: default
    default: [dns, gateway]

Update config/default.yml (works as is for BOSH-lite)

Thanks!

Daniel Krook

Senior Certified IT Specialist, IBM

@danielkrook - krook.info

Older screenshots of the Admin UI

Latest screenshots are on GitHub: https://github.com/cloudfoundry-incubator/admin-ui

User and app trends

There is also one unauthenticated page for high level stats

DEA list

DEA details

Service node list

Service node details

User list

User details

App list

App details

Log list

Log details

Email notifications

ibm.com/cloud