Date post: | 27-Dec-2015 |
Upload: | paul-summers |
Concept demo
System dashboard
Overview
Dashboard use case
General implementation ideas
Use of MULE integration platform
Collection
Aggregation/Factorization model
Plotting Service
Usage Examples
Concept demo of the “troubleshooting dashboard”
Understand the compound state of the system in the past by cross-analyzing independent (and possibly many) data sources. The goal of the analysis is to spot trends and patterns in the change of the system state in order to develop some sort of systematic response procedure.
Challenge: a variety of resources is available to supply the needed data.
No common format or place from which that data can be fetched
Otherwise, grep and awk through log files
Factor, scale, compare and display data on the fly
Time series
A time series element describes the state of the system at time T.
A series shows the evolution of the system state (or its aspects).
Time series are generated by the data source.
We want the dashboard to index as many data sources as possible without enforcing a strict content schema.
Use “dimensions” as the language to describe system aspects.
A time series element contains a list of dimensions which collectively define an aspect.
To optimize time series search queries, the dashboard will need to know the data source “schema” (the list of supported dimensions).
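The idea of dimensions and a per-source schema can be sketched as follows. This is only an illustration in Python: the names `TimeSeriesElement`, `DataSource` and `sources_for`, and the sample dimension names, are hypothetical and do not come from the original implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TimeSeriesElement:
    """State of one system aspect at one point in time (hypothetical model)."""
    timestamp: int
    value: float
    dimensions: tuple  # sorted (name, value) pairs that together define the aspect


def make_element(timestamp, value, **dimensions):
    return TimeSeriesElement(timestamp, value, tuple(sorted(dimensions.items())))


@dataclass
class DataSource:
    name: str
    schema: frozenset  # dimension names this source is able to supply
    elements: list = field(default_factory=list)


def sources_for(query_dimensions, sources):
    """Keep only the sources whose schema covers every requested dimension."""
    wanted = set(query_dimensions)
    return [source for source in sources if wanted <= source.schema]


# Assumed example data: two sources with different schemas.
gratia = DataSource("gratia", frozenset({"VO", "metric"}))
dcache = DataSource("dcache", frozenset({"pool", "metric"}))
gratia.elements.append(make_element(1000, 42.0, VO="cms", metric="Running Jobs"))

selected = [source.name for source in sources_for(["VO"], [gratia, dcache])]
```

Knowing the schema up front lets a query that asks for the `VO` dimension skip sources that cannot supply it, without forcing every source into one strict content format.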
Concept components
Brian Bockelman’s Gratia graphing and reporting tool
Excellent resource which renders accounting data and makes it available in XML.
dCache memory plots by D. Litvintsev
RRD toolkit (for visual reporting)
Common graphing package for time series data
Time series factorization tool
Developed by me. Implements rules for finding commonalities between different time series elements.
SMTP, MIME message, graphical dashboard
A way to deliver results to the user
Where does Mule fit in?
Defines a common data format to describe various system activities
We’ve used time series of numerical values
Defines the workflow of services interacting with each other in accordance with the user’s initial request
Implements aggregation of data
Integrates results into common reporting tools
Our scenario
[Diagram; recoverable labels: Data source(s), Web frontend, Log files, Gratia, dCache i.providers, Fuzzy troubleshooter, request, Client, Email, Request results, Render, WareHouse, Warehouse Service, MapReduce, [MULE] Data factorization and aggregation, Request splitter, Time series factorization, Aggregator]
General design considerations
Break up the workflow into components that must access and maintain the smallest possible, transient context
Standalone service, pass-through transformations
Understand the means of accepting, transforming and dispatching requests
Map components onto UMO models
Map acceptance, transformation and dispatch to Mule endpoints, transformers and routers.
UI: Accepting requests
Model
We need a model that describes how we want to accept user requests and dispatch them to our “troubleshooting/dashboard system”.
In Mule, a model may perform complex business-related tasks or simply be a pass-through component linking input and output transformation logic.
<mule-descriptor name="TroubleshootingSvc" implementation="org.fnal.mule.plotsvc.TroublshootingSvcWebFndImpl">
.
.
</mule-descriptor>
While both the model and Mule transformers “transform” data, the principal trait of the model is its ability to maintain and use object state.
UI: Accepting requests
Model: input endpoints
Define independent endpoints within the model
Email
<endpoint address="imap://abaranov:<pswd>@imapserver1.fnal.gov" transformers="MimeToString XMLToObject" />
Web
<endpoint address="axis:http://d0mino01.fnal.gov:65084/services" transformers="XMLToObject" />
Std in
<endpoint address="stream://System.in" transformers="XMLToObject" />
Each endpoint accepts data in a different format and uses a specific transformer chain to converge on the common type used down the chain.
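How different endpoints converge on one common type can be sketched with plain functions standing in for transformers. This is an illustrative Python analogue, not Mule code: `mime_to_string`, `xml_to_object` and `apply_chain` are hypothetical stand-ins for the MimeToString and XMLToObject transformers named above, and the sample request is made up.

```python
from email import message_from_string
import xml.etree.ElementTree as ET


def mime_to_string(raw_mime):
    """Extract the body of a simple, non-multipart MIME message."""
    return message_from_string(raw_mime).get_payload()


def xml_to_object(xml_text):
    """Turn a flat XML request into a dict (the common type in this sketch)."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def apply_chain(payload, transformers):
    """Apply the transformers left to right, like a transformers="A B" attribute."""
    for transform in transformers:
        payload = transform(payload)
    return payload


REQUEST_XML = "<UserPlotQuery><queryId>42</queryId><query>factor(Vos,$VO)</query></UserPlotQuery>"
RAW_EMAIL = "Subject: plot request\n\n" + REQUEST_XML

from_email = apply_chain(RAW_EMAIL, [mime_to_string, xml_to_object])
from_stdin = apply_chain(REQUEST_XML, [xml_to_object])
```

Both paths end in the same object, so everything downstream of the endpoints can ignore where the request came from.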
Dispatching requests
<router className="org.mule.routing.outbound.MulticastingRouter">
  <endpoint address="axis:http://d0mino01.fnal.gov:65083/services/WareHouse?method=find" transformers="UserRequestToWareHouseQuery" />
  <endpoint address="vm://BriansPlotSvc" />
  <endpoint address="vm://ResponseCollectorDispatcher"/>
  <properties>
    <property name="enableCorrelation" value="1"/>
  </properties>
</router>
Our use case is to accept the request and multiplex it to a set of data source providers for lookup.
Use correlation to assemble the results later.
Translate the user query into the context specific to each data source.
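The multicast-with-correlation pattern can be sketched in a few lines. This is a simplified Python illustration, not the behaviour of Mule's MulticastingRouter: the envelope fields (`correlation_id`, `group_size`, `sequence`) and the two endpoint translators are assumptions made for the example.

```python
import uuid


def multicast(request, endpoints):
    """Send one request to every endpoint, tagging each copy with a shared
    correlation id so the replies can be reassembled later."""
    correlation_id = str(uuid.uuid4())
    envelopes = []
    for sequence, (name, translate) in enumerate(endpoints, start=1):
        envelopes.append({
            "endpoint": name,
            "correlation_id": correlation_id,
            "group_size": len(endpoints),
            "sequence": sequence,
            # Each data source receives the query in its own context.
            "payload": translate(request),
        })
    return envelopes


# Hypothetical endpoints: (name, query translator) pairs.
endpoints = [
    ("warehouse", lambda request: {"find": request["query"]}),
    ("plot_service", lambda request: request),
]
envelopes = multicast({"query": "factor(Vos,$VO)"}, endpoints)
```

Carrying the group size with every copy tells the collecting side how many replies to wait for before it assembles the result.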
Data source(s) model
A data source model adapts a particular type of data source.
Gratia, dCache billing data, log mining, cache of previously retrieved data
Example: Gratia access through Brian B.’s web interface
<mule-descriptor name="BriansPlotSvc" implementation="org.mule.components.rest.RestServiceWrapper">
Use the Mule RestServiceWrapper to accept messages and proxy them to a REST web service.
For simplicity we don’t do any parsing or filtering here.
Data source input and output endpoints
Input
<inbound-router>
  <endpoint address="vm://BriansPlotSvc" />
</inbound-router>
Output
<outbound-router>
  <router className="org.mule.routing.outbound.OutboundPassThroughRouter">
    <endpoint address="vm://ResponseCollectorDispatcher" transformers="ToTMXML ToTypedTMXML XMLToObject"/>
  </router>
</outbound-router>
The content of Brian’s web page (which is XML) is transformed to XML representing common time series data. That data is transformed to a Java-specific serialized XML representation and then translated to a vector of internal time series objects.
Aggregation/Factorization model
The Mule pass-through model is used to implement the aggregation and factorization functional pieces.
Data is aggregated at the input of the model.
Data is factorized at the output of the model.
No internal processing is needed.
<mule-descriptor name="ResponseCollectorDispatcher" implementation="org.mule.components.simple.EchoComponent">
</mule-descriptor>
Aggregation
The time data aggregator waits and collects pieces from all data sources selected at the dispatch stage.
<inbound-router>
  <endpoint address="vm://ResponseCollectorDispatcher"/>
  <router className="org.fnal.mule.plotsvc.TimeDataAggregator"/>
</inbound-router>
TimeDataAggregator is a very simple class that defines a rule for how vectors of time series data should be joined into a new, higher-level data type which instructs the further steps of factorization and rendering.
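The wait-and-collect behaviour can be sketched as below. This is a minimal Python illustration and not the actual TimeDataAggregator: the class name `TimeDataCollector`, its `add` method and the join rule (simple concatenation) are assumptions.

```python
from collections import defaultdict


class TimeDataCollector:
    """Buffer the pieces of each correlation group until all have arrived."""

    def __init__(self):
        self._pending = defaultdict(list)

    def add(self, correlation_id, group_size, series):
        """Store one data source's reply; return the joined data once the
        group is complete, otherwise None."""
        pieces = self._pending[correlation_id]
        pieces.append(series)
        if len(pieces) < group_size:
            return None
        del self._pending[correlation_id]
        # Assumed join rule: concatenate the vectors in arrival order.
        return [element for piece in pieces for element in piece]


collector = TimeDataCollector()
first = collector.add("q1", 2, [(1000, 5.0)])
second = collector.add("q1", 2, [(1000, 7.0), (1060, 8.0)])
```

Here `first` is `None` because only one of the two expected replies has arrived, and `second` holds the three elements joined into one vector.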
Factorization
We want to produce a collection of plots from the input data array such that each plot has the same plotting instructions yet is built from a subset of the supplied data.
Identify similarities across the data array and generate a set of independent reporting instances.
Mule mapping for factorization
<outbound-router>
  <router className="org.fnal.mule.plotsvc.GroupBySplitterRouter">
    <endpoint address="vm://PlotService"/>
    <properties>
      <property name="enableCorrelation" value="1"/>
    </properties>
  </router>
</outbound-router>
GroupBySplitterRouter implements the router-splitter interface. It uses the message context to dissect the factorization instructions and splits the message into pieces relevant only to each independent report (graph).
Uses correlation to enable optional assembly of the independent reports into a top-level summary.
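A group-by split of this kind can be sketched as follows. This Python illustration is not the GroupBySplitterRouter itself: the element layout (a dict with a `dimensions` mapping), the function names and the message fields are hypothetical.

```python
from collections import defaultdict


def split_by_dimension(elements, dimension):
    """Group time series elements by the value of one dimension."""
    groups = defaultdict(list)
    for element in elements:
        key = element["dimensions"].get(dimension)
        if key is not None:
            groups[key].append(element)
    return dict(groups)


def split_messages(elements, dimension, plot_command, correlation_id):
    """Build one plot message per group; all share the plotting instructions
    and the correlation id, but each carries only its own subset of data."""
    groups = split_by_dimension(elements, dimension)
    return [
        {"correlation_id": correlation_id, "group": key,
         "plot_command": plot_command, "elements": groups[key]}
        for key in sorted(groups)
    ]


# Assumed sample data: three elements from two VOs.
elements = [
    {"timestamp": 1000, "value": 10.0, "dimensions": {"VO": "cms"}},
    {"timestamp": 1000, "value": 4.0, "dimensions": {"VO": "dzero"}},
    {"timestamp": 1060, "value": 12.0, "dimensions": {"VO": "cms"}},
]
messages = split_messages(elements, "VO", "AREA:testDimension#FF0000", "q1")
```

Each resulting message can be rendered independently, and the shared correlation id leaves room for assembling the reports into one summary afterwards.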
PlottingService
A stub for a time series renderer
May be replaced with any other renderer, for example JasperReports.
The model accepts time series messages with an opaque instruction on how that time series must be rendered. In our case: a plain RRD command.
The model outputs a UserPlotResponse that contains the initial request along with the URL of the report file.
UserPlotResponse aggregation
Each UserPlotResponse is aggregated by the component responsible for notifying the user of the results of the request.
Uses correlation as set by the factorization router-splitter
Example
Your request to the Mule stdin endpoint (for simplicity)
<org.fnal.mule.plotsvc.UserPlotQuery>
  <queryId>Random id</queryId>
  <query> factor(Vos,$VO) </query>
  <plotCommand> testDimension=Running Jobs; RRD( AREA:testDimension#FF0000 ) </plotCommand>
</org.fnal.mule.plotsvc.UserPlotQuery>
The factor instruction defines grouping by the value of the VO field of the dataset (note: it uses all of the data source’s data; there is no initial filtering, for simplicity).
The RRD command follows the syntax of the RRDtool command line.
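One way to read the plotCommand text is as a list of `name=dimension` bindings followed by an `RRD( ... )` body. The parser below is a sketch under that assumption only; the grammar is inferred from the examples in these slides, and `parse_plot_command` is a hypothetical helper, not part of the original tool.

```python
def parse_plot_command(text):
    """Split a plotCommand into its dimension bindings and RRD arguments.

    Assumed grammar: 'name=dimension; ...; RRD( arg arg ... )'.
    """
    head, marker, tail = text.partition("RRD(")
    if not marker:
        raise ValueError("plotCommand has no RRD( ... ) part")
    bindings = {}
    for part in head.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, dimension = part.partition("=")
        bindings[name.strip()] = dimension.strip()
    body = tail.strip()
    if body.endswith(")"):
        body = body[:-1]  # drop the parenthesis that closes RRD(
    return bindings, body.split()


bindings, rrd_args = parse_plot_command(
    "testDimension=Running Jobs; RRD( AREA:testDimension#FF0000 )"
)
```

With the request above, `bindings` maps `testDimension` to the `Running Jobs` dimension and `rrd_args` holds the single `AREA` instruction that is handed to the renderer.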
Example output in your mailbox
……
……
Example 2
Your request sent to the SAME Mule stdin endpoint
<org.fnal.mule.plotsvc.UserPlotQuery>
  <queryId>Random id</queryId>
  <query> factor(runningJobs,Running Jobs) </query>
  <plotCommand> testDimension=VO(dzero); testDimension1=VO(cms); RRD( LINE2:testDimension#FF0000:D0 LINE2:testDimension1#0000FF:CMS ) </plotCommand>
</org.fnal.mule.plotsvc.UserPlotQuery>
The factor instruction defines the class of time series that have the “Running Jobs” dimension (which all of them do in this example).
The RRD expression plots CMS vs. D0 (true story).
Output in your mailbox
capasityDim=Capacity; utilizationDim=Utilization; RRD( CDEF:diff=capasityDim,utilizationDim,-
AREA:capasityDim#FF0000:capasity(Mb) AREA:utilizationDim#0000FF:utilization(Mb)
AREA:diff#00FF00:difference(Mb))
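The `CDEF:diff=capasityDim,utilizationDim,-` part is written in reverse Polish notation, as RRDtool CDEF expressions are: operands come first and the operator last, so `diff` is capacity minus utilization. The small evaluator below only illustrates that reading order; it covers the four arithmetic operators, not the full RRDtool operator set, and the sample values are made up.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expression, variables):
    """Evaluate a comma-separated RPN expression for one data point."""
    stack = []
    for token in expression.split(","):
        if token in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        elif token in variables:
            stack.append(variables[token])
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression: " + expression)
    return stack[0]


# Assumed sample values in Mb for a single time step.
diff = eval_rpn(
    "capasityDim,utilizationDim,-",
    {"capasityDim": 500.0, "utilizationDim": 120.0},
)
```

For these sample values `diff` is 380.0, which is what the third `AREA` instruction would draw for that time step.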