AstroGrid-D Monitoring

25.02.2008 AstroGrid-D Monitoring AstroGrid-D Meeting @ AIP 25.-26.2.2008 Frank Breitling Stephan Braune


Contents

Host Monitoring (for compute resources)
    Status
    Goals until the end of the project
    Perspectives beyond the project

Robotic Telescope Monitoring
    Status
    Goals until the end of the project
    Perspectives beyond the project


Host Monitoring Status

The AGD monitoring solution has been available since Dec. 2007.

It builds on:
    Audit Logging, provided by Globus Toolkit V4.0.5 and later
    PostgreSQL database (DB)
    DB triggers
    Usage Records (UR) in XML format (http://staff.psc.edu/lfm/PSC/Grid/UR-WG/)
    XML2RDF XSLT
    Stellaris
    SPARQL queries

A test setup has been running at the AIP since Dec. 2007.

AGD Monitoring Architecture

[Figure: architecture diagram. Recoverable labels: Globus grid resource, Database, SPARQL Queries, Timelines.]

Earlier: status information via EPR files and monitoring.pl

Activation of Audit Logging in Globus for WS GRAM (globusrun-ws)

Changes in the Globus Toolkit configuration:

In $GLOBUS_LOCATION/container-log4j.properties:

    ...
    # GRAM AUDIT
    log4j.category.org.globus.exec.service.exec.StateMachine.audit=DEBUG, AUDIT
    log4j.appender.AUDIT=org.globus.exec.utils.audit.AuditDatabaseAppender
    log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
    log4j.additivity.org.globus.exec.service.exec.StateMachine.audit=false

Output goes to a database (PostgreSQL or MySQL). The database connection has to be declared in $GLOBUS_LOCATION/etc/gram-service/jndi-config.xml:

    <resource ...>
      <resourceParams>
        ...
        <parameter>
          <name>url</name>
          <value>jdbc:mysql://<host>[:port]/auditDatabase</value>
        </parameter>
        <parameter>
          <name>user</name>
          <value>globus</value>
        </parameter>
        <parameter>
          <name>password</name>
          <value>foo</value>
        </parameter>
        ...
      </resourceParams>
    </resource>

The table is updated whenever a job is started or changes its status (in contrast to SAGAS).

The database content is converted into Usage Record format and sent to Stellaris via DB triggers.
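Whether the four log4j entries listed for container-log4j.properties are actually present can be checked with a small script. The following is a minimal Python sketch, not part of Globus or the AGD software; the function names `parse_properties` and `missing_audit_settings` are hypothetical, and only simple `key=value` lines are handled.

```python
# Required audit entries, copied from the WS GRAM configuration in this talk.
REQUIRED = {
    "log4j.category.org.globus.exec.service.exec.StateMachine.audit": "DEBUG, AUDIT",
    "log4j.appender.AUDIT": "org.globus.exec.utils.audit.AuditDatabaseAppender",
    "log4j.appender.AUDIT.layout": "org.apache.log4j.PatternLayout",
    "log4j.additivity.org.globus.exec.service.exec.StateMachine.audit": "false",
}


def parse_properties(text):
    """Parse simple key=value lines; comments (# or !) and blanks are skipped."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_audit_settings(text):
    """Return the required keys that are absent or carry a different value."""
    props = parse_properties(text)
    return sorted(k for k, v in REQUIRED.items() if props.get(k) != v)
```

Calling `missing_audit_settings` on the contents of the properties file returns an empty list when all four entries are in place.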


Activation of Audit Logging in Globus for Pre WS GRAM (globus-job-run)

Changes in the Globus Toolkit configuration:

In $GLOBUS_LOCATION/log4j.properties:

    ...
    # GRAM AUDIT
    log4j.category.org.globus.exec.service.exec.StateMachine.audit=DEBUG, AUDIT
    log4j.appender.AUDIT=org.globus.exec.utils.audit.AuditDatabaseAppender
    log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
    log4j.additivity.org.globus.exec.service.exec.StateMachine.audit=false

Text file output has to be configured in $GLOBUS_LOCATION/etc/globus-job-manager.conf:

    -audit-directory /tmp/globus

The file is converted into Usage Record format and sent to Stellaris via a cron job.


Audit Fields in PostgreSQL DB


DB Trigger

    CREATE FUNCTION update_stellaris() RETURNS "trigger" AS $update_stellaris$
    use strict;
    use URI;
    use Net::hostent;
    use XML::Writer;
    use HTTP::Request;
    use LWP::UserAgent;

    # Derive the record ID (hex-encoded URI query) and the canonical
    # host name from the grid ID of the new row.
    my $job_grid_id = URI->new($_TD->{new}{job_grid_id});
    my $id = unpack("H*", $job_grid_id->query());
    my $host = gethost($job_grid_id->host())->name();

    # Build the Usage Record XML document from the new row.
    my $usage_record = "";
    my $writer = XML::Writer->new(OUTPUT => \$usage_record, NEWLINES => 1, UNSAFE => 1);
    $writer->xmlDecl("UTF-8");
    $writer->startTag("JobUsageRecord", "xmlns" => "http://www.gridforum.org/2003/ur-wg#", ...);
      $writer->startTag("RecordIdentity");
        $writer->dataElement("LocalJobId", $_TD->{new}{local_job_id});
      $writer->endTag("RecordIdentity");
      .....
      $writer->raw($_TD->{new}{job_description});
      $writer->dataElement("success_flag", $_TD->{new}{success_flag});
      $writer->dataElement("finished_flag", $_TD->{new}{finished_flag});
    $writer->endTag("JobUsageRecord");
    $writer->end();

    # Upload the record to Stellaris with an HTTP PUT.
    my $req = HTTP::Request->new("PUT",
        "http://stellaris.astrogrid-d.org/files/hosts/".$host."/urs/".$id,
        HTTP::Headers->new(Content_Length => length($usage_record)),
        $usage_record);
    my $ua = LWP::UserAgent->new();
    my $res = $ua->request($req);
    .....
    return;
    $update_stellaris$ LANGUAGE plperlu;

    CREATE TRIGGER update_stellaris_trig
        BEFORE INSERT OR UPDATE ON gram_audit_table
        FOR EACH ROW EXECUTE PROCEDURE update_stellaris();

The triggers are installed in the PostgreSQL DB using:

    audit=# \i trigger.sql

Documentation is available on the AGD intranet: http://mintaka.aip.de:8080/lenya/intranet/live/workpackages/wg2/GRAM_audit_logging.pdf
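The trigger performs three steps: it derives a host name and record ID from job_grid_id, builds a Usage Record, and uploads it with an HTTP PUT. For testing outside the database, the same steps can be sketched in Python. This is only an illustration under assumptions: the function names, the `resolve` parameter and the use of `socket.getfqdn` in place of Perl's `gethost` are not from the AGD software, and only the fields visible in the trigger are written (the elided parts are left out).

```python
import socket
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

UR_NS = "http://www.gridforum.org/2003/ur-wg#"
STELLARIS = "http://stellaris.astrogrid-d.org/files/hosts/"


def record_target(job_grid_id, resolve=socket.getfqdn):
    """Return (host, id): resolved host name and hex-encoded URI query."""
    parts = urlsplit(job_grid_id)
    host = resolve(parts.hostname)
    record_id = parts.query.encode("utf-8").hex()  # like Perl's unpack("H*", ...)
    return host, record_id


def usage_record(row):
    """Build a minimal JobUsageRecord from the fields shown in the trigger."""
    ET.register_namespace("", UR_NS)
    root = ET.Element(f"{{{UR_NS}}}JobUsageRecord")
    identity = ET.SubElement(root, f"{{{UR_NS}}}RecordIdentity")
    ET.SubElement(identity, f"{{{UR_NS}}}LocalJobId").text = str(row["local_job_id"])
    for field in ("success_flag", "finished_flag"):
        ET.SubElement(root, f"{{{UR_NS}}}{field}").text = str(row[field])
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def put_request(row, resolve=socket.getfqdn):
    """Prepare (but do not send) the HTTP PUT that uploads the record."""
    host, record_id = record_target(row["job_grid_id"], resolve)
    url = f"{STELLARIS}{host}/urs/{record_id}"
    return urllib.request.Request(url, data=usage_record(row), method="PUT")
```

The prepared request would be sent with `urllib.request.urlopen(put_request(row))`. Keeping the network call separate makes the record construction testable without a running Stellaris instance.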


SPARQL Queries for Usage Statistics

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX ur: <http://www.gridforum.org/2003/ur-wg#>
    PREFIX x2r: <http://www.astrogrid-d.org/2007/08/14-xml2rdf#>

    SELECT ?job_grid_id ?GlobalUserName ?SubmitHost ?executable
           ?creation_time ?StartTime ?EndTime ?wdv ?Count ?CPU_Time
    WHERE { graph ?g {
        ?n1 ur:JobIdentity ?JobIdentity .
        ?JobIdentity ur:job_grid_id ?job_grid_id .
        ?n1 ur:UserIdentity ?UserIdentity .
        ?UserIdentity ur:GlobalUserName ?GlobalUserName .
        ?n1 ur:creation_time ?creation_time .
        ?n1 ur:SubmitHost ?SubmitHost .
        OPTIONAL { ?n1 ur:StartTime ?StartTime .
                   ?n1 ur:EndTime ?EndTime . }
        OPTIONAL { ?n1 ur:WallDuration ?wall_duration .
                   ?wall_duration x2r:value ?wdv . }
        OPTIONAL { ?n1 ur:Resource ?res .
                   ?res x2r:value ?executable . }
        OPTIONAL { ?n1 ur:Count ?Count . }
        OPTIONAL { ?n1 ur:CPU_Time ?CPU_Time . }
    }} ORDER BY DESC(?creation_time) LIMIT 25
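The result of such a query still has to be turned into rows before it can feed a timeline or a table. The sketch below assumes that the store returns the W3C SPARQL Query Results XML Format; whether Stellaris does so is an assumption here, and the function name `parse_results` is hypothetical.

```python
import xml.etree.ElementTree as ET

SR = "{http://www.w3.org/2005/sparql-results#}"


def parse_results(xml_text):
    """Turn a SPARQL XML result document into a list of {variable: value} dicts."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.iter(f"{SR}result"):
        row = {}
        for binding in result.findall(f"{SR}binding"):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = value.text or ""
        rows.append(row)
    return rows
```

Variables from OPTIONAL blocks that are unbound for a job (for example ?StartTime of a job that has not started) are simply missing from that row's dictionary.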

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX ur: <http://www.gridforum.org/2003/ur-wg#>
    PREFIX x2r: <http://www.astrogrid-d.org/2007/08/14-xml2rdf#>

    SELECT distinct ?GlobalUserName ?executable ?SubmitHost sum(?CPU_Time)
    WHERE { graph ?g {
        ?n1 ur:JobIdentity ?JobIdentity .
        ?JobIdentity ur:job_grid_id ?job_grid_id .
        ?n1 ur:UserIdentity ?UserIdentity .
        ?UserIdentity ur:GlobalUserId ?GlobalUserName .
        ?n1 ur:SubmitHost ?SubmitHost .
        ?n1 ur:CPU_Time ?CPU_Time .
        ?n1 ur:Resource ?res .
        ?res x2r:value ?executable .
    }} ORDER BY ?GlobalUserName
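The second query sums CPU time per user, executable and submit host. If the store does not evaluate `sum()` in the SELECT clause, the same statistic can be computed on the client from the plain rows. A minimal sketch, assuming each row is a dictionary keyed by the query's variable names and that CPU_Time is a number of seconds (the slides do not state the unit):

```python
from collections import defaultdict


def cpu_time_per_user(rows):
    """Sum CPU_Time over rows sharing user, executable and submit host."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["GlobalUserName"], row["executable"], row["SubmitHost"])
        totals[key] += float(row["CPU_Time"])
    # Sort by user name (then executable and host), as the query's
    # ORDER BY clause intends.
    return sorted(totals.items())
```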


Retrieving Usage Statistics via Stellaris


Goals until the end of the project

Integrate monitoring info in Timeline and Resource Map
Provide more SPARQL query templates (see svn://svn.gac-grid.org/software/monitoring/host/)
Provide improved documentation and installation instructions
Include all AGD institutes and resources in monitoring
Move from test to production mode, i.e. solve the remaining problems


Solve unstable DB connection

Audit Logging establishes a DB connection only once, i.e. the first time a job is submitted to Globus.
If the DB goes down, the connection is lost and no further data is received; a restart of the Globus container is then necessary.
Solution: we have informed the GT developers via
    mailing lists: gt-user & gram-user
    bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5863
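The general remedy for a connection that is opened only once is to drop it when an operation fails and to reconnect before retrying. The following Python sketch shows that pattern only; it is not the Globus appender's implementation, the class name `ReconnectingWriter` is hypothetical, and `connect` stands for any factory that returns a DB-API style connection.

```python
class ReconnectingWriter:
    """Write records through a DB connection that is reopened after failures."""

    def __init__(self, connect, retries=1):
        self._connect = connect      # callable returning a fresh connection
        self._retries = retries      # number of reconnect attempts per write
        self._conn = None

    def write(self, sql, params=()):
        for attempt in range(self._retries + 1):
            try:
                if self._conn is None:
                    self._conn = self._connect()   # lazy (re)connect
                cur = self._conn.cursor()
                cur.execute(sql, params)
                self._conn.commit()
                return
            except Exception:
                self._conn = None                  # drop the broken connection
                if attempt == self._retries:
                    raise
```

With this pattern a database restart costs at most one failed attempt per write instead of requiring a restart of the container.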


Add missing fields in audit logging

Some important information is not provided by audit logging:
    global job id (UUID format)
    resource usage information as reported by the UNIX time command, i.e.:
        (i) the elapsed real time
        (ii) the user CPU time
        (iii) the system CPU time
    end time of the job, in the same format as creation_time
    name of submission client
    name of execution host (and maybe also the number of used CPUs)

Solution: we have informed the GT developers via
    mailing lists: gt-user & gram-user
    bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5864
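To make the requested fields concrete, the sketch below shows how a job wrapper could record the three values reported by the UNIX time command together with a UUID-format job id. It is an illustration in Python for Unix systems, not a proposal for the Globus implementation; the function name `run_with_usage` and the dictionary keys are hypothetical.

```python
import os
import subprocess
import time
import uuid


def run_with_usage(argv):
    """Run a command and report real, user and system time in seconds."""
    before = os.times()
    start = time.monotonic()
    returncode = subprocess.call(argv)
    real = time.monotonic() - start
    after = os.times()
    return {
        "global_job_id": str(uuid.uuid4()),   # hypothetical UUID-format job id
        "returncode": returncode,
        "real": real,                                            # elapsed real time
        "user": after.children_user - before.children_user,      # user CPU time
        "sys": after.children_system - before.children_system,   # system CPU time
    }
```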


Add Usage Record (UR) format

Audit logging is not compatible with the UR format, the OGF standard for monitoring information.
Currently we construct URs via database triggers.
Solution: we have informed the GT developers via
    mailing lists: gt-user & gram-user
    bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5865


Simplify installation procedure

Currently PostgreSQL has to be recompiled with Perl support
DB triggers have to be installed
Globus configuration is necessary
Solution: we want to optimize the installation process, maybe with a Globus helper package


Upgrade to Stellaris V 0.2.0

Currently a few problems also exist with Stellaris V 0.2.0.
We continue testing and report every problem to Mikael Högqvist.


Perspectives beyond the project

Define a common policy on data privacy, since AGD resources are shared with other grid communities (e.g. LRZ), which might have different restrictions on the logging of user information

Suggest AGD monitoring solution to other grid communities


Robotic Telescopes Status

The new vision of the RT project is reflected by its new name: OpenTel
Corresponding project page: http://www.gac-grid.org/project-products/RoboticTelescopes.html
OpenTel is an open network for robotic telescopes. Open means:
    open standards
    open source
    open for telescopes to join
OpenTel is for professional and amateur astronomers
OpenTel is currently the only open network and therefore a unique and promising approach in robotic astronomy


Project History

Progress so far
    D2.4 Static metadata: FB, done (15.5.2007)
    D2.7 Dynamic metadata / Monitoring: FB, 66% complete, publication expected in March
    D5.3 First Integration of RTs: FB, done (31.7.2007)

Goals until end of project
    D5.5 Resource Broker: TR, work in progress. FB will help.
    D5.8 Scheduler: FB, TR, Thomas G., to be done


Monitoring / Dynamic Metadata

Monitoring a network of robotic telescopes - Deliverable 2.7: STELLA-I & II as info providers for Stellaris

Same database triggers as for host monitoring
RDF Calendar format is used for scheduling info (understood by RDF tools)
Trigger templates can be easily adjusted for other telescopes
Software is collected in a package called “ottools”
Timeline showing observation schedule directly from the STELLA DB (http://photon.aip.de:25000/timeline/telescopes.html)
Timeplot showing weather information (tbd)
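As an illustration of scheduling info in RDF Calendar form, the sketch below writes one observation as N-Triples using the W3C RDF Calendar vocabulary (namespace http://www.w3.org/2002/12/cal/ical#). The choice of properties, the N-Triples serialization and the function name `observation_triples` are assumptions made for this example; the layout actually produced by the STELLA triggers is not shown in the slides.

```python
from datetime import datetime, timezone

ICAL = "http://www.w3.org/2002/12/cal/ical#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def observation_triples(event_uri, summary, start, end):
    """Return N-Triples lines describing one observation as an ical Vevent."""
    def stamp(t):
        # Normalize to UTC and format as an xsd:dateTime typed literal.
        text = t.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        return f'"{text}"^^<{XSD_DATETIME}>'

    # Escape backslashes and double quotes for the N-Triples string literal.
    escaped = summary.replace("\\", "\\\\").replace('"', '\\"')
    return [
        f"<{event_uri}> <{RDF_TYPE}> <{ICAL}Vevent> .",
        f'<{event_uri}> <{ICAL}summary> "{escaped}" .',
        f"<{event_uri}> <{ICAL}dtstart> {stamp(start)} .",
        f"<{event_uri}> <{ICAL}dtend> {stamp(end)} .",
    ]
```

A call such as `observation_triples("http://example.org/obs/1", "Target X", start, end)` with two timezone-aware `datetime` values yields four triples that generic RDF tools can load.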


Goals until the end of the project

Provide a general solution for the integration of other telescopes. This requires:
    Metadata management based on user certificates
    Software package with tools and templates (ottools): svn://svn.gac-grid.org/software/OpenTel/ottools
    Comprehensive documentation
    Improved user interfaces:
        Timeline & Timeplot with menu for selection of telescopes, time windows, etc.
        Timeplot displaying new metadata of time series (temperature, seeing, etc.)
        Resource map displaying dynamic metadata
    Resource Broker (D5.5)
    Scheduler (D5.8)
Integrate STELLA-I & STELLA-II
First observation via the grid


Perspectives beyond the project

Improve software, in particular the scheduler
Perform more grid observations, more testing
Perform first network observations
Integrate more telescopes, in particular from hobby astronomers. Software contributions would be welcome
Collaboration with other networks such as the LCOGT
Attract and collaborate with the amateur astronomy and open source community
Find an OpenTel logo
