Pablo Project

Date post: 06-Jan-2016
Upload: vinson
Transcript
Page 1: Pablo Project

Pablo Project

• http://www-pablo.cs.uiuc.edu/Projects/Pablo/

• Goal: portable performance data environment for parallel systems

• Pablo Version 5.0 components

– SDDF Library

– TraceLibrary

– I/O Analysis programs

– Analysis GUI

– SvPablo

Page 2: Pablo Project

Self Defining Data Format (SDDF)

• Performance data description language that specifies both data record structures and data record instances

• Supports definition of records containing scalars and arrays of the base types found in most programming languages

• Developed to link Pablo instrumentation software to Pablo analysis environment

Page 3: Pablo Project

SDDF (cont.)

• Goals - compactness, portability, generality, extensibility

• ASCII and binary formats (binary contains flag indicating byte ordering)

• SDDF interface library -- library of C++ classes for writing and interpreting files in SDDF format

• FileStats utility -- shows types of records and range of values appearing in SDDF file
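
The byte-ordering flag mentioned above is what lets a binary trace written on one machine be read on another. A minimal sketch of the idea in C follows. The flag values and function names are hypothetical, and the real SDDF header layout is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values; the real SDDF binary header is not shown here. */
enum byte_order { ORDER_LITTLE = 0, ORDER_BIG = 1 };

/* Report the byte order of the machine running this code. */
static enum byte_order host_order(void)
{
    const uint16_t probe = 1;
    return (*(const unsigned char *)&probe == 1) ? ORDER_LITTLE : ORDER_BIG;
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Decode a 32-bit field read from a file whose header flag is file_order:
 * swap only when the file and the host disagree. */
static uint32_t decode_u32(uint32_t raw, enum byte_order file_order)
{
    assert(file_order == ORDER_LITTLE || file_order == ORDER_BIG);
    return (file_order == host_order()) ? raw : swap32(raw);
}
```

A reader would check the flag once per file and then pass every multi-byte field through a routine like decode_u32.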

Page 4: Pablo Project

SDDF Example

    // "description" "IO Seek"
    "Seek" {
        // "Time" "Timestamp"
        int "Timestamp"[];
        // "Seconds" "Floating Point Timestamp"
        double "Seconds";
        // "Event ID" "Corresponding event"
        // "700013" "lseek"
        // "700015" "fseek"
        int "Event Identifier";
        // "Node" "Processor number"
        int "Processor Number";
        // "Duration" "Event duration in seconds"
        double "Duration";
        // "File ID" "Unique file identifier"
        int "File ID";
        // "Number Bytes" "Number of bytes traversed"
        int "Number Bytes";
        // "Offset" "Byte offset from position indicated by Whence"
        int "Offset";
        // "Whence" "Indicates file position that Offset is measured from"
        // "0" "SEEK_SET"
        // "1" "SEEK_CUR"
        // "2" "SEEK_END"
        int "Whence";
    };;

Page 5: Pablo Project

SDDF Example (cont.)

    "Seek" {
        [2] { 201803857, 0 },
        20.1803857, 70013, 0, 0.0031946, 3, 0, 0, 0
    };;
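
To make the mapping between a record instance and program data concrete, here is a small C sketch that holds the nine values of the "Seek" instance above in a struct and formats them in the same ASCII layout. The struct and function names are illustrative assumptions, as are the fixed two-element timestamp array and the name file_id for the sixth value. None of this is the SDDF library's own C++ interface.

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative C mirror of one "Seek" record instance (names are assumed). */
struct seek_record {
    int    timestamp[2];
    double seconds;
    int    event_id;
    int    processor;
    double duration;
    int    file_id;
    int    number_bytes;
    int    offset;
    int    whence;
};

/* Format one record on a single line in the ASCII layout of the slide.
 * Returns the number of characters written, or -1 if buf is too small. */
static int format_seek(const struct seek_record *r, char *buf, size_t len)
{
    int n;
    assert(r != NULL && buf != NULL);
    n = snprintf(buf, len,
        "\"Seek\" { [2] { %d, %d }, %.7f, %d, %d, %.7f, %d, %d, %d, %d };;",
        r->timestamp[0], r->timestamp[1], r->seconds, r->event_id,
        r->processor, r->duration, r->file_id, r->number_bytes,
        r->offset, r->whence);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```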

Page 6: Pablo Project
Page 7: Pablo Project

Pablo TraceLibrary

• Basic trace library with extensions for procedure tracing, loop tracing, NX message passing tracing, I/O tracing, MPI tracing

• Basic trace library

– functions traceEvent, countEvent, startTimeEvent, endTimeEvent

– event ID specifies type of event that is being traced
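
The difference between tracing, counting, and interval timing of an event can be sketched with a toy in-memory recorder. Everything below (the toy_ names, the fixed-size tables, the caller-supplied clock value) is a hypothetical illustration. The actual signatures of traceEvent, countEvent, startTimeEvent, and endTimeEvent are not given in these slides and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_EVENTS 64   /* capacity of the in-memory trace buffer */
#define TOY_MAX_IDS    16   /* number of distinct event IDs tracked */

struct toy_event { int id; double time; };

struct toy_trace {
    struct toy_event buf[TOY_MAX_EVENTS];
    size_t used;
    long   counts[TOY_MAX_IDS];
    double start[TOY_MAX_IDS];
    double elapsed[TOY_MAX_IDS];
};

/* Full trace: append one timestamped record; -1 if the buffer is full. */
static int toy_trace_event(struct toy_trace *t, int id, double now)
{
    assert(id >= 0 && id < TOY_MAX_IDS);
    if (t->used == TOY_MAX_EVENTS)
        return -1;
    t->buf[t->used].id = id;
    t->buf[t->used].time = now;
    t->used++;
    return 0;
}

/* Count only: cheaper than a full record, keeps just a tally per event ID. */
static void toy_count_event(struct toy_trace *t, int id)
{
    assert(id >= 0 && id < TOY_MAX_IDS);
    t->counts[id]++;
}

/* Interval timing: accumulate time between matching start and end calls. */
static void toy_start_time(struct toy_trace *t, int id, double now)
{
    assert(id >= 0 && id < TOY_MAX_IDS);
    t->start[id] = now;
}

static void toy_end_time(struct toy_trace *t, int id, double now)
{
    assert(id >= 0 && id < TOY_MAX_IDS);
    t->elapsed[id] += now - t->start[id];
}
```

In each call the event ID is the only thing that tells the recorder which kind of event it is looking at, which is the role the slide assigns to it.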

Page 8: Pablo Project

Pablo TraceLibrary (cont.)

• Extensions provide wrapper functions for management of event IDs for various event types

• Procedure and loop tracing done manually by inserting calls to TraceLibrary routines into application source code

• Default mode is to dump trace buffer contents to a trace file, but it’s possible to have trace data output sent to a socket for real-time analysis

Page 9: Pablo Project

TraceLibrary Scalability

• Documentation states that TraceLibrary monitors and dynamically alters volume, frequency, and types of event data by

– associating a user-specified maximum trace level with each event and

– substituting less invasive data recording (e.g., event counts rather than complete event traces) if maximum user-specified rate is exceeded

• Unclear if these measures are taken automatically by the high-level trace library or if they must be explicitly called by the user at the low level
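
One way to picture the rate-based fallback described above is a per-event throttle that downgrades from full tracing to counting once a user-specified rate is exceeded within a time window. This is a sketch under assumptions: the windowing scheme, the structure, and all names are invented for illustration and do not describe how TraceLibrary actually implements it.

```c
#include <assert.h>

enum toy_level { TOY_FULL_TRACE, TOY_COUNT_ONLY };

struct toy_throttle {
    double max_rate;       /* user-specified ceiling, events per second */
    double window_start;   /* start time of the current measuring window */
    double window_len;     /* window length in seconds */
    long   in_window;      /* events seen in the current window */
    enum toy_level level;  /* current recording mode for this event ID */
};

/* Note one occurrence at time `now` and return the mode to record it in. */
static enum toy_level toy_throttle_note(struct toy_throttle *th, double now)
{
    assert(th->window_len > 0.0);
    if (now - th->window_start >= th->window_len) {
        /* New window: reset the tally and go back to full tracing. */
        th->window_start = now;
        th->in_window = 0;
        th->level = TOY_FULL_TRACE;
    }
    th->in_window++;
    if ((double)th->in_window / th->window_len > th->max_rate)
        th->level = TOY_COUNT_ONLY;   /* too frequent: record counts only */
    return th->level;
}
```

With max_rate set to 2 events per second and a one-second window, the third event inside a window is the first to be recorded as a count only.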

Page 10: Pablo Project

I/O Extension to TraceLibrary

• I/O instrumentation requires changes to application source code

• I/O trace initialization and termination routines must be called before and after calling any other I/O trace routines

• I/O trace bracketing routines provided for I/O requests that are not implemented as library calls (e.g., getc macro in C and Fortran I/O statements that are part of the language)

Page 11: Pablo Project

I/O Extension (cont.)

• I/O instrumentation options for C programs

– Manually replace standard I/O calls with tracing counterparts

– Define IOTRACE so that pre-processor replaces standard I/O calls with tracing counterparts

• I/O instrumentation of Fortran programs

– Manually bracket each I/O call with I/O trace library bracketing routines
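
The preprocessor option can be pictured as a header that, when a macro is defined, rewrites standard I/O calls into tracing wrappers. The sketch below shows the general technique only. TOY_IOTRACE, toy_trace_fopen, and the call counter are hypothetical stand-ins, and the contents of the real IOTrace.h are not reproduced. Redefining a standard library name as a macro works on common toolchains, although the C standard formally reserves such names.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_IOTRACE 1   /* normally supplied by the build, e.g. -DTOY_IOTRACE */

static long toy_fopen_calls = 0;   /* stands in for writing a trace record */

/* Hypothetical tracing counterpart: note the call, then do the real open.
 * It is defined before the macro below, so this fopen is the library one. */
static FILE *toy_trace_fopen(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
    toy_fopen_calls++;
    return fopen(path, mode);
}

#ifdef TOY_IOTRACE
#undef fopen   /* in case the C library already provides fopen as a macro */
/* From here on, every fopen in the source text expands to the wrapper. */
#define fopen(path, mode) toy_trace_fopen((path), (mode))
#endif

/* Unmodified application code, written against plain fopen. */
static int file_is_readable(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}
```

The application function needs no edits: whether it is traced depends only on whether the macro is defined when it is compiled.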

Page 12: Pablo Project

I/O Extension (cont.)

• Programs containing calls to I/O extension interface routines must be linked with

– Pablo Trace Extension Library libPabloTraceExt.a

– Pablo Base Trace Library libPabloTrace.a

Page 13: Pablo Project

Sample C program - No Instrumentation

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        FILE *fp;
        char buffer[1024];
        size_t cnt;

        fp = fopen("/etc/motd", "r");
        if (fp != NULL) {
            cnt = fread(buffer, sizeof(char), 1024, fp);
            fclose(fp);
        }
        return 0;
    }

Page 14: Pablo Project

Sample C program - Manual Instrumentation

    #include "IOTrace.h"
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        FILE *fp;
        char buffer[1024];
        size_t cnt;

        initIOTrace();   /* Initialize I/O Extension */

        fp = traceFOPEN("/etc/motd", "r");
        if (fp != NULL) {
            cnt = traceFREAD(buffer, sizeof(char), 1024, fp);
            traceFCLOSE(fp);
        }

        /* Trace termination routines */
        endIOTrace();
        endTracing();
        return 0;
    }

Page 15: Pablo Project

Sample C program - Preprocessor Replacement

    #define IOTRACE
    #include "IOTrace.h"
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        FILE *fp;
        char buffer[1024];
        size_t cnt;

        initIOTrace();   /* Initialize I/O Extension */

        fp = fopen("/etc/motd", "r");
        if (fp != NULL) {
            cnt = fread(buffer, sizeof(char), 1024, fp);
            fclose(fp);
        }

        /* Trace termination routines */
        endIOTrace();
        endTracing();
        return 0;
    }

Page 16: Pablo Project

Sample Fortran program - No Instrumentation

      integer i

      open(unit=2,file='/tmp/f',form='formatted',status='new')

      i = 0

      write(2, 100) i

      close(2)

 100  format('Node ', i3)

      end

Page 17: Pablo Project

Sample Fortran program - Manual Instrumentation

#include "fIOTrace.h"

      integer i

      call initIOTrace()

      call traceOpenBegin('/tmp/f', i)
      open(unit=2,file='/tmp/f',form='formatted',status='new')
      call traceOpenEnd(2)

      i = 0
      call traceWriteBegin(2,1,0)
      write(2, 100) i
      call traceWriteEnd(9)

      call traceCloseBegin(2)
      close(2)
      call traceCloseEnd()

 100  format('Node ',i3)

      call endIOTrace()
      call endTracing()

      end

Page 18: Pablo Project

MPI TraceLibrary Extension

• MPI profiling library that can be linked in without making source code changes

• Each MPI process outputs a trace file labeled with the process number

• Insert call to SetTraceFileName() immediately after MPI_Init() to control location of trace file

Page 19: Pablo Project

MPI Extension (cont.)

• Disable tracing by calling MPI_Pcontrol(0)

• Re-enable tracing by calling MPI_Pcontrol(1)

• Link with Pablo Trace Extension Library (libPabloTraceExt.a) and Pablo Base Trace Library (libPabloTrace.a)

• Merge per-process trace files using the SDDF utility MergePabloTraces
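
An MPI profiling library can be linked in without source changes because the MPI standard makes every routine reachable under a second, shifted name (for example PMPI_Send for MPI_Send). The profiling library supplies its own version under the public name, records what it needs, and forwards to the shifted name. The sketch below imitates that pattern with made-up TOY_ and PTOY_ routines so that it compiles without an MPI installation. It is an illustration of the mechanism, not Pablo's code.

```c
#include <assert.h>

static int  toy_trace_enabled = 1;   /* toggled by the control call below */
static long toy_traced_sends  = 0;   /* stands in for emitted trace records */

/* The "real" routine, reachable under a shifted name (like PMPI_Send). */
static int PTOY_Send(const void *buf, int count)
{
    (void)buf;
    return count;   /* pretend that count items were sent */
}

/* The profiling library's version under the public name (like MPI_Send):
 * record the call if tracing is on, then forward to the real routine. */
static int TOY_Send(const void *buf, int count)
{
    assert(count >= 0);
    if (toy_trace_enabled)
        toy_traced_sends++;
    return PTOY_Send(buf, count);
}

/* Level 0 disables tracing; any other level enables it. */
static void TOY_Control(int level)
{
    toy_trace_enabled = (level != 0);
}
```

Application code calls only TOY_Send. Whether the call is traced depends on which library is linked and on the current control level.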

Page 20: Pablo Project

Pablo Trace File Analysis

• Command-line FileStats program scans SDDF file and reports record types, min and max values for each field, and count of each record type.

• SDDFStatistics GUI for generating and browsing statistics from an SDDF file

• Pablo I/O analysis command-line routines

• Pablo Analysis GUI
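
The kind of summary that FileStats reports (count, minimum, and maximum per field) amounts to a single pass over the values of a field. A minimal C sketch with invented names follows. It works on an in-memory array and does not read SDDF files.

```c
#include <assert.h>
#include <stddef.h>

struct toy_field_stats {
    size_t count;   /* number of values seen */
    double min;     /* smallest value (meaningful only if count > 0) */
    double max;     /* largest value (meaningful only if count > 0) */
};

/* Scan n values of one field and return their count, minimum, and maximum. */
static struct toy_field_stats toy_scan_field(const double *values, size_t n)
{
    struct toy_field_stats s = { 0, 0.0, 0.0 };
    size_t i;
    assert(values != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (s.count == 0 || values[i] < s.min)
            s.min = values[i];
        if (s.count == 0 || values[i] > s.max)
            s.max = values[i];
        s.count++;
    }
    return s;
}
```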

Page 21: Pablo Project

SDDFStatistics

• Statistics for entire file are displayed along top of display

• Record types are displayed in panel at lower left

• Clicking on a record type brings up statistics for each field of that record type

• Clicking on a field displays a histogram summarizing values for that field

• Clicking on an array field type brings up statistics for each dimension of that field

Page 22: Pablo Project

SDDFStatistics display

Page 23: Pablo Project

SDDFStatistics Usage

• SDDFStatistics [-toolkitoption …] [-loadSummary filename] [-openSDDF filename]

• Or use runSDDFStatistics script which invokes the SDDFStatistics program after setting environment variables so that required resources can be located

Page 24: Pablo Project

I/O Analysis Programs

• IOstats generates a report of application I/O activity summarized by I/O request type.

• IOstatsTable produces table summarizing information about I/O operations.

• IOtotalsByPE produces a report showing the total count, duration, and bytes involved for various operations by processor.

Page 25: Pablo Project

I/O Analysis Programs (cont.)

• LifetimeIOstats produces a report summarizing I/O activity by processor and file, prints a histogram of the file lifetimes, and prints total time spent in I/O calls for each procedure.

• FileRegionIOstats generates a report of application I/O activity summarized by file region. Each file is divided spatially into regions whose size is set by calling enableFileRegionSummaries().
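
Summarizing by file region comes down to mapping each request's byte range onto fixed-size regions. The sketch below assumes that a request spanning a region boundary is split between the regions it touches. Whether Pablo does this or attributes the whole request to one region is not stated in the slides, and the function name is invented.

```c
#include <assert.h>

/* Add the bytes of one request, [offset, offset + nbytes), to per-region
 * totals. Regions are region_size bytes each; bytes that fall beyond the
 * last tracked region are ignored. */
static void toy_add_to_regions(long *bytes_per_region, long n_regions,
                               long region_size, long offset, long nbytes)
{
    assert(region_size > 0 && offset >= 0 && nbytes >= 0);
    while (nbytes > 0) {
        long region = offset / region_size;
        long room   = region_size - offset % region_size; /* left in region */
        long chunk  = nbytes < room ? nbytes : room;
        if (region >= n_regions)
            break;
        bytes_per_region[region] += chunk;
        offset += chunk;
        nbytes -= chunk;
    }
}
```

With 100-byte regions, a 120-byte request at offset 50 contributes 50 bytes to region 0 and 70 bytes to region 1.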

Page 26: Pablo Project

I/O Analysis Programs (cont.)

• TimeWindowIOstats produces a report from Time Window Summary trace records. The execution time of the program is divided into time windows whose size is set by calling enableTimeWindowSummaries().

• SyncIOfileIDs processes a trace file containing I/O trace events where many different file IDs may be associated with a given file, and writes a new file where every I/O trace event associated with a particular file (as determined by the file name) has the same file ID.
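
The file ID synchronization step can be thought of as a lookup from file name to one canonical ID, assigned the first time a name is seen. The table below is a hypothetical sketch of that idea with fixed capacities chosen for brevity. It does not parse trace files and is not the SyncIOfileIDs implementation.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_FILES 32
#define TOY_NAME_LEN  128

struct toy_file_table {
    char names[TOY_MAX_FILES][TOY_NAME_LEN];
    int  used;   /* number of distinct names stored so far */
};

/* Return the canonical ID for name, assigning the next free ID the first
 * time the name is seen. Returns -1 if the table is full or the name does
 * not fit. */
static int toy_canonical_id(struct toy_file_table *t, const char *name)
{
    int i;
    assert(t != NULL && name != NULL);
    for (i = 0; i < t->used; i++)
        if (strcmp(t->names[i], name) == 0)
            return i;
    if (t->used == TOY_MAX_FILES || strlen(name) >= TOY_NAME_LEN)
        return -1;
    strcpy(t->names[t->used], name);
    return t->used++;
}
```

Rewriting a trace then means replacing each event's file ID with the value returned for the file name associated with that event.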

Page 27: Pablo Project

I/O Characterization Research using Pablo

• Detailed characterization of I/O behavior of scalable applications and existing parallel file systems

• Goals

– Enable application developers to achieve higher fraction of peak I/O performance on existing parallel file systems

– Help system software developers design better parallel file systems

Page 28: Pablo Project

I/O Research (cont.)

• Target Platforms

– Intel Paragon

– IBM SP

– Convex Exemplar

– SGI Origin 2000

Page 29: Pablo Project

I/O Research (cont.)

• The Scalable I/O (SIO) Initiative has targeted a number of application codes for study, including:

– PRISM: incompressible Navier-Stokes calculations

– SAR: Synthetic Aperture Radar application

– HF: Hartree-Fock calculations

– ESCAT: SMC electron scattering

– RENDER: ray-identification rendering

Page 30: Pablo Project

Pablo and Virtual Reality

• Problem

– Very large volume of captured performance data for parallel systems

– Human-computer interface is bandwidth-limited

• Proposed solution

– Immerse users in virtual world so that users can explore, viscerally experience, and modify the dynamic behavior of application and system software on a massively parallel system

Page 31: Pablo Project

Avatar

• Pablo virtual reality system

• Operates with workstation monitor, head-mounted display, and the CAVE

• Presentation metaphors

– Scattercube Matrix

• generalization of 2-d scatterplot matrix

• shows 3-d projections of sparsely populated, N-dimensional space

– Time Tunnel

• event level display of processor and inter-processor behavior

Page 32: Pablo Project

Pablo Analysis GUI

• Toolkit of data transformation modules capable of processing SDDF records

• Supports graphical connection of performance data transformation modules in style of AVS

• By graphically connecting modules and interactively selecting trace data records, user specifies desired data transformation and presentations

• Expert users can develop and add new data analysis modules

Page 33: Pablo Project

Analysis GUI (cont.)

• Module types

– Data analysis

• Mathematical transforms (counts, sums, ratios, max, min, average, trig functions, etc.)

• Synthesis of vectors and arrays from scalar input data

– Data presentation - bar graphs, bubble charts, strip charts, contour plots, interval plots, kiviat diagrams, 2-d and 3-d scatter plots, matrix displays, pie charts, polar plots

Page 34: Pablo Project

Pablo Analysis GUI Main Window

Page 35: Pablo Project

Module Creation Window

Page 36: Pablo Project

Module Connection

Page 37: Pablo Project

Configuring a Module (BarGraph)

Page 38: Pablo Project

Graph Execution

Page 39: Pablo Project

Graph with Synthesize Vector Module

Page 40: Pablo Project
