Debugging Cluster Programs

Post on 29-Jan-2016

Debugging Cluster Programs

using symbolic debuggers

Debugging Code

Careful review of your code
Add debugging code to your code
  print statements at strategic locations in the code
  remove them later
Use a symbolic debugger

Careful review of your code

Rereading your code is often helpful
Most parallel code errors are serial errors
Compare your code to the specs
Take a break, then review your code with a fresh brain
Have someone else help you review your code

Common sources of errors

Beyond what the compiler catches
Usually run-time errors
Incorrect use of pointers
  Pointer points outside valid memory
  Reference should have used a pointer
Referenced the wrong variable
Index initialized wrong, wrong exit condition

Common parallel errors

Deadlock errors
  Receive before send
  Receive, but no send
Incorrect arguments in MPI calls
  Mismatch on tags
  Mismatch of source/destination
  Misunderstanding of the use of an argument

Add Debugging Code

Add strategically placed code in your code to display critical information
Watch values of variables as the program progresses
Can create data-dump functions and call them when you need them
Have a way to remove them in production code

Add Debugging Code

Can be difficult to get the right debugging code in the right place
Does not scale well in a parallel environment
Can produce unmanageable or unintelligible output

Symbolic Debuggers

Allow you to
  inspect your code
  monitor its behavior
  modify the data values
on the fly, as your code executes

gdb – GNU debugger

Frequently used GDB commands:

break [file:]function - Set a breakpoint at function (in file).

run [arglist] - Start your program (with arglist, if specified).

bt - Backtrace: display the program stack.

print expr - Display the value of an expression.

c - Continue running your program (after stopping, e.g. at a breakpoint).

next - Execute next program line (after stopping); step over any function calls in the line.

step - Execute next program line (after stopping); step into any function calls in the line.

help [name] - Show information about GDB command name, or general information about using GDB.

quit - Exit from GDB.

Running in X-windows

Linux (Unix) to Linux
  ssh to host, log in and start the X application
Other platforms (Windows, Mac)
  Use an X-windows server application
VNC on most platforms
  VNC operates as a remote control application
  On Linux, VNC operates as an X-windows server
  Viewers for Windows, Macintosh, Solaris

Running in X-windows

Using VNC
  ssh to host and log in
  start vncserver
    pay attention to the display id (:n)
  from your desktop, run VNCViewer
    select the host with the correct display id
  after the session, kill vncserver
    vncserver -kill :n (n is the display id number)

Using VNC

[Screenshot: X desktop with VNC]

ddd – a graphic front end to gdb…

pgdbg

Debugger from the Portland Group (PGI)
Can use with PG compilers
Can use with GNU compilers

pgdbg – common commands

Back to text mode for a bit

lis[t] [count | low:high | routine | line,count] - display lines from the source code file or routine

att[ach] <pid> [<exe> | <exe> <host>] - attach to a running process <pid>, or start a local executable and attach to it, or start an executable <exe> on <host>

c[ont] - continue executing from the current location

det[ach] - detach from the currently attached process

halt - halt the executing process or thread

n[ext] [count] - continue executing and stop after count lines of source code

nexti [count] - continue executing and stop after count instructions

q[uit] - terminate pgdbg and exit

ru[n] [arg0 arg1 … argn] - run the program from the beginning with arguments arg0, arg1, …

s[tep] [count] - execute the next count lines of source code and stop; steps into called routines

s[tep] up - step out of the current routine

stepi [count] - execute the next count instructions and stop; steps into called routines

stepi up - step out of the current routine and stop

Event commands

break [line | function] - set a breakpoint at the specified line or function; execution stops when it reaches that point. If no line or function is specified, list the existing breakpoints.

clear [all | line | func] - clear all breakpoints, or the breakpoint at line line or at function func

stop var - break when the value of var changes at a location

watch expr - stop and display the value of expr when it changes

track expr - like watch, except that it does not stop execution

trace var - display a trace of source line execution when the value of var changes

p[rint] var - display the value of a variable

edit filename - invoke an editor to edit the file filename. If no filename is given, edit the current file.

decl[aration] name - display the type declaration for the object name

as[sign] var = expr - assign the value expr to the variable var

proc [number] - set the current process to process number

Resources

gdb
  man gdb
  info gdb
  Using GDB: A Guide to the GNU Source-Level Debugger, Richard M. Stallman and Roland H. Pesch, July 1991.
ddd
  man ddd
VNC
  http://www.uk.research.att.com/vnc/
  http://www.realvnc.com

PGI Debugger User’s Guide http://www.pgroup.com/ppro_docs/pgdbg_ug/PGDBG4.htm
PGI Users Guide, PGI 4.1 Release Notes, FAQ, Tutorials http://www.pgroup.com/docs.htm
MPI-CH http://www.netlib.org/
OpenMP http://www.openmp.org/
HPDF (High Performance Debugging Forum) Standard http://www.ptools.org/hpdf/draft/intro.html