Performance Evaluation of Load Sharing Policies on a Beowulf Cluster
James Nichols
Marc Lemaire
Advisor: Mark Claypool
Outline
- Introduction
- Methodology
- Results
- Conclusions
Introduction
What is a Beowulf cluster?
- A cluster of computers networked together via Ethernet
Load distribution
- Shares load, decreasing response times and increasing overall throughput
- Typically requires expertise in a particular load distribution mechanism
Load
- Historically, CPU has been the load metric. What about disk and memory load? Or system events like interrupts and context switches?
PANTS: Application Node Transparency System
- Removes the need for expertise required by other load distribution mechanisms
PANTS
PANTS: Application Node Transparency System
- Developed in a previous MQP (CS-DXF-9918); enhanced the following year to use DIPC (CS-DXF-0021)
- Intercepts execve() system calls
- Uses the /proc file system to calculate CPU load and classify a node as "busy" or "free"
- Any workload that does not generate CPU load will not be distributed
- Near-linear speedup for computationally intensive applications
- New load metrics and policies!
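PANTS itself is a kernel-level C system; as an illustrative sketch (not PANTS source), the default busy/free classification described above can be modeled in Python, assuming the Linux 2.4-era /proc/stat "cpu" field layout:

```python
# Illustrative sketch (not PANTS code): classify a node "busy" or "free"
# from CPU utilization, computed between two /proc/stat-style "cpu" lines.
# Field layout (Linux 2.4): cpu user nice system idle, in jiffies (1/100 s).

CPU_THRESHOLD = 95.0  # percent; PANTS' default busy threshold

def cpu_percent(sample1, sample2):
    """CPU utilization between two 'cpu' lines sampled some interval apart."""
    u1, n1, s1, i1 = (int(x) for x in sample1.split()[1:5])
    u2, n2, s2, i2 = (int(x) for x in sample2.split()[1:5])
    busy = (u2 + n2 + s2) - (u1 + n1 + s1)   # jiffies spent working
    total = busy + (i2 - i1)                 # working + idle jiffies
    return 100.0 * busy / total if total else 0.0

def classify(sample1, sample2):
    return "busy" if cpu_percent(sample1, sample2) >= CPU_THRESHOLD else "free"

# 98 of 100 elapsed jiffies spent in user+system -> "busy"
print(classify("cpu 100 0 50 850", "cpu 190 0 58 852"))
```

Because only CPU jiffies are consulted, a node saturating its disk while nearly idle on CPU would be classified "free" here, which is exactly the gap the new metrics address.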
Outline
- Introduction
- Methodology
- Results
- Conclusions
Methodology
- Identified load parameters
- Implemented ways to measure those parameters
- Improved the PANTS implementation
- Built micro benchmarks that stressed each load metric
- Built a macro benchmark that stressed a more realistic mix of metrics
- Selected a real-world benchmark
Methodology: Load Metrics
Acquired via /proc/stat:
- CPU – Total jiffies (1/100ths of a second) that the processor spent on user, nice, and system processes, expressed as a percentage of the total.
- I/O – Blocks read from/written to disk per second.
- Memory – Page operations per second. Example: a virtual memory page fault requiring a page to be loaded into memory from disk.
- Interrupts – System interrupts per second. Example: an incoming Ethernet packet.
- Context switches – How many times the processor switched between processes per second.
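A hedged sketch of how these counters could be pulled from a Linux 2.4-era /proc/stat snapshot and turned into per-second rates. The field names follow that kernel's format; `parse_stat` and `rates` are illustrative helpers, not PANTS code, and the disk_io line is omitted for brevity:

```python
# Sketch: read the slide's event counters from a Linux 2.4-era /proc/stat
# snapshot (illustrative parser, not PANTS source). A rate is the difference
# between two snapshots divided by the sampling interval.

def parse_stat(text):
    """Return raw cumulative counters from a /proc/stat snapshot."""
    vals = {}
    for line in text.splitlines():
        f = line.split()
        if not f:
            continue
        if f[0] == "page":            # pages paged in + paged out
            vals["page_ops"] = int(f[1]) + int(f[2])
        elif f[0] == "intr":          # first field is the interrupt total
            vals["interrupts"] = int(f[1])
        elif f[0] == "ctxt":          # total context switches since boot
            vals["ctx_switches"] = int(f[1])
    return vals

def rates(before, after, seconds):
    """Per-second rates between two snapshots taken `seconds` apart."""
    a, b = parse_stat(before), parse_stat(after)
    return {k: (b[k] - a[k]) / seconds for k in a}

snap1 = "page 5000 1800\nintr 1462898 100 2\nctxt 32125929\n"
snap2 = "page 5400 1900\nintr 1473898 120 3\nctxt 32131929\n"
print(rates(snap1, snap2, 1.0))
# page_ops: (5400+1900)-(5000+1800) = 500/s; interrupts: 11000/s; ctxt: 6000/s
```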
Methodology: Micro & Macro Benchmarks
- Implemented micro and macro benchmarks
- Helped refine our understanding of how the system was performing and tested our load metrics
- (Not enough time to present)
Real-world benchmark: Linux kernel compile
- Distributed compilation of the Linux kernel, executed by the standard Linux program make
- Loads I/O and memory resources
- Details: kernel version 2.4.18; 432 files; mean source file size 19 KB; needed to expand relative path names to full paths
Thresholds
- PANTS default: CPU 95%
- New policy: CPU 95%, I/O 1000 blocks/sec, memory 4000 page faults/sec, IRQ 115,000 interrupts/sec, context switches 6000 switches/sec
- Compare the PANTS default policy to our new load metrics and policy
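The threshold values above come from the slide; as a minimal sketch, one plausible way to combine them is to mark a node busy when any single metric exceeds its threshold (that any-metric rule, and the names below, are assumptions for illustration, not PANTS source):

```python
# Hedged sketch of the new policy: a node is "busy" if ANY metric exceeds
# its threshold. Threshold values are from the slide; the any-metric
# combination rule is an assumption about how the metrics are combined.

THRESHOLDS = {
    "cpu_percent":  95.0,      # %
    "io_blocks":    1000.0,    # blocks/sec
    "page_faults":  4000.0,    # page faults/sec
    "interrupts":   115000.0,  # interrupts/sec
    "ctx_switches": 6000.0,    # switches/sec
}

def is_busy(metrics):
    """metrics: dict with the same keys, holding measured per-second rates."""
    return any(metrics[k] >= THRESHOLDS[k] for k in THRESHOLDS)

# A node with low CPU but heavy disk I/O now counts as busy, whereas the
# CPU-only default policy would have called it free.
print(is_busy({"cpu_percent": 20.0, "io_blocks": 2500.0,
               "page_faults": 100.0, "interrupts": 4000.0,
               "ctx_switches": 900.0}))
```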
Outline
- Introduction
- Methodology
- Results
- Conclusions
Results: CPU
[Bar chart: CPU % (y-axis 0–70) for PANTS default vs. PANTS new policies, showing average, standard deviation, max, and min]
Results: Memory
[Bar chart: page faults/sec (y-axis 0–20,000) for PANTS default vs. PANTS new policies, showing average, standard deviation, max, and min]
Results: Context Switches
[Bar chart: context switches/sec (y-axis 0–40,000) for PANTS default vs. PANTS new policies, showing average, standard deviation, max, and min]
Results: Summary
Summary results – distributed compilation
[Bar chart: compilation time in seconds (y-axis 0–1600) by compilation method: Local, NFS, PANTS no migration, PANTS default, PANTS new load metrics]
Conclusions
- Including I/O, memory, interrupt, and context-switch metrics achieves better throughput and a more balanced load distribution than CPU alone.
- Future work: use preemptive migration? Include a network usage load metric.
For more information visit: http://pants.wpi.edu
Questions?