CMU-CS-86-127
Fault-Free
Performance Validation
of Fault-Tolerant
Multiprocessors
Ann Marie Grizzaffi
November 1985
Dept. of Electrical and Computer Engineering
Carnegie-Mellon University
Pittsburgh, Pennsylvania 15213
Submitted to Carnegie-Mellon University in partial fulfillment of the
requirements for the Degree of
Master of Science in Electrical and Computer Engineering
Copyright © 1986 Ann Marie Grizzaffi
This research was sponsored by the National Aeronautics and Space Administration, Langley Research Center, under contract NAG-1-190. This research was also supported by AT&T Bell Laboratories.
The views and conclusions contained in this document are those of the author and should not be
interpreted as representing the official policies, either expressed or implied, of NASA, the United States
Government, AT&T Bell Laboratories, or Carnegie-Mellon University.
Table of Contents

Abstract
1. Introduction
2. Background
   2.1. Developing the Methodology
   2.2. Validation Methodology Defined
3. The SIFT Environment
   3.1. Hardware Configuration
   3.2. SIFT Software
   3.3. Experimental Environment
4. FTMP and Its Experimental Environment
5. The Experiments
   5.1. Clock Read Characteristics
   5.2. Instruction Times
   5.3. Instruction Combinations
   5.4. Task Stretching
6. Results and FTMP Comparisons
   6.1. Read Time Clock Delay
   6.2. Instruction Measurements
   6.3. Instruction Combination Measurements
   6.4. Task Stretching Results
7. Future Work
   7.1. Baseline Experiments
   7.2. Synthetic Workload
8. Conclusions
I. Appendix
   I.1. Clock Read Dump
   I.2. Statistical Data on Instruction Times
   I.3. Statistical Data on Instruction Combinations
References
List of Figures

Figure 3-1: Block Diagram of SIFT Distributed System
Figure 3-2: Block Diagram of a SIFT Processor
Figure 3-3: Block Diagram of the SIFT Test Environment
Figure 4-1: FTMP Block Diagram, [Czeck 85]
Figure 4-2: FTMP's Test Environment, [Czeck 85]
Figure 5-1: Basic Task Algorithm
Figure 5-2: Program Used For Task Stretching - Voting Case
Figure 6-1: Frequency vs. Microseconds per 100 A := 1 Iterations
Figure 6-2: Procedure Calls vs. Parameters
Figure 6-3: Graph of Instruction Times: SIFT vs. FTMP
Figure 6-4: A := 1 vs. Consecutive Executions
List of Tables

Table 5-1: Instructions Measured on SIFT
Table 5-2: Instruction Combinations Tested on SIFT
Table 6-1: SIFT Clock Read Results
Table 6-2: Clock Read Results for SIFT and FTMP
Table 6-3: Summary of SIFT Instruction Execution Times
Table 6-4: Instruction Times: SIFT vs. FTMP
Table 6-5: SIFT vs. FTMP for Integer Assign A := 1
Table 6-6: SIFT: Comparison of Instruction Combinations Not Done on FTMP
Table 6-7: SIFT: Comparison Between Single Instructions and Combinations
Table 6-8: FTMP: Comparison Between Single Instructions and Combinations
Table 6-9: SIFT vs. FTMP in Addition Combination
Table 6-10: SIFT: Task Stretching Results
Table I-1: Raw Data: SIFT Clock Read Experiment
Table I-2: Instruction Execution Time: Integer and Boolean Data Types
Table I-3: Statistical Information: Integer and Boolean Data Types
Table I-4: Instruction Execution Time: Miscellaneous Instructions
Table I-5: Statistical Information: Miscellaneous Instructions
Table I-6: Instruction Execution Time: Instruction Combinations
Table I-7: Statistical Information: Instruction Combinations
Abstract
By the 1990's, aircraft will employ complex computer systems to control flight-critical functions. Since
computer failure would be life threatening, these systems should be experimentally validated before being
given aircraft control.
Over the last decade, Carnegie-Mellon University has developed a validation methodology for testing the
fault-free performance of fault-tolerant computer systems. Although this methodology was developed to
validate the Fault-Tolerant Multiprocessor (FTMP) at NASA-Langley's AIRLAB facility, it is claimed to
be general enough to validate any ultrareliable computer system.
The goal of this research was to demonstrate the robustness of the validation methodology by its
application to NASA's Software Implemented Fault-Tolerance (SIFT) Distributed System. Furthermore,
the performance of two architecturally different multiprocessors could be compared by conducting
identical baseline experiments.
From an analysis of the results, SIFT appears to have better overall performance for instruction
execution than FTMP. Thus far, the validation methodology has proven general enough to apply to
SIFT, and has produced results that are directly comparable to previous FTMP experiments.
1. Introduction
Today's aircraft use simple on-board computers to perform isolated functions that are not flight critical.
If the computer fails, the flight crew can assume control of the function previously done by the computer,
without loss of life or cargo. But expanding technology is creating advanced aircraft that are too complex
for humans and simple computers to control. By the next decade, aircraft will require fault-tolerant
computer systems to perform flight-critical functions. Unfortunately, integrating avionics and control
functions means that computer failure can become a life threatening situation. Therefore, it is critical
that any computer system put in control of an aircraft be fault-free.
The National Aeronautics and Space Administration (NASA) has on-going research in the integration of
avionics and control functions at Langley Research Center's Avionics Integrated Research Laboratory
(AIRLAB). One study determined that a computer system could be considered sufficiently reliable for
aircraft control if it has a probability of less than 10^-10 failures per hour, or one failure per million years
of operation. Since it is not feasible to wait a million years to insure that a computer system is fault-free,
a validation methodology had to be created that would test for functional correctness. In light of this
need, NASA held several workshops to determine the best approach.
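The equivalence between the two figures above is a direct unit conversion, sketched here in modern Python for illustration (not part of the original thesis; 8,766 hours per year approximates a year including leap days):

```python
FAILURES_PER_HOUR = 1e-10
HOURS_PER_YEAR = 8766  # ~365.25 days * 24 hours

# Mean time between failures implied by the 10^-10 per hour bound:
mtbf_years = 1 / (FAILURES_PER_HOUR * HOURS_PER_YEAR)
# roughly 1.14 million years
```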
Based upon the results of these workshops, Carnegie-Mellon University (CMU) developed a set of
experiments to validate the prototype Fault-Tolerant Multiprocessor (FTMP) at AIRLAB [Clune
84, Feather 85]. Through this research, CMU was further able to develop a validation methodology
claimed to be general enough to test the fault-free performance of any fault-tolerant system.
The goal of this research was to demonstrate the robustness of the validation methodology by
application to NASA's Software Implemented Fault-Tolerant (SIFT) distributed system. This research
also demonstrated that conduction of identical baseline experiments allows the performance of two
architecturally different systems to be directly compared.
2. Background
In 1979, NASA held several workshops to develop a complete set of procedures for validating fault-
tolerant computer systems. One study in particular [NASA 79] produced an approach to validation that
would test systems in an orderly manner. This list was based on a building block method of analysis.
The approach proposed that experimentation should begin with the measurement of primitive hardware
and operating system activities. After primitive activities are characterized, more complex experiments
should be done to define interactions between primitive activities. This orderly progression would not
only build up confidence in a system in an incremental manner but insure uniform coverage and make the
causes of unexpected phenomena easier to locate. The steps in the building block approach included:
1. Initial Checkout and Diagnostics.
2. Programmer's Manual Validation.
3. Executive Routine Validation.
4. Multiprocessor Interconnect Validation.
5. Multiprocessor Executive Routine Validation.
6. Application Program Validation and Performance Baseline.
7. Simulation of Inaccessible Physical Failures.
8. Single Processor Fault Insertion.
9. Multiprocessor Fault Insertion.
10. Single Processor Executive Failure Response Characterization.
11. Multiprocessor System Executive Fault Handling Capabilities.
12. Application Program Validation on Multiprocessor.
13. Multiple Application Program Verification on Multiprocessor.
The first six tasks validate the fault-free functionality of the system while the remaining seven validate
fault handling capabilities.
2.1. Developing the Methodology
Over the last decade, CMU has devoted over 100 man-years to the design, construction, and validation
of multiprocessor systems. This research led to the development of a generalized methodology for
validating the fault-free performance of multiprocessor systems.
Some of the guidelines used in creating this methodology included:
• Refining the validation methodology as experiments uncover new information or the methodology is applied to new multiprocessor systems.

• Designing experiments to validate behavior that is documented as well as uncovering behavior that is not documented.

• Performing experiments in a systematic manner. Since the search is for the unexpected, there should be no shortcut to thorough testing.

• Designing experiments so that they are repeatable.

• Using a building block approach that changes one variable at a time, so causes of unexpected behavior are easy to isolate.

• Allowing experiments to take advantage of the abstract levels used in the design of the system.

• Tempering experiments by available environments. More sophisticated experiments may have to be postponed until the experimental environment is provided with more tools.
Each step of the methodology, like NASA's validation procedure, follows a building block approach.
The technique begins by conducting baseline experiments, experiments that measure a single phenomenon
while all other interactions are held constant. Baseline experiments are designed to validate the basic
assumptions used in the mathematical models from which the system was designed. They also test the
validity of the assumptions made by the system's programmers when designing the operating system and
other application programs. After the baseline and individual phenomena have been characterized,
advanced experiments to explore the interactions between basic phenomena are begun.
2.2. Validation Methodology Defined
The methodology begins with measurements of the execution times of system primitives, the overhead
time incurred when programs are executed, and the variation of function execution times on the system.
Specifically, the steps in the hierarchical procedure are as follows:
1. First, measure the time it takes to read the clock. Since the clock will be used for later phases, the experimenter must be certain that the clock is predictable; a clock read must be constant or vary predictably.

2. Next, measure single system parameters, or baseline parameters. This includes measuring instruction execution times, operating system function execution times, and task manipulation phenomena.

3. Last, measure the iteration of programs on the system. The most efficient way to accomplish this is to create synthetic workloads of different sizes and structures. The synthetic workload environment is used to test features such as raw performance, bottlenecks, and overhead in the operating system.
When applying the techniques to test systems of different architectures, it is not always possible to
perform identical experiments. Sometimes, exact duplications may prove to be architecturally impossible
or irrelevant for the type of system being tested. For these instances, careful substitutions must be made
to insure that the new experiments test comparable characteristics. When more sophisticated experiments
have to be postponed until the advent of more sophisticated tools, the experimenter is encouraged to
move on to the next step in the baseline experiments.
3. The SIFT Environment
The Software Implemented Fault-Tolerance (SIFT) computer is one of two prototype systems developed
for NASA for experimentation in fault-tolerant systems research [SRI 82]; the other is a hardware
redundant Fault-Tolerant Multiprocessor, FTMP. SIFT was designed and built by Bendix Flight Systems
Division, under subcontract to SRI International, and delivered to the Langley Avionics Integration
Research Laboratory (AIRLAB) in April 1982. This section gives a brief overview of SIFT's hardware
configuration and experimental environment [SRI 84].
3.1. Hardware Configuration
The SIFT architecture is made up of a fully distributed configuration of Bendix BDX-930 processors,
with point-to-point communication links between every pair of processors as shown in Figure 3-1.
Although SIFT was designed and built to accommodate eight processors, there are seven in the current
system. Six of these processors are required for fault-tolerant experimentation; reliability estimations
have demonstrated only six are needed to meet the required safety margin of less than 10^-10 probability of
failure per hour [Palumbo & Butler 85]. The seventh processor is used by the Data Acquisition System
described in the next section.
In a fully distributed system, dependency on shared facilities is kept to a minimum. Therefore, each
SIFT processor contains its own main memory, power supply, clock, and I/O channel. As shown in the
block diagram of Figure 3-2, each processor in the system includes:
• 16-bit CPU.

• 32K words of static random access memory (RAM) which holds the SIFT executive program, the application programs, the transaction and data files, and the control stack.

• 1K datafile memory used as a buffer area for the broadcast and 1553 controller.

• 1K transaction file memory used to hold the destination addresses of the values in the datafile to be transmitted.

• A broadcast controller for interprocessor communication.

• A 1553A controller used to support external I/O to terminals, sensors, or avionics modules.

• A real-time clock consisting of a 16-bit counter driven by a 16MHz crystal; each clock tick is equivalent to 1.6 microseconds.
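A 16-bit counter at 1.6 microseconds per tick wraps roughly every 105 ms, so any interval derived from two clock reads must account for wraparound. A minimal modern-Python sketch of such a computation (illustrative only, not SIFT code):

```python
TICK_US = 1.6       # microseconds per tick, from the SIFT clock spec above
COUNTER_BITS = 16   # counter wraps every 2**16 ticks (~105 ms)

def elapsed_us(start_tick, end_tick):
    """Wraparound-safe elapsed time between two 16-bit clock reads."""
    return ((end_tick - start_tick) % (1 << COUNTER_BITS)) * TICK_US
```

Any single measured interval must stay below the ~105 ms wrap period for this to be unambiguous.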
3.2. SIFT Software
To run an experiment on SIFT, the user writes a task in Pascal on the host computer. Once a task is
written, it is compiled, assembled, and linked with the SIFT operating system. This procedure creates an
absolute executable image file that can be loaded directly onto the selected SIFT processors. Reliability is
achieved by replicating the task on more than one processor. The number of processors chosen is
specified by the user, whose decision is based on the importance of the task.
[Figure 3-1: Block Diagram of SIFT Distributed System]

[Figure 3-2: Block Diagram of a SIFT Processor]

Allocation of a task is done through a user-defined Schedule Table. The Schedule Table lists the set of
tasks that will be periodically dispatched, along with task specific information. It is the user's job to
decide the order tasks are executed, the number of processors used for replication, and the data to be
voted. The user must also specify the "duration" of the task in increments of 1.6 millisecond slots. This
step insures results are broadcast in time for voting. It also prevents a non-faulty processor from being
configured out of the system because of task time-out.
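The duration specification above amounts to rounding a task's worst-case time up to a whole number of 1.6 ms slots. A sketch of that rounding (illustrative Python with a hypothetical helper name, not SIFT code):

```python
import math

SLOT_MS = 1.6  # Schedule Table slot size, from the text above

def slots_needed(worst_case_ms):
    """Smallest whole number of 1.6 ms slots covering a task's
    worst-case duration (hypothetical helper for illustration)."""
    return max(1, math.ceil(worst_case_ms / SLOT_MS))
```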
After execution of a task, the results from each processor are compared, or "voted" on. If all copies are
not the same, an error has occurred. These errors are recorded in the processors' memory to assist the
Executive System in determining which processor is faulty. If an error occurred, the Executive System
isolates the fault by ensuring that only the correct or "majority" value is passed on to the next task.
Fault isolation prevents a faulty unit from causing problems in the system, such as corrupting a non-
faulty processor's memory. If fault isolation is not done, a faulty or "malicious" processor could create a
life threatening situation, such as transmitting an invalid control signal. Once a processor has been found
faulty, the Executive System reassigns the processor's tasks to another processor, thereby configuring it
out of the system.
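The voting step described above can be sketched as a simple majority selection (an illustrative Python sketch, not SIFT's actual Pascal executive code; all names are hypothetical):

```python
from collections import Counter

def majority_vote(copies):
    """Return the majority value among replicated task results and
    the indices of processors whose copy disagreed with it."""
    value, _ = Counter(copies).most_common(1)[0]
    disagree = [i for i, v in enumerate(copies) if v != value]
    return value, disagree
```

With triple replication, a single faulty processor is outvoted and flagged while the majority value propagates to the next task.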
3.3. Experimental Environment
SIFT provides the experimenter with a user-friendly test environment, promoting experimentation
through interactive facilities designed to help prepare, exercise, and observe the system's behavior. Figure
3-3 depicts the test environment as seen by the user. From a terminal linked to the host computer, a
researcher can create and run experiments on SIFT, collect data, print out files, and dump data to an
on-line printer. All communication to and from SIFT is through a VAX-11/750. This host computer is
solely dedicated to SIFT research. NASA also added features to the SIFT environment to enhance
experimental conditions: a Data Acquisition System (DAS) for improved data collection and a
global clock for improved measurement conditions.
DAS is made up of many integrated programs that receive and analyze data from the SIFT processors.
These programs are downloaded to the seventh SIFT processor, which can then control data collection.
Before this system was created, data collection was limited to 4K words of memory. With the Data
Acquisition System, information is sent from the SIFT processors to a disk capable of holding 50K blocks,
a total of 12.8M words. DAS requires some initial preparation, but it features an interface program that
facilitates the task. A preprocessing program is also available which provides the user with the ability to
manipulate the data straight from the disk, and the ability to specify what data to save for later
processing.
The global clock is a 16-bit independent measuring device, simultaneously available to all processors via
a read bus. There is no arbitration for this bus and no contention. It features a programmable time-base
so the user can specify the resolution of the clock (i.e. 1 microsecond, as used in these experiments). The
global clock assures the consistency and reliability of the measurements taken by the processors, since
clock times come from a common external reference.

[Figure 3-3: Block Diagram of the SIFT Test Environment]
4. FTMP and Its Experimental Environment
Since a comparison of SIFT to FTMP will be made, this section will give a brief software overview of
the FTMP system and its experimental environment. Additional information can be found in [Clune
84, Feather 85, Czeck 85].
Figure 4-1 is a block diagram of the FTMP system as seen by the user. Each virtual processor is three
processors tightly coupled and executing in lockstep. Reliability is obtained by having the three
processors (referred to as a "triad") executing code independently and performing hardware votes on the
results. Hence, a triad appears to be a single processor executing a single instruction stream. Within
each processor is a local PROM containing system executive code and initialization data. Also in each
processor is a local RAM containing local data, working stack, and the application code paged in from
system memory. The system memory is also triply redundant, and contains application code and system
data. A quintuply redundant serial bus connects the triads to global memory, I/O devices, a real-time
clock and the error latches.
[Figure 4-1: FTMP Block Diagram, [Czeck 85]]
In addition to the nine processors that make up the three triads, FTMP has a tenth processor that
serves as a spare. When a processor is found faulty, the spare is configured in as a replacement. If
another failure occurs, the damaged triad is retired since there are no more replacements, and its
functioning processors are set aside to be used as spares.
FTMP's experimental environment is slightly more complicated than SIFT's, as illustrated in Figure
4-2. Tasks are written on the VAX and transferred to the IBM, where all compilation of tasks and tables
and task linkage take place. The IBM sends the resulting assembly code, absolute load module, and any
corresponding errors to the VAX. Executable code is down-loaded to FTMP via the CTA program
running in PDP-11 Emulation mode. The Test Adapter is also used for debugging and memory
manipulation on FTMP.
[Figure 4-2: FTMP's Test Environment, [Czeck 85]]
Like SIFT, work on FTMP is divided into tasks. The difference, however, is that tasks on FTMP are
executed at different iteration rates. Tasks are grouped into one of three frame sizes depending on their
priority. For example, updating a display terminal need not be done as frequently as adjusting the
plane's airspeed, so its task will reside in a slower frame. FTMP executes tasks in the highest priority
frame first, moving to lower priority frames when the higher priority tasks are done.
5. The Experiments
The goal of this research was to demonstrate the robustness of the validation methodology used on
FTMP through application to SIFT. As with all research, the success of the experiments was tempered
by the SIFT environment. There were instances where exact duplications proved to be impossible or
irrelevant. For these cases, careful substitutions were made to insure that the new experiments would test
comparable characteristics. Experiments reported in this paper provide a careful study of the baseline
primitives. These experiments fall into four categories:
• Measuring the characteristics of the real time clock.
• Measuring instruction execution times.
• Measuring execution times of instruction combinations.
• Measuring the effects of task stretching.
Figure 5-1 illustrates the basic task used in measuring the baseline parameters. Each processor reads
the global clock and stores the value of the starting time in memory. It then enters the loop where it
executes the statement being tested, LOOPCOUNT times. After the loop terminates, the global clock is
read again and the ending time is stored. Complete experiments incorporated the basic task and were run
as follows:
1. The variable LOOPCOUNT was set to 100. Consequently, the task provided the time, in microseconds, for an instruction to be executed 100 times.

2. The task was repeated 250 times, providing 250 data points per run.

3. Step 2 was repeated 4 times, providing 1000 data points per processor.
All experiments were arbitrarily chosen to run on three processors, producing 3000 points of data for
statistical analysis. The time recorded in the basic task included the overhead for execution of the loop.
Therefore, one experiment was dedicated to measuring the time for a null loop, so that the overhead
could be subtracted from future experiments.
begin
  data[time] := gclock;
  for i := 1 to LOOPCOUNT do
    begin
      { function to be measured }
    end;
  data[time+1] := gclock;
end

Figure 5-1: Basic Task Algorithm
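The reduction from a raw loop time to a per-instruction figure, as described above, can be sketched as follows (illustrative modern Python, not part of the thesis; the numeric test values below are arbitrary, not measured SIFT data):

```python
LOOPCOUNT = 100  # iterations per run, as in the procedure above

def per_instruction_us(loop_time_us, null_loop_us):
    """Per-instruction time: subtract the separately measured
    null-loop overhead, then divide by the iteration count."""
    return (loop_time_us - null_loop_us) / LOOPCOUNT
```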
Once a task was compiled, linked, and down-loaded, the executable image file did not have to be created
again unless a change was made in the code. Through SIFT's interactive facility, changes can be easily
made to a downloaded file, since it allows the user to read and set memory locations directly. This gives
the experimenter the advantage of varying experimental loops without recreating the image file. The only
problem in using shortcuts is that inconsistencies can be generated. For example, the iteration of a loop
can be increased beyond the allocated time specified in the Schedule Table, causing the task to time out
and the experiment to fail. In such an event, the Schedule Table could also be altered through the interactive
facility, but it would mean changing all schedule tables for every configuration for each processor,
possibly totaling up to 24 tables.
5.1. Clock Read Characteristics
In the first category of baseline experiments, the characteristics of the global clock were measured. This
experiment was essential since before the global clock can be used as a measuring tool it must be
ascertained that reading it produces consistent results, or any variations are predictable. To insure that
future experiments using the global clock are valid, repetitive readings of the global clock were performed.
For this experiment, a clock read statement was inserted in the task of Figure 5-1 and iterated 100 times.
Using the experimental procedure described, 3000 data points were collected.
5.2. Instruction Times
In the second category of SIFT baseline experiments, the execution times of various instructions were
measured. This experiment provided the first accurate documentation of SIFT instruction times. In the
past, the user made a random guess as to how much time to allocate to a task. This set of experiments
made it possible to put together an accurate listing by which educated estimations can now be made.
Since the performance of SIFT and FTMP are to be compared, efforts were made to insure that as
many of the applicable instructions tested on FTMP [Clune 84, Feather 85] were also measured on SIFT.
Since the SIFT environment does not provide hardware or software support for real or long word data,
only integer and boolean data types were measured. Instructions tested are listed in Table 5-1.
Each of the instructions in Table 5-1 was executed inside the basic task. The null loop itself was
measured so that the overhead from its execution could be subtracted from the results of the other
statements. Using the standard procedure, 3000 data points per instruction were collected.
5.3. Instruction Combinations
In the third category of baseline experiments, the execution times of instruction combinations were
measured to determine if the results exceeded the worst case time, the sum for executing each instruction
alone. This was an important experiment since in the SIFT operating system the user is responsible for
defining the duration of a task: if instruction combinations take longer than expected, the allocated time
may prove insufficient and the task will time out. It was also of interest to determine if tile system's
compiler takes advantage of optimizations.
• Null loop.
• Integer Assign, A := 1.
• Integer Variable Assign, A := B.
• Integer Addition, A := B + C.
• Integer Multiply, A := B * C.
• Integer Division, A := B div C.
• Integer Negate, A := -B.
• Integer comparisons (greater than/equal to, less than, equal to).
• Boolean Assign, A := True.
• Boolean Variable Assign, A := B.
• Boolean Or, A := B or C.
• Boolean And, A := B and C.
• Boolean Negate, A := NOT B.
• If-then, if-then-else conditional statements.
• Procedure calls with 0 through 4 parameters.
Table 5-1: Instructions Measured on SIFT
For this experiment two approaches were taken. One experiment tested the effect on execution times
when the consecutive iteration of a single instruction was increased. For this case, the integer assign
statement A := 1 was iterated between 1 and 20 times, inside the basic task loop. The second set of experiments
measured the execution times of instruction pair and triple combinations. The instruction combinations
are given in Table 5-2. Each set of instructions was executed in the basic task. Using the standard
procedure, 3000 data points were collected.
• Integer Assign and Integer Add.
• Integer Assign and Integer Multiply.
• Integer Assign and Integer Divide.
• Integer Assign, Add, and Multiply.
• Integer Assign, Multiply, and Divide.
• Integer Assign, Add, and Divide.
• Other combinations duplicating FTMP experiments: Assign-Assign and Addition-Addition.
Table 5-2: Instruction Combinations Tested on SIFT
5.4. Task Stretching
In the last category of baseline experiments, the effects of task stretching were explored. Theoretically,
as long as a task does not exceed its time allocation and a processor does not broadcast bad data, a
processor is considered healthy. If either condition is violated, the processor will be tagged "faulty" and
be configured out. The purpose of task stretching experiments was to determine whether a faulty
processor really will be reconfigured out.
To ensure the task stretching results were accurate, experiments were done to validate the 1.6
millisecond slot size. In one experiment, the time between consecutive tasks was measured by reading the
clock upon entering each task. These tasks were allocated a duration time of one slot per task. In a
second experiment, each task was allocated two slots and run consecutively. In a last experiment, three
one-slot tasks were run back to back and the time was taken upon entering the first task and entering the
third task. For a valid time slot, the results were expected to be consistent no matter how it was
measured.
To measure the effects of task stretching, two approaches were taken. In the first experiment, conditions
for task timeout were explored. Processors 1 and 3 were allowed to complete their task before the
deadline, while processor 2 stretched its task beyond the allocated time. In the second experiment, the
broadcast of bad data was explored. There are two ways in which bad data can be passed: a processor
can broadcast malicious data even though it had enough time allocated to finish the task, or it can
broadcast bad data because the task ran out of time. An experiment was done to test for both cases.
Figure 5-2 shows the program used for the second case.
begin
  data[time] := gclock;
  for i := 1 to LOOPCOUNT do
    begin
      if i = WHEN then
        stobroadcast(passit, 16#ABCD);
      if pid = 2 then
        for j := 1 to STRETCH do
          extraloop := j;
    end;
  data[time+1] := gclock;
end

Figure 5-2: Program Used For Task Stretching - Voting Case
To fully control the two conditions that could trigger a reconfiguration, two variables, STRETCH and
WHEN, were introduced. STRETCH controls the amount the task running on processor 2 is lengthened,
or stretched. WHEN signals processors 1, 2, and 3 to broadcast the hex value ABCD, arbitrarily chosen data.
To test the full effects of task manipulation, STRETCH and WHEN were varied from 1 to 20.
6. Results and FTMP Comparisons
In this section, the results of the SIFT baseline experiments are reported. Where applicable,
comparisons are made with the validation experiments performed on FTMP.
6.1. Read Time Clock Delay
In this experiment, the clock was tested for repeatability. Analysis of the data shows the global clock to
be a reliable measuring tool. Clock read results for the three processors used are shown in Table 6-1.
Read Time Clock Delay: Microseconds per 100 Reads (with Overhead)

Processor    min/max (microseconds)
P1           2855/2856
P2           2855/2856
P3           2857/2860

Table 6-1: SIFT Clock Read Results
As Table 6-1 illustrates, the clock reads differed by only 1 to 3 microseconds within a processor, a
negligible amount. As for the variation between processors, the maximum difference was five
microseconds. This is an excellent result considering the SIFT processors are only loosely coupled. The
difference in clock read time is caused by slightly different processor execution rates.
Execution Time for SIFT Clock Read:
  100 Iterations of 1 Clock Read = 2.86 milliseconds (with null loop overhead)
  1 Clock Read = 17.7 microseconds (without null loop overhead)

Execution Time for FTMP Clock Read: [1]
  16 Iterations of 5 Clock Reads = 13.99 milliseconds (with null loop overhead)
  1 Clock Read = 172 microseconds (without null loop overhead)

Table 6-2: Clock Read Results for SIFT and FTMP

[1] Average of two experiments reported in [Clune 84].
Summaries of SIFT and FTMP clock read results are shown in Table 6-2. For both machines, the
global clock proved to be a reliable measuring device whose delays were predictable and negligible. In
comparison to FTMP, however, SIFT's clock can measure a finer grain of events since it is 10 times faster.
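The per-read figures in Table 6-2 follow directly from subtracting the null-loop overhead from the measured loop totals. A quick sketch of the arithmetic, using the loop totals above and the null-loop costs reported in Tables 6-3 and 6-4:

```python
# SIFT: ~2855 us for 100 iterations of one clock read, loop overhead included;
# the SIFT null loop costs 10.86 us per iteration.
sift_per_read = (2855 - 100 * 10.86) / 100
print(round(sift_per_read, 1))  # 17.7

# FTMP: 13.99 ms for 16 iterations of 5 clock reads; the FTMP null loop
# costs 17.7 us per iteration.
ftmp_per_read = (13990 - 16 * 17.7) / (16 * 5)
print(round(ftmp_per_read))  # 171, close to the reported 172
```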
6.2. Instruction Measurements
Since the performance of SIFT and FTMP are to be compared, efforts were made to ensure that as
many as possible of the applicable instructions tested on FTMP [Clune 84, Feather 85] were also measured on SIFT.
Table 6-3 summarizes the execution times for these instructions. Appendix 1.2 contains complete tables of
instruction execution times and statistical information for each instruction. As shown in Table I-2, in
every case the result was well within the margin of error for a 95% confidence interval. As an example
of a "typical" result, the execution time for the integer assign A := 1 was 3.70 microseconds per instruction.
For 100 instructions (assuming a normal distribution), a 95% confidence interval of ±0.0072 was calculated.2
A histogram of the integer assign A := 1 (Figure 6-1) illustrates that data points were usually within one
microsecond of each other.
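The confidence intervals quoted throughout this section follow the standard normal-approximation formula described in [Ferrari 78]. A minimal sketch, using hypothetical per-100-instruction timings rather than the thesis data:

```python
import math

def ci95_half_width(samples):
    """Half-width of a 95% confidence interval for the mean,
    using the normal approximation (z = 1.96)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)  # sample variance
    return 1.96 * math.sqrt(var / n)

# Hypothetical timings (microseconds per 100 instructions):
timings = [1456, 1457, 1456, 1456, 1457, 1456, 1457, 1456]
print(round(ci95_half_width(timings), 3))
```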
Along with simple instructions, execution times for procedure calls were measured for various numbers of
parameters. To help visualize the results in Table 6-3, Figure 6-2 plots procedure call times against the
number of parameters. An analysis of Figure 6-2 shows that after some slight initial overhead, the
execution time increases linearly with the number of parameters. As a comparison, the results of FTMP's
experiment are plotted on the same graph. Although FTMP's execution time also increases linearly, it has
474% more overhead than SIFT's. Since FTMP is a stack machine, it executes extra instructions that
SIFT does not. It must push the number of parameters on the stack before executing a return statement.
The return statement must then pop this number of parameters so it can adjust the stack pointer before
returning control to the calling program, thereby removing parameters no longer needed.
As an overall comparison, the execution speed of SIFT instructions is listed alongside FTMP's in Table
6-4 and illustrated in Figure 6-3. Although SIFT requires more time to negate variables, it is faster at all
other instructions, including procedure calls. For example, when executing an "OR" function, SIFT is
219% faster. This disparity is due to the differences in compilers. Whereas SIFT's compiler simply loads
the variables into two registers and ORs them, FTMP's compiler tests each variable separately and
executes code depending on the outcome of the test (i.e., if the first variable is true, it jumps without
testing the other). The worst case is when both variables are false: it must test both variables before it can
jump. An unweighted average (assuming all instructions tested are equally likely) shows that SIFT is
129% faster than FTMP in executing instructions.
2 Refer to [Ferrari 78] for a description of confidence intervals and how they are calculated.
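The 129% figure above is the unweighted mean of the percent-difference column of Table 6-4; it can be checked directly:

```python
# Percent differences from Table 6-4, in table order:
diffs = [8.1, 25.3, 55.0, 60.7, 4.2, -26.2, 172.6, 142.3, 124.3,
         8.1, 25.3, 219.3, 206.2, 74.1, 63.0,
         473.6, 638.6, 262.1, 211.8, 182.9,
         29.5, 35.1, 58.7, 33.0]
avg = sum(diffs) / len(diffs)
print(round(avg))  # 129
```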
Summary of SIFT Instruction Execution Times
(Without Null Loop Overhead)

Pascal Instruction           Description              Microsecs Per Instruction
A := 1                       Integer Assign           3.70
A := B                       Integer Variable Assign  4.39
A := B + C                   Integer Addition         6.45
A := B * C                   Integer Multiply         12.57
A := B div C                 Integer Division         20.83
A := -B                      Integer Negate           9.48
A := B = C                   Integer Compare          8.51
A := B >= C                  Integer Compare          9.70
A := B < C                   Integer Compare          9.45
A := True                    Boolean Assign           3.70
A := B                       Boolean Variable Assign  4.39
A := B or C                  Boolean Or               6.89
A := B and C                 Boolean And              6.89
A := NOT B                   Boolean Negate           6.26
NULL                         Null Loop                10.86
Procall()                    Procedure Call           6.45
Procall(A)                   Procedure Call           7.00
Procall(A,B)                 Procedure Call           15.88
Procall(A,B,C)               Procedure Call           20.27
Procall(A,B,C,D)             Procedure Call           24.39
If GO then A:=1              Conditional, True        6.95
If GO then A:=1              Conditional, False       3.70
If GO then A:=1 Else B:=1    Conditional, True        8.32
If GO then A:=1 Else B:=1    Conditional, False       7.14

Table 6-3: Summary of SIFT Instruction Execution Times
6.3. Instruction Combination Measurements
This experiment measured instruction combinations to determine if their combined execution times
exceeded worst-case results. This experiment also uncovered compiler optimizations. The first part of
this experiment measured the execution time for the integer assign instruction A := 1 as its consecutive
iteration inside the basic task loop was increased from 1 to 20. Figure 6-4 illustrates the results.
Inspection of Figure 6-4 shows that for SIFT, although execution time increases linearly with the number
of iterations, the slope reflects a small amount of compiler optimization. An analysis of the assembly
code shows that the savings occurs because the compiler loads a register with the value 1 the first time and
uses stores to assign 1 to A thereafter, as illustrated in Table 6-5.
[Figure: histogram of frequency of readings, 1450-1460 microseconds per hundred instructions]
Figure 6-1: Frequency vs. Microseconds per 100 A := 1 Iterations
[Figure: microseconds per instruction vs. number of parameters (0-4) for SIFT and FTMP procedure calls]
Figure 6-2: Procedure Calls vs. Parameters
Instruction Execution Times: SIFT vs. FTMP

Pascal Instruction           Description              SIFT     FTMP    Percent Difference
A := 1                       Integer Assign           3.70     4.0     8.1%
A := B                       Integer Variable Assign  4.39     5.5     25.3%
A := B + C                   Integer Addition         6.45     10.0    55.0%
A := B * C                   Integer Multiply         12.57    20.2    60.7%
A := B div C                 Integer Division         20.83    21.7    4.2%
A := -B                      Integer Negate           9.48     7.0     -26.2%
A := B = C                   Integer Compare          8.51     23.2    172.6%
A := B >= C                  Integer Compare          9.70     23.5    142.3%
A := B < C                   Integer Compare          9.45     21.2    124.3%
A := True                    Boolean Assign           3.70     4.0     8.1%
A := B                       Boolean Variable Assign  4.39     5.5     25.3%
A := B or C                  Boolean Or               6.89     22.0    219.3%
A := B and C                 Boolean And              6.89     21.1    206.2%
A := NOT B                   Boolean Negate           6.26     10.9    74.1%
NULL                         Null Loop                10.86    17.7    63.0%
Procall()                    Procedure Call           6.45     37.0    473.6%
Procall(A)                   Procedure Call           7.00     51.7    638.6%
Procall(A,B)                 Procedure Call           15.88    57.5    262.1%
Procall(A,B,C)               Procedure Call           20.27    63.2    211.8%
Procall(A,B,C,D)             Procedure Call           24.39    69.0    182.9%
If GO then A:=1              Conditional, True        6.95     9.0     29.5%
If GO then A:=1              Conditional, False       3.70     5.5     35.1%
If GO then A:=1 Else B:=1    Conditional, True        8.32     13.2    58.7%
If GO then A:=1 Else B:=1    Conditional, False       7.14     9.5     33.0%

Table 6-4: Instruction Times: SIFT vs. FTMP
In comparison, the results of FTMP's experiment show that although FTMP's graph is also linear, there
is no compiler optimization. This result is plotted alongside SIFT's results in Figure 6-4. Since FTMP is a
stack machine, consecutive stores are not done. As shown in Table 6-5, FTMP must execute a push and
pop for every instruction. Consequently, although SIFT and FTMP start off with a similar execution
time, by the 20th iteration SIFT is done 94% sooner.
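The 94% figure can be reproduced from the numbers already reported: with no optimization, FTMP pays its full 4.0-microsecond integer-assign cost on every iteration, while SIFT's measured 20-iteration time (Table I-6, overhead removed) is 41.34 microseconds. A quick check:

```python
sift_20 = 41.34      # us for 20 consecutive A := 1 on SIFT, without loop overhead
ftmp_20 = 20 * 4.0   # FTMP repeats the full 4.0 us push/pop assign each time
pct = (ftmp_20 - sift_20) / sift_20 * 100
print(round(pct))  # 94
```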
In the second set of experiments the execution times of instruction pair and triple combinations were
measured. Complete statistical results for these combinations are included in Appendix 1.3. Table 6-6 is
a summary of these results. Analysis shows that none of the combinations exceeded the expected time
limit; the small savings shown in the first five combinations was probably due to experimental error, since
analysis of the assembly code showed no compiler optimization. The assign-multiply-divide combination
showed true savings, since the compiler did not have to reload C after the multiplication was done.
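The percent-difference entries in Table 6-6 are computed relative to the combined time; for example, for the assign-and-add and assign-multiply-divide cases:

```python
def pct_saving(separate_us, combined_us):
    # Table 6-6 convention: savings relative to the combined execution time
    return (separate_us - combined_us) / combined_us * 100

print(round(pct_saving(10.15, 9.88), 1))   # 2.7
print(round(pct_saving(37.10, 34.78), 1))  # 6.7
```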
[Figure: microseconds per instruction for SIFT and FTMP across instruction types:
integer assign, variable assign, negate, add, or, and, compare, multiply, divide]
Figure 6-3: Graph of Instruction Times: SIFT vs. FTMP
[Figure: microseconds per instruction vs. number of consecutive iterations (1-20) of A := 1, for SIFT and FTMP]
Figure 6-4: A = 1 vs. Consecutive Executions
Instruction    SIFT           FTMP
A = 1          Load 1,R1      Push 1
               Store R1,A     Pop A
               Store R1,A     Push 1
               Store R1,A     Pop A
               Store R1,A     Push 1
               Store R1,A     Pop A
                              Push 1
                              Pop A
                              Push 1
                              Pop A

Table 6-5: SIFT vs. FTMP for Integer Assign A = 1 (five consecutive iterations shown)
Comparisons of Instruction Combinations Not Tested on FTMP
(in microseconds per instruction without overhead)

Pascal Instruction   Description          Time If Done   Time For      Percent Difference
                                          Separately     Combination   (Separate vs. Combo)
A := 1               Assign & Add         10.15          9.88          2.7%
A := B + C           Combination

A := 1               Assign & Mult        16.27          16.01         1.6%
A := B * C           Combination

A := 1               Assign & Div         24.53          24.27         1.1%
A := B div C         Combination

A := 1               Assign, Add, Mult    22.72          22.46         1.2%
A := B + C           Combination
A := B * C

A := 1               Assign, Add, Div     30.98          30.45         1.7%
A := B + C           Combination
A := B / C

A := 1               Assign, Mult, Div    37.1           34.78         6.7%
A := B * C           Combination
A := B / C

Table 6-6: SIFT: Comparison of Instruction Combinations Not Done on FTMP
The architectural difference between register and stack machines was witnessed when instruction
combinations performed on FTMP were applied to SIFT. Table 6-7 and Table 6-8 show the results of
these combinations. Table 6-9 is an example illustrating a representative instruction combination.
SIFT's compiler uses register allocation to avoid unnecessary loads and stores, whereas FTMP's must
push items on the stack each time. In general, the only optimization FTMP features is a duplicate store:
in some cases, if a variable is going to be used twice, it is duplicated instead of stored and reloaded.
SIFT Combination Comparisons
(in microseconds per instruction without overhead)

Pascal Instruction   Description   Time If Done   Time For      Percent Difference
                                   Separately     Combination   (Separate vs. Combo)
B := 2               Assign        7.4            5.76          28.5%
C := 2               Combination

B := C + D           Addition      12.9           8.95          44.1%
E := C + D           Combination

B := C + D           Addition      12.9           12.63         2.1%
E := F + A           Combination

B := C + D           Addition      12.9           12.63         2.1%
E := A + B           Combination

B := 2               Assign        8.09           7.82          3.4%
C := B               Combination

B := 2               Assign        8.09           7.82          3.4%
C := D               Combination

Table 6-7: SIFT: Comparison Between Single Instructions and Combinations
FTMP Combination Comparisons
(in microseconds per instruction without overhead)

HLL Instruction      Description   Time If Done   Time For      Percent Difference
                                   Separately     Combination   (Separate vs. Combo)
B = 2                Assign        8.0            8.0           0.0%
C = 2                Combination

B = C + D            Addition      20.0           20.0          0.0%
E = C + D            Combination

B = C + D            Addition      20.0           21.0          -5.0%
E = F + A            Combination

B = C + D            Addition      20.0           21.0          -5.0%
E = A + B            Combination

B = C + D            Addition      20.0           19.5          2.5%
E = B + A            Combination

B = 2                Assign        9.5            6.5           31.5%
C = B                Combination

B = 2                Assign        9.5            9.5           0.0%
C = D                Combination

Table 6-8: FTMP: Comparison Between Single Instructions and Combinations
Instruction    SIFT           FTMP
B = C + D      Load C,R1      Push C
E = C + D      Add D,R1       Push D
               Store R1,B     Add
               Store R1,E     Pop B
                              Push C
                              Push D
                              Add
                              Pop E

Table 6-9: SIFT vs. FTMP in Addition Combination
6.4. Task Stretching Results
The experiments conducted to validate the slot size produced consistent results. This guaranteed all
tasks on all processors were allocated equivalent slot sizes, in multiples of 1.6 milliseconds. The first task
stretching experiment explored only the condition of a task not meeting its time allocation. For this
experiment, the system behaved as predicted--the straying processor was halted when its task took longer
than the scheduler allowed.
In the second experiment, voting was introduced. One experiment tested the system's reaction to a
processor that broadcast malicious data even though enough time was allocated for it to finish its task.
As expected, the processor was configured out of the system. Another experiment explored the case where
a processor was forced to broadcast invalid data because its task timed out. This result is summarized
in Table 6-10. For this case, the processor was reconfigured out not only because it passed bad data, but
also because it timed out.
The results of the task stretching experiments proved that SIFT protects itself against faulty or
"malicious" processors by reconfiguring them out of the system. In comparison, when this type of
experiment was executed on FTMP, the processes were never halted and the frame stretched to
infinity [Clune 84].
SIFT Task Stretching Results
(Task Execution Time in Milliseconds)

WHEN    STRETCH    Task Execution Time (max)        P2 Reconfigured Out
                   P1      P2          P3
1       5          2.36    9.77        2.31         No
10      5          2.36    9.78        2.31         No
50      5          2.38    9.80        2.35         No
100     5          2.36    9.76        2.31         No
1       7          2.36    12.32       2.31         No
10      7          2.36    12.33       2.31         No
50      7          2.38    12.35       2.35         No
100     7          2.36    12.31       2.31         No
1       8          2.36    timed out   2.31         Yes
10      8          2.36    timed out   2.31         Yes
50      8          2.38    timed out   2.35         Yes
100     8          2.36    timed out   2.31         Yes
1       10         2.36    timed out   2.31         Yes
10      10         2.36    timed out   2.31         Yes
50      10         2.38    timed out   2.35         Yes
100     10         2.36    timed out   2.31         Yes
1       100        3.77    timed out   3.70         Yes
10      100        3.78    timed out   3.71         Yes
50      100        3.82    timed out   3.76         Yes
100     100        3.76    timed out   3.70         Yes

Table 6-10: SIFT: Task Stretching Results
7. Future Work
Although application of the validation methodology to SIFT has thus far proven successful, it is by no
means complete. The following sections discuss a few thoughts on the remaining work.
7.1. Baseline Experiments
To complete the baseline experiments, an experiment on task interaction still remains to be done. The
experiment, as it was conducted on FTMP [Clune 84], is not appropriate for SIFT. Therefore, a careful
modification must be made to ensure that comparable characteristics are tested. In one unsuccessful
attempt, a task was executed on a single set of processors and an attempt was made to start up the next
task on another set of processors, so that switching time could be measured. Unfortunately, after a few
unsuccessful runs, it was realized that halting a set of processors to start another set is equivalent to
crashing the system. Therefore, to perform this experiment another approach must be tried.
Measurements should include the time it takes to switch from one task to another on one set of
processors, and if possible, the time it takes to switch from one set of processors executing a task to
another set of processors executing the next task.
7.2. Synthetic Workload
To date, no work on the synthetic workload has been attempted. The implementation of the synthetic
workload developed for FTMP must be studied [Feather 85], and a comparable experiment must be
designed for SIFT. A synthetic workload is a set of tasks that exercise a computer the same way a
natural workload would, but without all the complexity. It is easier to implement than a natural workload,
since it uses simple and repetitive instructions, thus making it easier to debug. Also, modifications to a
synthetic workload can be made readily since it is controlled with parameters. Although implementing a
synthetic workload is a means of measuring performance, once a synthetic workload is running on SIFT,
it could prove to be a valuable stepping stone to more sophisticated experiments. For example, one idea
presented during the FTMP experiments [Feather 85] was to integrate the synthetic workload with the
fault-injection experiments.
8. Conclusions
The purpose of this research was to demonstrate the robustness of the validation methodology by
reporting the results of its application to SIFT. This report was also to show that by using identical
baseline experiments, the performance of two architecturally different systems could be directly compared.
Application of the methodology was successful. As with all research, the success of the experiments was
tempered by the environment, but careful substitutions were made to ensure that new experiments would
test comparable characteristics. Using the methodology made it possible to compare SIFT and FTMP.
The following is a brief summary of the results:
1. Clock Read Delay
   • Like FTMP, SIFT's global clock proved to be a very reliable measuring device whose
     delays were predictable and negligible. In comparison to FTMP, however, SIFT's
     clock can measure a finer grain of events since it is 10 times faster.
2. Instruction Execution Times
   • Although SIFT requires more time to negate variables, it is faster at all other
     instructions, including procedure calls. Overall, SIFT executes instructions 129%
     faster than FTMP.
3. Instruction Combinations
   • Because SIFT is a register machine, its compiler is able to optimize for cases where
     FTMP's compiler cannot. In fact, the only optimization FTMP features is a duplicate
     store.
4. Task Stretching
   • SIFT handles "malicious" processors exactly as predicted: if a task does not complete
     before its allocation of 1.6 millisecond time frames, or if it broadcasts bad data, it will
     be reconfigured out of the system. In comparison, when this type of experiment was
     executed on FTMP, the processes were never halted and the frame stretched to infinity.
These experiments have shown that by applying a building block approach in a systematic manner, a
fault-tolerant system can be validated through manageable levels of experimentation. One conclusion
that can be drawn is that, thus far, the methodology has proven to be general enough to apply to SIFT,
and has produced results that were directly comparable to previous FTMP experiments.
I. Appendix
1.1 Clock Read Dump
1.2 Statistical Data on Instruction Times
1.3 Statistical Data on Instruction Combinations
1.1. Clock Read Dump
Raw Data: Read Time Clock Delay
(Microseconds Per 100 Clock Reads; Including Null Loop Overhead)

Processor    Actual Readings (Hex)           Microseconds
             Starting Time   Ending Time     (Decimal)
P1           53F6            5F1D            2855
P2           53F3            5F1B            2856
P3           53F0            5F1A            2858
P1           B4BC            BFE4            2856
P2           B4B7            BFDF            2856
P3           B4B9            BFE5            2860
P1           C519            D041            2856
P2           C51A            D042            2856
P3           C517            D040            2857
P1           F694            01BC            2856
P2           F694            01BB            2855
P3           F691            01BC            2859

Table I-1: Raw Data: SIFT Clock Read Experiment
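The decimal column of Table I-1 can be reproduced from the hex readings, assuming the global clock is a 16-bit counter ticking once per microsecond; the last three rows, whose ending times are smaller than their starting times, suggest the counter wraps past FFFF:

```python
def elapsed_us(start_hex, end_hex):
    # Difference of two 16-bit counter readings, taken modulo 2**16
    # so that a reading past FFFF wraps correctly.
    return (int(end_hex, 16) - int(start_hex, 16)) % 0x10000

print(elapsed_us("53F6", "5F1D"))  # 2855, the first P1 row
print(elapsed_us("F694", "01BC"))  # 2856, a row where the counter wrapped
```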
1.2. Statistical Data on Instruction Times
Instruction Execution Time: Integer and Boolean
(Range for 95% Confidence Interval)

Pascal Instruction   Description              microsecs/100       microsecs per instruction
                                              inst. w/overhead    w/overhead   w/o overhead
A := 1               Integer Assign           1456.28 ±.0072      14.56        3.70
A := B               Integer Variable Assign  1525.00 ±.0047      15.25        4.39
A := B + C           Integer Addition         1731.19 ±.0056      17.31        6.45
A := B * C           Integer Multiply         2343.46 ±.0089      23.43        12.57
A := B div C         Integer Division         3169.47 ±.0089      31.69        20.83
A := -B              Integer Negate           2031.08 ±.0036      20.31        9.48
A := B = C           Integer Compare          1937.37 ±.0083      19.37        8.51
A := B >= C          Integer Compare          2056.07 ±.0033      20.56        9.70
A := B < C           Integer Compare          2031.08 ±.0036      20.31        9.45
A := True            Boolean Assign           1456.26 ±.0069      14.56        3.70
A := B               Boolean Variable Assign  1526.26 ±.0036      15.26        4.39
A := B or C          Boolean Or               1774.96 ±.0035      17.75        6.89
A := B and C         Boolean And              1774.94 ±.0037      17.75        6.89
A := NOT B           Boolean Negate           1712.43 ±.0088      17.12        6.26

Table I-2: Instruction Execution Time: Integer and Boolean Data Types

Statistical Information: Integer and Boolean
(microseconds per 100 instructions - with null loop overhead)

Pascal Instruction   Description              min/max     variance   standard deviation
A := 1               Integer Assign           1456/1457   .201       .448
A := B               Integer Variable Assign  1524/1526   .130       .361
A := B + C           Integer Addition         1731/1732   .156       .395
A := B * C           Integer Multiply         2343/2344   .248       .498
A := B div C         Integer Division         3169/3170   .249       .500
A := -B              Integer Negate           2031/2032   .101       .318
A := B = C           Integer Compare          1937/1938   .232       .482
A := B >= C          Integer Compare          2056/2057   .092       .304
A := B < C           Integer Compare          2031/2032   .101       .318
A := True            Boolean Assign           1456/1457   .194       .441
A := B               Boolean Variable Assign  1526/1527   .102       .319
A := B or C          Boolean Or               1774/1775   .097       .311
A := B and C         Boolean And              1774/1775   .103       .320
A := NOT B           Boolean Negate           1712/1713   .245       .495

Table I-3: Statistical Information: Integer and Boolean Data Types
Instruction Execution Time: Miscellaneous
(Range for 95% Confidence Interval)

Pascal Instruction           Description         microsecs/100       avg. microsecs per instruction
                                                 instr. w/overhead   w/overhead   w/o overhead
NULL                         Null Loop           1086.38 ±.0085      -            10.86
Procall()                    Procedure Call      1731.19 ±.0054      17.31        6.45
Procall(A)                   Procedure Call      1785.92 ±.0064      17.86        7.00
Procall(A,B)                 Procedure Call      2674.57 ±.0088      26.75        15.88
Procall(A,B,C)               Procedure Call      3113.23 ±.0065      31.13        20.27
Procall(A,B,C,D)             Procedure Call      3525.58 ±.0087      35.26        24.39
If GO then A:=1              Conditional, True   1781.18 ±.0052      17.81        6.95
If GO then A:=1              Conditional, False  1456.29 ±.0074      14.56        3.70
If GO then A:=1 Else B:=1    Conditional, True   1918.62 ±.0084      19.19        8.32
If GO then A:=1 Else B:=1    Conditional, False  1799.90 ±.0042      18.00        7.14

Table I-4: Instruction Execution Time: Miscellaneous Instructions

Statistical Information: Miscellaneous
(microseconds per 100 instructions - with overhead)

Pascal Instruction           Description         min/max     variance   standard deviation
NULL                         Null Loop           1086/1087   .236       .486
Procall()                    Procedure Call      1731/1732   .151       .389
Procall(A)                   Procedure Call      2262/2263   .179       .423
Procall(A,B)                 Procedure Call      2674/2675   .245       .495
Procall(A,B,C)               Procedure Call      3112/3114   .183       .428
Procall(A,B,C,D)             Procedure Call      3525/3526   .243       .493
If GO then A:=1              Conditional, True   1781/1782   .145       .380
If GO then A:=1              Conditional, False  1456/1457   .207       .455
If GO then A:=1 Else B:=1    Conditional, True   1918/1919   .235       .485
If GO then A:=1 Else B:=1    Conditional, False  1799/1801   .117       .342

Table I-5: Statistical Information: Miscellaneous Instructions
1.3. Statistical Data on Instruction Combinations
Instruction Execution Time: Combinations
(Range for 95% Confidence Interval)

Pascal Instruction   Description        microsecs/100       microseconds per instruction
                                        instr. w/overhead   w/overhead   w/o overhead
A := 1               1 Iteration        1456.28 ±.0072      14.56        3.70
                     2 Iterations       1662.48 ±.0089      16.62        5.76
                     3 Iterations       1868.63 ±.0083      18.69        7.82
                     5 Iterations       2280.98 ±.0034      22.81        11.95
                     8 Iterations       2899.51 ±.0089      29.00        18.13
                     10 Iterations      3338.17 ±.0065      33.38        22.52
                     12 Iterations      3750.52 ±.0089      37.51        26.64
                     15 Iterations      4369.00 ±.0358      43.69        32.83
                     20 Iterations      5219.97 ±.0216      52.20        41.34

A := 1               Assign & Add       2074.83 ±.0049      20.75        9.88
A := B + C           Combination

A := 1               Assign & Mult      2687.12 ±.0043      26.87        16.01
A := B * C           Combination

A := 1               Assign & Div       3513.13 ±.0077      35.13        24.27
A := B div C         Combination

A := 1               Assign, Add, Mult  3331.93 ±.0067      33.32        22.46
A := B + C           Combination
A := B * C

A := 1               Assign, Add, Div   4131.63 ±.0020      41.32        30.45
A := B + C           Combination
A := B / C

A := 1               Assign, Mult, Div  4564.03 ±.0170      45.64        34.78
A := B * C           Combination
A := B / C

Table I-6: Instruction Execution Time: Instruction Combinations
Statistical Information: Combinations
(microseconds per 100 instructions - with overhead)

Pascal Instruction   Description        min/max     variance   standard deviation
A := 1               1 Iteration        1456/1457   .201       .448
                     2 Iterations       1662/1663   .250       .500
                     3 Iterations       1868/1869   .232       .482
                     5 Iterations       2280/2282   .095       .308
                     8 Iterations       2899/2900   .250       .500
                     10 Iterations      3337/3339   .181       .425
                     12 Iterations      3750/3751   .250       .500
                     15 Iterations      4368/4370   1.000      1.000
                     20 Iterations      5219/5221   .603       .777

A := 1               Assign & Add       2074/2075   .138       .372
A := B + C           Combination

A := 1               Assign & Mult      2686/2688   .119       .345
A := B * C           Combination

A := 1               Assign & Div       3512/3514   .215       .464
A := B div C         Combination

A := 1               Assign, Add, Mult  3331/3333   .186       .431
A := B + C           Combination
A := B * C

A := 1               Assign, Add, Div   4131/4132   .603       .776
A := B + C           Combination
A := B / C

A := 1               Assign, Mult, Div  4563/4565   .481       .693
A := B * C           Combination
A := B / C

Table I-7: Statistical Information: Instruction Combinations
References

[Butler 84]    Ricky W. Butler and Sally C. Johnson.
               Validation of a Fault-Tolerant Clock Synchronization System.
               NASA-Langley Research Center, 1984.
               NASA Technical Paper 2346.

[Clune 84]     Ed Clune.
               Analysis of the Fault Free Behavior of the FTMP Multiprocessor System.
               Master's thesis, Carnegie-Mellon University, 1984.

[Czeck 85]     Ed Czeck.
               Fault Free Performance Validation of a Fault Tolerant Multiprocessor: Baseline and
               Synthetic Workload Measurements.
               Master's thesis, Carnegie-Mellon University, 1985.

[Feather 85]   Frank E. Feather.
               Validation of a Fault-Tolerant Multiprocessor: Baseline Experiments and Workload
               Implementation.
               Master's thesis, Carnegie-Mellon University, 1985.

[Ferrari 78]   Domenico Ferrari.
               Computer Systems Performance Evaluation.
               Prentice-Hall, 1978.

[Green 84]     David F. Green, Jr., Daniel L. Palumbo, and Daniel W. Baltrus.
               Software Implemented Fault Tolerance (SIFT) User's Guide.
               NASA-Langley Research Center, 1984.
               NASA Technical Memorandum 86289.

[NASA 79]      Research Triangle Institute.
               Validation Methods for Fault-Tolerant Avionics and Control Systems - Working Group
               Meeting II.
               NASA-Langley Research Center, 1979.
               NASA Conference Publication 2130.

[Palumbo 85]   Daniel L. Palumbo.
               The SIFT Hardware/Software Systems.
               NASA-Langley Research Center, 1985.
               NASA Technical Memorandum 87574.

[Palumbo & Butler 85]
               Daniel L. Palumbo and Ricky W. Butler.
               Measurement of SIFT Operating System Overhead.
               NASA-Langley Research Center, 1985.
               NASA Technical Memorandum 86322.

[Shin & Krishna 84]
               Kang G. Shin and C. M. Krishna.
               Characterization of Real-Time Computers.
               NASA-Langley Research Center, 1984.
               Contract Report (CR) 3807.

[Siewiorek 82] Daniel P. Siewiorek, C. Gordon Bell, and Allen Newell.
               Computer Structures: Principles and Examples.
               McGraw-Hill Book Company, 1982.

[Siewiorek & Swarz 82]
               Daniel P. Siewiorek and Robert S. Swarz.
               The Theory and Practice of Reliable System Design.
               Digital Press, 1982.

[Smith & Lala 82]
               T. Basil Smith, III and J. H. Lala.
               Development and Evaluation of a Fault-Tolerant Multiprocessor (FTMP) Computer.
               The Charles Stark Draper Laboratory, Inc., 1982.
               Contract Number NAS1-15336.

[Smith & Lala 83]
               T. Basil Smith, III and J. H. Lala.
               Development and Evaluation of a Fault-Tolerant Multiprocessor (FTMP) Computer.
               The Charles Stark Draper Laboratory, Inc., 1983.
               NASA Contractor Report 166071.

[SRI 81]       Hierarchical Specification of the SIFT Flight Control System.
               SRI International, 1981.
               Technical Report CSL-123.

[SRI 82]       Investigation, Development, and Evaluation of Performance Proving for
               Fault-Tolerant Computers.
               SRI International, 1982.
               Contract Number NAS1-15528.

[SRI 84]       Development and Analysis of the Software Implemented Fault Tolerance (SIFT)
               Computer.
               SRI International, 1984.
               Contract Report 172146.

[Starcom 80]   Pascal* (v2.0) User's Manual for the Bendix BDX-980.
               Starcom Associates, 1980.

[Wensley 78]   John H. Wensley, Milton W. Green, Robert E. Shostak, Leslie Lamport, Karl N.
               Levitt, Charles B. Weinstock, Jack Goldberg, and P. M. Melliar-Smith.
               SIFT: Design and Analysis of a Fault-Tolerant Computer for Aircraft Control.
               Proceedings of the IEEE, October 1978.

[Wyle 84]      Software Users' Manual for the AIRLAB SIFT Scheduler.
               Wyle Laboratories, 1984.
               Document Number SD63148-141R0-D3.