Date post: | 13-Dec-2015 |
Final Presentation
DigiSat Reliable Computer –
Multiprocessor Control System, Part B.
Niv Best, Shai Israeli
Instructor: Oren Kerem, (Isaschar Walter)
HS-DS Lab, Technion, Winter 2004
Project Goals
Design & implement a hardware mechanism for multiprocessor monitoring & control.
Part of the DigiSat reliable computer project.
Part A Goals
Familiarize ourselves with the PowerPC 405 core and the Virtex II-pro development board.
Study the various lab tools available to us.
Conceive a monitoring scheme for a multiprocessor system.
Part B Goals
Find ways to monitor as many PowerPC & PLB signals as possible.
Simulate a triple processor system.
Develop an error simulation model.
Implement the monitoring system.
Test & debug.
The DigiSat Computer
PowerPC based.
Implemented upon the Virtex II-pro platform.
Hardware redundancy throughout the entire system.
Our project handles processor redundancy.
Description
Satellites contain redundant hardware since servicing in space is not feasible.
A monitoring system is required to identify & handle malfunctions.
Must be implemented in hardware.
DigiSat Computer
[Figure: DigiSat computer block diagram. Labels: PPC 1, PPC 2 and PPC 3, each with its own PLB; M1, M2, S1 and S2 (three copies of each); a Comparator with its own PLB.]
PowerPC 405
32-bit RISC core.
Low power consumption.
Used in various system-on-chip (SoC) applications (PDAs, network routers, cellular phones…).
Embedded within the Virtex II-pro platform.
Triple Modular Redundancy
3 processors running in parallel.
The processors’ signals are constantly monitored and compared.
Errors are detected via a majority vote.
Upon error detection an appropriate reset signal is sent to the faulty processor.
Triple Modular Redundancy
After reset, the faulty processor is loaded with an image of the other 2 processors.
Operation is resumed – 3 processors, identical in state, running in parallel.
NOTE: PowerPC 405 does not support image loading & dumping. Our project focuses on error detection.
Triple Modular Redundancy
[Figure: TMR block diagram. Labels: PPC 1, PPC 2 and PPC 3 on PLB 1, PLB 2 and PLB 3; Comparator; “Brain”; output PLB.]
Actual Implementation
Requires a development board with 3 processors.
Available board contains only one.
A 3 processor system needs to be simulated.
Random errors are generated in order to test the system.
Actual Implementation
[Figure: actual implementation block diagram. Labels: PPC; Bus Multiplier (marked “Multiple Processor Simulation”); PLB 1, PLB 2 and PLB 3; Random Error Generator; Comparator; Brain; output PLB.]
Detailed System Diagram
[Figure: detailed system diagram. Labels: PPC; PLB Arbiter; Signal Collector; Bus Multiplier; Corrector Arc; PLB Sigs; Bus Collector; Error Generator; Bus 1, Bus 2 and Bus 3; Comparator; three WD (watchdog) units; Output Bus; Single Sampler; Reset Resolver; TMR Brain; Capturer; PPC Reset Signals.]
Signal Collection & Concatenation
Unit Name          Source                       Width
Signal_Collector   PowerPC                      23
Corrector_Arc      PLB Arbiter                  133 x 3
PLB_Sigs           Corrector_Arc                133
Bus_Multiplier     Signal_Collector             23 x 3
Bus_Collector      PLB_Sigs & Bus_Multiplier    156
Random Error Generator
Receives the 3 identical buses from the “big bus” multipliers, and applies an error to one of the buses at random.
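The behaviour described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the slides do not say what kind of error is applied, so a single bit flip is assumed here, and the function and variable names are hypothetical. The bus width of 156 is the width listed for Bus_Collector.

```python
import random

BUS_WIDTH = 156  # width listed for Bus_Collector in the slides


def inject_random_error(buses, rng):
    """Return (corrupted_buses, victim_index) for 3 buses given as integers.

    Assumption: the error is a single bit flip. The slides only state that
    an error is applied to one of the buses at random.
    """
    if len(buses) != 3:
        raise ValueError("expected exactly 3 buses")
    victim = rng.randrange(3)        # which bus is corrupted
    bit = rng.randrange(BUS_WIDTH)   # which bit of that bus is flipped
    corrupted = list(buses)
    corrupted[victim] ^= 1 << bit
    return corrupted, victim


rng = random.Random(0)               # seeded so the run is repeatable
clean = [0x5A5A] * 3                 # 3 identical buses
corrupted, victim = inject_random_error(clean, rng)
```

After the call, exactly one of the three buses differs from the original value, which is the situation the comparator is meant to detect.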
Comparator
Compares the 3 buses and outputs the majority outcome of the 3.
Also outputs a “comparison report”, which is the logical XOR of each bus with the majority vote.
Comparison Formula: (in1 AND in2) OR (in2 AND in3) OR (in1 AND in3)
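As an illustration, the formula and the comparison report can be modelled bitwise in software. This is a sketch rather than the project's hardware description: buses are represented as Python integers, and the names `majority` and `compare` are hypothetical.

```python
def majority(in1, in2, in3):
    """Bitwise majority vote: (in1 AND in2) OR (in2 AND in3) OR (in1 AND in3)."""
    return (in1 & in2) | (in2 & in3) | (in1 & in3)


def compare(in1, in2, in3):
    """Return the majority bus and the comparison report.

    The report holds one word per bus: the XOR of that bus with the
    majority vote, so a non-zero word marks a disagreeing bus.
    """
    vote = majority(in1, in2, in3)
    report = (in1 ^ vote, in2 ^ vote, in3 ^ vote)
    return vote, report


# Bus 3 differs from the other two in one bit.
vote, report = compare(0b1010, 0b1010, 0b1110)
```

In this example the vote equals the value shared by buses 1 and 2, and only the third report word is non-zero.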
Single Sample
Generates a very short error (one clock cycle).
Relays the output signals from the comparator to the reset resolver as usual until a flag signal is asserted.
Upon assertion continues relaying the signals for one more clock cycle and then zeros the signals as if there was no error at all.
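A cycle-by-cycle software model may make the timing clearer. It is a sketch under assumptions: the cycle in which the flag is first seen asserted is treated as the "one more clock cycle" of relaying, normal relaying is assumed to resume when the flag is released, and the class name is hypothetical.

```python
class SingleSampler:
    """Relays comparator outputs, then zeros them one cycle after the flag rises."""

    def __init__(self):
        self._flag_seen = False  # True once the flag was asserted on an earlier cycle

    def step(self, signals, flag):
        """Advance one clock cycle and return the signals passed to the reset resolver."""
        if not flag:
            self._flag_seen = False   # assumption: relaying resumes when the flag drops
            return signals
        if not self._flag_seen:
            self._flag_seen = True
            return signals            # the one extra cycle of relaying
        return tuple(0 for _ in signals)  # afterwards, as if there was no error


sampler = SingleSampler()
error_report = (0, 0b0100, 0)
outputs = [sampler.step(error_report, flag) for flag in (False, True, True, True)]
```

With the flag held asserted, the error is visible downstream for exactly one further cycle and is then masked.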
Reset Resolver
Reads the Comparator’s error reports and checks them for errors.
If one of the signals isn’t all ‘0’, a corresponding error signal is asserted, designating which processor must be reset.
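The check can be sketched in a few lines. This is an illustrative model only: each report is assumed to be an integer word per processor, and `resolve_resets` is a hypothetical name.

```python
def resolve_resets(report):
    """Map the per-processor report words to per-processor reset requests.

    A processor is flagged for reset when its report word is not all zeros.
    """
    return tuple(word != 0 for word in report)


resets = resolve_resets((0, 0, 0b0100))  # only processor 3 disagrees with the vote
```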
Watchdog Units
Listens to bus activity and issues a “bark” signal if the bus becomes idle for an extended period of time.
Issues 3 “warnings” before entering a final bark state.
After 3 counting cycles of no bus activity, the watchdog issues a “big bark”, which requires handling before the watchdog can return to normal operation.
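The counting behaviour can be modelled as a small state machine. The sketch below rests on several assumptions that the slides do not settle: the length of a counting cycle (`period`) is a free parameter, one warning is counted at the end of each idle counting cycle, the big bark is latched together with the third warning, and bus activity before that point clears the warnings. All names are hypothetical.

```python
class Watchdog:
    """Counts idle bus cycles, warns per counting cycle, latches a big bark."""

    def __init__(self, period):
        self.period = period    # clock cycles per counting cycle (assumed parameter)
        self.idle = 0           # idle clock cycles in the current counting cycle
        self.warnings = 0       # completed idle counting cycles
        self.big_bark = False   # latched until handle() is called

    def tick(self, bus_active):
        """Advance one clock cycle."""
        if self.big_bark:
            return              # latched: ignores the bus until handled
        if bus_active:
            self.idle = 0
            self.warnings = 0
            return
        self.idle += 1
        if self.idle == self.period:
            self.idle = 0
            self.warnings += 1
            if self.warnings == 3:
                self.big_bark = True

    def handle(self):
        """Acknowledge the big bark and return to normal operation."""
        self.idle = 0
        self.warnings = 0
        self.big_bark = False


wd = Watchdog(period=4)
for _ in range(12):             # 3 counting cycles with no bus activity
    wd.tick(bus_active=False)
```

Once the big bark is latched, later bus activity does not clear it; only the explicit `handle()` call does, mirroring the requirement that the bark be handled first.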
TMR Brain
Receives all the monitoring units’ outputs and is in charge of initiating the appropriate correctional activity.
Capturer
Captures and displays short errors generated by the error generator.
Normally displays the outputs of the “Brain”.
Whenever a short error is detected, holds the corresponding error detection signals asserted until a reset occurs.
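The hold behaviour amounts to a latch per error signal. The following sketch assumes three boolean error signals from the “Brain” and assumes that a reset clears all latches; the class name is hypothetical.

```python
class Capturer:
    """Shows the Brain outputs and keeps any asserted signal held until reset."""

    def __init__(self):
        self.latched = (False, False, False)

    def step(self, brain_out, reset):
        """Advance one clock cycle and return the displayed signals."""
        if reset:
            self.latched = (False, False, False)  # assumption: reset clears the latches
        else:
            # A signal stays asserted once it has been seen asserted.
            self.latched = tuple(l or b for l, b in zip(self.latched, brain_out))
        return self.latched


cap = Capturer()
quiet = (False, False, False)
pulse = (False, True, False)    # one-cycle error on processor 2
shown = [cap.step(quiet, False), cap.step(pulse, False), cap.step(quiet, False)]
after_reset = cap.step(quiet, True)
```

The one-cycle pulse remains visible on the display after it has ended, which is what makes short errors observable in a demonstration.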
Used for demonstration purposes only.
Error Generator Demonstration
[Figure: error generator demonstration. Labels: Comparator, Processor 1, Processor 2, Processor 3.]
Identifying Faulty Processor
[Figure: signal traces ppc_1_rst, ppc_2_rst, rst1, rst2, rst_o1_obuf, rst_o2_obuf and capture. Annotations: Brain output reset signals, Brain input reset signals, Capturer output reset signals, Error start, Capture start, Error end.]
Future Developments
Core loading / dumping: possible with other processors in the PowerPC family.
Processor synchronizing: possible with other processors in the PowerPC family.
Software interrupts: allow better watchdog monitoring.
Project Advantages
The system can detect single-cycle errors, which are common in the space environment.
The system is NOT processor-specific: Changing processors requires only bus-width & signal name adjustments.