© 2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
HP and Carrier Network System Availability
Lee Hines
Hewlett Packard Software Division
6 April 10, 2023
Availability, outages and the impacts of reliable networks
“There was a power outage at a department store yesterday. Twenty people were trapped on the escalators.”
- Steven Wright
Measuring availability
• Based on 24x7 operations
• Planned and unplanned outages
Percent of availability*   99%        99.9%     99.99%    99.999%   99.9999%   99.99999%
Outage minutes/year        ~5,000     ~500      ~50       ~5        ~0.5       ~0.05
Outage to users            3.65 days  8.8 hrs.  ~50 min.  5 min.    30 sec.    3 sec.
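The figures in the table follow from simple arithmetic: yearly outage is the unavailable fraction multiplied by the minutes in a year. The sketch below reproduces that calculation, assuming 24x7 operation and a 365-day year. The function names are illustrative and do not come from the slides.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes (assumes a 365-day year, 24x7 operation)


def outage_minutes_per_year(availability_percent: float) -> float:
    """Return the expected outage, in minutes per year, for an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return (100.0 - availability_percent) / 100.0 * MINUTES_PER_YEAR


def format_outage(minutes: float) -> str:
    """Render an outage duration in the largest convenient unit."""
    if minutes >= 24 * 60:
        return f"{minutes / (24 * 60):.2f} days"
    if minutes >= 60:
        return f"{minutes / 60:.2f} hours"
    if minutes >= 1:
        return f"{minutes:.2f} minutes"
    return f"{minutes * 60:.2f} seconds"


if __name__ == "__main__":
    # Walk through the same availability levels as the table above.
    for pct in (99.0, 99.9, 99.99, 99.999, 99.9999, 99.99999):
        print(f"{pct}% -> {format_outage(outage_minutes_per_year(pct))}")
```

The table rounds these values (for example, roughly 5,256 minutes per year at 99% is shown as ~5,000), so small differences from the computed results are expected.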
Carrier network impacts from availability
HP NonStop server availability
HP NonStop availability and location based services
Increasing the availability – toward Seven, Eight & Nine 9’s
The New NonStop Advanced Architecture
• DMR: Dual Modular Redundancy
• TMR: Triple Modular Redundancy (HW Availability: seven 9’s)
• Loose Synchronization (lock-step)
− Each server runs on its own clock.
− Each can perform soft error corrections without causing a miscompare.
• Self-checked, shared-nothing, transparent take-over
• Fault Masking – HW processor failures are masked and are not visible to any SW except the lowest level of the OS.
− E.g., an uncorrectable memory error doesn’t stop the logical processor; it simply stops one of the processor elements that make up the logical processor.
− Memory has one of the highest rates of failure. NSAA masks all memory failures.
− Repairs don’t result in SW disruption either.
• Fault-tolerant parallel database
• Application server transaction processing monitors
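To illustrate the general idea behind modular redundancy and fault masking, the sketch below compares the outputs of redundant processor elements and takes a majority vote. It is a generic illustration of the technique, not HP's NSAA implementation. The functions `vote` and `faulty_elements` and the integer outputs are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence


def vote(outputs: Sequence[int]) -> Optional[int]:
    """Return the value produced by a strict majority of elements, or None if there is none."""
    if not outputs:
        raise ValueError("at least one processor element output is required")
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def faulty_elements(outputs: Sequence[int], agreed: int) -> List[int]:
    """Return the indexes of elements whose output differs from the agreed value."""
    return [i for i, v in enumerate(outputs) if v != agreed]


if __name__ == "__main__":
    # Three elements (TMR-style): one bad output is outvoted, so the fault is masked.
    tmr_outputs = [42, 7, 42]
    agreed = vote(tmr_outputs)
    print("TMR agreed value:", agreed)
    print("TMR suspect elements:", faulty_elements(tmr_outputs, agreed))

    # Two elements (DMR-style): a miscompare is detected, but voting alone
    # cannot tell which element is wrong.
    print("DMR agreed value:", vote([42, 7]))
```

In this simplified model, a pair of elements can only detect a disagreement, while three elements can both detect it and keep running on the majority result. Real systems add further mechanisms, such as self-checking, that this sketch does not attempt to model.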
Dual to Triple-Mode Redundancy
• Dual-Mode Redundancy = Five 9’s Availability
• Triple-Mode Redundancy = Seven 9’s Availability
Reliability, Availability, Scalability