HEPiX Fall 2014
Tony Wong (BNL)
UPS Monitoring with Sensaphone: A cost-effective solution
Background
Facility built piecemeal over the years
Old data center dates back to the ~1970s
New floor space (~60% of total) built or refurbished since 2008
UPS power provided for most of the RACF
1 MW (battery only, runs for ~30 min) for old data center
1.3 MW (flywheel + generator, runs for ~days) for new floor space
Direct monitoring of battery UPS with proprietary software
No direct monitoring for flywheel + generator system
Operational oversight at BNL
Expensive, proprietary proposed solutions were rejected
Old Data Center
New Data Center
UPS monitoring in new data center
Requirements include:
Must be fully configurable and controlled by RACF
Alarm notification mechanism over multiple channels
Cell phone, SMS/text messaging and email
Direct interface with monitoring computer
Commercially available and supported
Cheap (i.e., no yearly maintenance contracts)
Stand-alone battery back-up (in case of power loss)
Ability to integrate with existing battery UPS monitoring system
Sensaphone
Found with a simple Google search
Purchased model IMS-1000 (single room system)
Cheap (~US$900 per unit)
Requires some electrical work to install
Connect to phone line and internet
Installed one unit in Sigma-7 in 2010 and another in CDCE in 2011
Initially configured to notify over phone and email only (no integration with existing auto-shutdown mechanism)
Call-down list feature with auto-escalation enabled (i.e., if person A doesn’t acknowledge the alarm, the system calls person B, etc.)
Supervisor on call-down list: an effective way to motivate staff
After extensive testing, no further development for several years (other priorities took over)
Anatomy of BNL Configuration
[Diagram of the IMS-1000 unit. Labels: utility power (default power source for the unit); battery back-up for the unit; inputs from UPS 1 and UPS 2; outputs are alarm signals via telephone line and via Internet]
Alarm & Notification Mechanism
UPS Alarm
Inform 1st contact person
If no answer, escalate to 2nd, 3rd and 4th contacts
If no answer from any responder, call the boss
Begin countdown for automatic shutdown
Alert data center supervisor
Shutdown worker nodes and non-critical servers
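The call-down part of this mechanism can be sketched in a few lines of Python. This is only an illustration of the escalation order described above, not the IMS-1000's actual configuration or API: the function name `escalate`, the contact names, and the `acknowledges` callback are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def escalate(
    contacts: List[str],
    supervisor: str,
    acknowledges: Callable[[str], bool],
) -> Tuple[Optional[str], List[str]]:
    """Call each contact in order; if nobody answers, call the supervisor.

    Returns (name of the person who acknowledged, or None if nobody did,
             list of everyone called, in order).
    All names here are hypothetical and only illustrate the call-down order.
    """
    called: List[str] = []
    for person in contacts + [supervisor]:
        called.append(person)
        if acknowledges(person):
            # Stop escalating as soon as someone acknowledges the alarm.
            return person, called
    return None, called


# Hypothetical call-down list: four responders, then the boss.
call_down = ["contact 1", "contact 2", "contact 3", "contact 4"]
who, called = escalate(call_down, "supervisor", lambda p: p == "contact 3")
print(who, called)
```

In this sketch the supervisor is reached only after all four responders fail to acknowledge, matching the "call the boss" step; the shutdown countdown runs separately and is not modeled here.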
Work Timeline
UPS Alarm
Inform 1st contact person
If no answer, escalate to 2nd, 3rd and 4th contacts
If no answer from any responder, call the boss
Begin countdown for automatic shutdown
Alert data center supervisor
Shutdown worker nodes and non-critical servers
Before Summer 2014
After Summer 2014
IMS-1000 Unit
Wall-mounted box
Installed IMS-1000 unit
IMS-1000 Unit (continued)
Close-up view of unit
Input sensor (UPS)
IMS-1000 Web Interface
UPS and Cooling
Most of the IT equipment is connected to UPS-backed power, but the CRAC (Computer Room Air Conditioning) units are not.
Dangerous overheating can occur in a matter of minutes
March 18, 2014
6:20 am 6:40 am
Recent developments
The cooling incident on March 18 gave us tangible evidence that investing a little more time in configuring Sensaphone is a good idea
UPS monitoring via Sensaphone was integrated with existing auto-shutdown of (most) IT equipment due to cooling or utility power interruptions (completed summer 2014)
Beyond email/phone alarm acknowledgement
Brookhaven’s utility division on-call staff notified
Shutdown if temperature passes a threshold or a time limit is exceeded
Selected CRAC units now on UPS back-up power and domestic water back-up (for utility chilled water) – completed September 2014
Plan to add more CRAC units to UPS and domestic water back-up in next 2-3 years
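The shutdown rule mentioned above (temperature passes a threshold, or a time limit expires) amounts to a simple either-or check. The sketch below is a minimal illustration under stated assumptions: the function name and both limit values are hypothetical placeholders, not the thresholds configured at the RACF.

```python
def should_shut_down(
    temperature_c: float,
    minutes_since_alarm: float,
    temp_limit_c: float = 35.0,    # hypothetical threshold, not the RACF value
    time_limit_min: float = 20.0,  # hypothetical time limit, not the RACF value
) -> bool:
    """Return True when either shutdown condition is met.

    Shutdown is triggered if the room temperature passes the threshold
    or if the alarm has been active longer than the time limit.
    """
    too_hot = temperature_c > temp_limit_c
    too_long = minutes_since_alarm > time_limit_min
    return too_hot or too_long


# Cool room, alarm just started: keep running.
print(should_shut_down(temperature_c=25.0, minutes_since_alarm=5.0))
# Room overheating shortly after the alarm: shut down.
print(should_shut_down(temperature_c=40.0, minutes_since_alarm=1.0))
```

Treating the two conditions independently means a fast temperature rise triggers shutdown without waiting for the countdown, which fits the observation that overheating can occur within minutes.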
September 2014
Temperature fluctuations resulting from engineering work to integrate CRAC units to domestic back-up water (in case utility chilled water plant is down)
Conclusions
Sensaphone is a low-cost solution for UPS monitoring of a data center
Easy to configure (ours was done by a technician and a summer student)
Portable and flexible
Can monitor multiple power sources if needed
Can monitor other parameters such as humidity, temperature, etc
Durable – has worked quietly and reliably for ~4 years
Free technical support (via email and phone) available