Hazards include server damage from loud noise during discharge of inert gas fire suppression systems
By Kevin Heslin, with contributions from Scott Good and Pitt Turner
A downtime incident in Europe has rekindled interest in a topic that never seems more than a spark away from becoming a heated discussion. In that incident, the accidental discharge of an inert gas fire suppression system during testing damaged the servers in a mission-critical facility.
The incident, which took place during testing of the fire suppression system on 10 September 2016 at an ING facility in Bucharest, destroyed dozens of hard drives, according to reports published by the BBC and elsewhere. As a result, ING was forced to rely on a nearby backup facility to support its Romanian operations. The event was of great interest to Uptime Institute’s EMEA Network because of the universal requirement for fire protection in data centers. Uptime Institute and Network principals inaugurated a number of information exchanges at ensuing Network meetings.
Fires that originate in data centers are relatively rare and are usually caused by human error during testing and maintenance or by electrical failures, which tend to be self-extinguishing. Other fires spread to the data center from other spaces. At these times, the need for an effective and functioning fire suppression system is obvious, and the system must provide life safety and protect expensive gear and mission-critical data. However, the fire suppression system can pose a risk to operations when inadvertently activated during testing and maintenance. In addition, the fire suppression systems, when deployed, also cause damage in a facility.
These considerations mean that the choice and design of a fire suppression system must meet the business needs and fire threats the facility is likely to face. Water-based systems, for example, will destroy sensitive IT gear when deployed. In general, however, the loss of IT gear in a fire is acceptable to insurance companies and local authorities having jurisdiction (AHJs), who view equipment as replaceable as long as the system saves lives and preserves the building. Data center operators will place a higher value on the data and operations.
Some data centers, however, do deploy inert gas fire suppression systems. In general, these systems are used to protect irreplaceable or extremely costly gear. High-performance computers, for example, tend to be far more expensive to replace than standard x86 servers. In theory, inert gas fire suppression systems keep sprinkler water out of the server room. However, discharge of an inert gas system has been shown to damage data center servers, even when there is no fire, and so many facilities are turning to pre-action systems, which also keep water off the data center floor except when activated. According to one major vendor of both types of fire suppression systems, inert gas systems better protect IT equipment because they do not damage electric and electronic circuits, even under full load operation. In addition, inert gas systems can suppress deep-seated fires, including those inside a cabinet.
Uptime Institute agrees that accidental discharges of inert gas fire suppression systems are rare. But, at the same time, according to the 2017 Uptime Institute Data Center Industry Survey, about one-third of data center operators have experienced an accidental discharge. In fact, in the same survey, respondents were three times more likely to have experienced an accidental discharge than an actual fire.
Beyond that point of agreement, however, consensus throughout the industry is rare, with much uncertainty about exactly how IT gear is damaged by the discharge of inert gas, how to protect against the damage, or even whether inert gas fire suppression system vendors or IT manufacturers are best positioned to eliminate the problem. Still, anecdotes continue to surface, and vendors have documented, under test conditions, the phenomenon in which loud noises from fire suppression systems impaired the performance of data center servers or disabled them, either temporarily or permanently, leading to data loss.
Uptime Institute notes that vendors have tried to address problems tied to the release of inert gasses by redesigning nozzles and improving sensors to reduce false positive signals. Uptime Institute also agrees with vendor recommendations regarding the use of inert gas systems, including:
- Installing racks that have doors to muffle noise
- Installing sound-insulating cabinets
- Using high-quality servers or even solid-state servers and memory
- Slowing the rate of inert gas discharge
- Installing walls and ceilings that incorporate sound-muffling materials
- Aiming gas discharge nozzles away from servers
- Removing power from IT gear before testing inert gas fire suppression systems
- Muffling alarms during testing
- Replicating data to off-site disk storage
A more dramatic step would be to move to pre-action (dry pipe) water sprinkler or chemical suppression systems, but at least one insurance broker recommends the use of inert gas systems in conjunction with a water system as part of a two-phase fire suppression system.
Regardless, pre-action fire suppression systems have become more common. The use of water means that facility owners are protected against the total loss of a data center, and the dry-pipe feature—originally developed to protect against fire in cold environments such as parking garages or refrigerated coolers—protects facilities from the consequences of an accidental discharge in white spaces. In many applications, they are also the more economical choice, especially as local codes and authorities may require the use of a water suppression system, and the inert gas system then becomes not a replacement, but a fairly expensive supplement.
Still, inert gas fire suppression systems continue to have their adherents, and they may make business sense in select applications. Data center operators may consider using inert gas in locations where water is scarce or when an application makes use of very expensive and unique IT gear, such as supercomputers in HPC facilities or old-style tape-drive storage. In addition, inert gas systems may be the best choice when water damage would cause irreplaceable data to be irretrievably lost. In these instances, Uptime Institute believes that organizations would be better served by developing improved backup and business continuity plans.
Those considering inert gas suppression systems may be somewhat relieved to learn that vendors have taken steps to minimize damage from discharges, perhaps the most important of these being improved sensors that register fewer false positives. It is also entirely possible to develop rigorous procedures that reduce the likelihood of an inadvertent discharge due to human error, which is by far the most common cause of accidental discharge.
Some industry sources believe that the problem first began to manifest around 2008 as inert gas systems began to become popular as fire suppression systems in data centers. Others note that server density increased at about the same time.
An examination of Uptime Institute Network’s Abnormal Incident Reports (AIRs) database does not support this belief. It includes reports dating back as far as 1994 and 1995, with no obvious increase in 2008. In total, the AIRs database includes 54 incidents involving inert gas fire suppression systems. Of these reported incidents, 15 involved accidental discharge, with 2 downtime reports. However, many of the incidents took place in support areas, with no possibility of server downtime. Still, the documented possibility of damage to IT gear and facility downtime worries many data center operators.
Uptime Institute Network members commonly use the AIRs database to identify and prevent problems experienced by their colleagues. In this case, Network members report that 27 incidents were caused by technician error, with another 14 resulting from no maintenance or poor procedures, 9 resulting from a manufacturing problem, and 4 from a design omission or installation problem (see the table).
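The cause breakdown above can be checked with a quick tally. The sketch below is illustrative only: the counts are taken from the paragraph above, while the dictionary labels, the function name, and the rounding are assumptions made for this example.

```python
# Cause counts for inert gas incidents as reported in the text (AIRs database).
# The keys are shorthand labels chosen for this sketch, not database field names.
air_causes = {
    "technician error": 27,
    "no maintenance or poor procedures": 14,
    "manufacturing problem": 9,
    "design omission or installation problem": 4,
}


def cause_shares(causes):
    """Return each cause's share of all incidents, in percent, rounded to one decimal."""
    n = sum(causes.values())
    return {name: round(100 * count / n, 1) for name, count in causes.items()}


total = sum(air_causes.values())  # should match the 54 incidents cited in the text
shares = cause_shares(air_causes)

for name, pct in shares.items():
    print(f"{name}: {air_causes[name]} of {total} ({pct}%)")
```

By these figures, technician error accounts for half of the reported incidents, and technician error together with maintenance or procedural lapses accounts for 41 of the 54.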
Although these results are typical for most system failures, they are particularly relevant to discussions of all fire suppression systems, as differences of opinion exist about how exactly the discharge of inert gas systems damages IT gear. Manufacturers believe that the sound of the discharge damages the drives, but others say it is the noise of the fire alarm, either independently or by contributing to the noise level in the facility.
Uptime Institute does not believe this to be a realistic concern, as the AIRs database includes no instances where fire alarms by themselves resulted in data center downtime or server damage.
The fire suppression industry is aware of the problem. During a 2014 meeting with server manufacturers, fire suppression vendors acknowledged the problem, noting concerns about the noise (actually a pressure wave moving through the room) emitted by inert gas suppression systems during discharge. The vendors theorized that the volume (decibels) and frequency (hertz) of sound emitted during system discharge combine to damage servers. In response, many vendors redesigned nozzles to reduce pressure and, therefore, the decibel level of the discharge. This, one vendor said, would not eliminate the problem, as each server type has a different sensitivity and new server types are also susceptible to different volume and frequency combinations. In heterogeneous environments, sensitivity may vary from server to server or rack to rack.
Vendors say that the time required for testing makes it impossible to develop inert gas suppression systems that discharge without affecting the many server types already available in the marketplace. Many environments are heterogeneous, so a discharge system may affect some of the equipment in a rack or room without doing any apparent damage to other equipment. Vendors have also introduced sensors that are more accurate, reducing the likelihood that a false alarm will trigger an unnecessary discharge.
These same vendors note that higher-grade enterprise servers are less susceptible to damage from inert gas discharges. These servers, they note, are more likely to be installed in quality racks that have doors and other features that muffle noise and cushion servers against the shock from the sound waves. In addition, enterprise servers are designed to be operated in large cabinets where many drives spin at the same time, creating both noise and vibration. These drives are tested to withstand harsher environments than consumer drives. They track more precisely and have high sustained data rates, making them resistant to sound and vibration. According to one vendor, even slamming doors can degrade the performance of the consumer drives that can sometimes be found in data centers.
These measures can be effective, according to a Data Center Journal article written in 2012 by IBM’s Brian P. Rawson and Kent C. Green. They explain that noise causes the read-write element to go off the data track: “Current-generation HDDs have up to about 250,000 data tracks per inch on their disks. To read and write, the element must be within ±15% of the data track spacing. This means the HDD can tolerate less than 1/1,000,000 of an inch offset from the center of the data track; any more than that will halt reads and writes.” They theorize that decreased spacing between data tracks made servers more susceptible to damage or degraded performance from noise. In the same article, Rawson and Green cite a YouTube video that shows how even a low-decibel noise such as a human voice can degrade HDD performance.
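The arithmetic behind the quoted tolerance can be worked through in a few lines. This is a minimal sketch that uses only the two figures in the quotation (about 250,000 tracks per inch and a ±15% window); the constant and function names are assumptions for illustration, and the nanometer conversion is added only for a sense of scale.

```python
# Figures taken from the Rawson and Green quotation above.
TRACKS_PER_INCH = 250_000   # "up to about 250,000 data tracks per inch"
OFFSET_FRACTION = 0.15      # element must stay within +/-15% of the track spacing

NM_PER_INCH = 25.4e6        # 1 inch = 25.4 mm = 25.4 million nanometers


def max_offset_inches(tracks_per_inch, fraction):
    """Largest tolerable head offset from track center, in inches."""
    track_spacing = 1.0 / tracks_per_inch  # center-to-center distance between tracks
    return fraction * track_spacing


spacing_in = 1.0 / TRACKS_PER_INCH
offset_in = max_offset_inches(TRACKS_PER_INCH, OFFSET_FRACTION)
offset_nm = offset_in * NM_PER_INCH

print(f"track spacing:   {spacing_in:.1e} in")
print(f"max head offset: {offset_in:.1e} in (about {offset_nm:.1f} nm)")
```

The result, roughly six ten-millionths of an inch, is consistent with the quoted bound of less than 1/1,000,000 of an inch. Because the tolerance scales inversely with track density, doubling the tracks per inch would halve the allowable offset, which fits the authors' theory that tighter track spacing made drives more sensitive to noise.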
Uptime Institute notes that common security practices along with strong operational procedures relating to testing fire suppression systems can mitigate most risk associated with inert gas fire suppression systems.
Uptime Institute recommends that IT management teams work with risk managers to ensure that all stakeholders understand a facility’s fire suppression requirements and options before selecting a fire suppression system. Operational considerations should also be included, so that the system is well suited to an organization’s risk exposure and the business requirements.
Uptime Institute believes that most data centers would be best served by a combination of a pre-action (dry pipe) sprinkler system and high-sensitivity smoke detection. Most AHJs, risk managers, and insurance companies will support this choice as long as other operating requirements are met, such as having educated and trained staff providing building coverage. These authorities are generally quite familiar with water-based fire suppression systems, as these constitute the vast majority of installations in the U.S.; however, they may not always be familiar with pre-action systems.
In instances when risk managers or insurers require an inert gas fire suppression system, operations staff may be able to mitigate the risk of accidental discharge by implementing documented policies, procedures, and practices. These documents should include as many of the vendor recommendations listed above as possible. In this way, the risk manager’s requirement for inert gas fire suppression should not be taken as the end of the discussion but rather the start of a dialog.
Finally, IT should continuously evaluate its fire suppression system and consider removing inert gas systems from spaces when their use changes. Uptime Institute has documented the use of inert gas fire suppression in spaces that were converted from IT to storage. In this instance, the facility increased its risk of accidental discharge but gained no benefits at all.
Kevin Heslin is chief editor and director of Ancillary Projects at Uptime Institute. In these roles, he supports Uptime Institute communications and education efforts. Previously, he served as an editor at BNP Media, where he founded Mission Critical, a commercial publication dedicated to data center and backup power professionals. He also served as editor at New York Construction News and CEE and was the editor of LD+A and JIES at the IESNA. In addition, Heslin served as communications manager at the Lighting Research Center of Rensselaer Polytechnic Institute. He earned a BA in journalism from Fordham University in 1981 and a BS in technical communications from Rensselaer Polytechnic Institute in 2000.
The post Fire Suppression Systems Bring Unexpected Risk appeared first on Uptime Institute Blog.