To address misconceptions about the frequency and cost of data center downtime, we've studied its common causes and potential costs, and we explain them here along with solutions.

After all, the reliance on IT systems to support business-critical applications has increased significantly over the past decade, with data center availability now becoming essential to many companies whose customers pay a premium for access to a variety of IT applications. 

Because data center availability is now tied so closely to total cost of ownership, a single downtime event can significantly affect the profitability (and, in extreme cases, the viability) of an enterprise.

Costs of Data Center Downtime

A study found that the average cost of data center downtime was approximately $5,600 per minute, and the average cost of a single downtime event was approximately $505,500.

Indirect and opportunity costs accounted for more than 62 percent of all costs resulting from data center downtime.

This study, conducted in 2011, involved data center professionals from 41 independent facilities across industry segments such as financial services, telecommunications, retail, healthcare, government, and third-party IT services.

The participating data centers were required to have a minimum of 2,500 ft² to ensure that the costs were representative of an average enterprise data center.

Respondents provided cost estimates for a single recent outage, and follow-up interviews were conducted to obtain additional information. 

Business disruption and lost revenue were the most significant cost consequences, and losses in end-user and IT productivity also had a significant impact. Surprisingly, equipment costs were among the lowest costs reported for a downtime event.
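To put the study's figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that cost scales linearly with outage length, which is a simplification and not a claim made by the study, and the function names are our own.

```python
# Figures quoted from the study discussed above.
COST_PER_MINUTE = 5_600    # average cost of downtime, USD per minute
COST_PER_EVENT = 505_500   # average cost of a single downtime event, USD
INDIRECT_SHARE = 0.62      # "more than 62 percent", used here as a lower bound


def estimate_outage_cost(minutes: float) -> float:
    """Rough cost of an outage, assuming cost grows linearly with its length."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * COST_PER_MINUTE


def implied_event_duration() -> float:
    """Outage length in minutes implied by dividing the two study averages."""
    return COST_PER_EVENT / COST_PER_MINUTE


if __name__ == "__main__":
    print(f"30-minute outage: ${estimate_outage_cost(30):,.0f}")
    print(f"Implied average event length: {implied_event_duration():.0f} minutes")
    indirect = COST_PER_EVENT * INDIRECT_SHARE
    print(f"Indirect and opportunity costs per event: at least ${indirect:,.0f}")
```

Taken together, the two averages imply an outage of roughly an hour and a half, and more than $313,000 of the per-event average would come from indirect and opportunity costs.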

Common Causes of Data Center Downtime

The common causes of downtime are UPS system failure, human error, and cyber attacks.

But let’s take a look at two categories that cause more damage and therefore result in more expensive outages.

a) Power-Related Outages – Among the root causes of power-related outages, UPS and generator failures are the most costly. Tier I and II data centers are particularly vulnerable to power failures due to a lack of redundancy and other preventative measures.

Redundancy in power systems is recommended to minimize the impact of equipment failure. Additionally, regular maintenance and monitoring of critical power systems can help to minimize the risk of power equipment failure.

Comprehensive monitoring solutions can aid in quickly identifying and addressing power equipment issues.



b) Environmental-Related Outages – Environmental vulnerabilities, such as thermal issues and water incursion, are cited in this study as root causes of data center failures, accounting for 15% of all root causes.

IT equipment failures caused by environmental issues are the most expensive, with a cost of more than $750,000 per incident. The study also emphasizes that an optimized cooling infrastructure is critical to preventing catastrophic equipment failures and minimizing downtime.

Best practices for cooling infrastructure include using refrigerant-based cooling instead of water-based solutions, eliminating hot spots and high heat densities, installing robust monitoring and management solutions, and implementing regular preventive maintenance and service visits.
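As a simple illustration of what monitoring for hot spots can look like, the Python sketch below flags racks whose inlet temperature exceeds a limit. The 27 °C limit, the rack names, and the readings are assumptions made up for this example; use the limits from your own equipment specifications and the data from your own sensors.

```python
# Assumed alert threshold in degrees Celsius; adjust to your equipment specifications.
MAX_INLET_TEMP_C = 27.0


def find_hot_spots(inlet_temps_c: dict[str, float],
                   limit_c: float = MAX_INLET_TEMP_C) -> list[str]:
    """Return the IDs of racks whose inlet temperature exceeds the limit, hottest first."""
    hot = [rack for rack, temp in inlet_temps_c.items() if temp > limit_c]
    return sorted(hot, key=lambda rack: inlet_temps_c[rack], reverse=True)


if __name__ == "__main__":
    # Made-up sample readings, keyed by a hypothetical rack ID.
    readings = {"rack-A1": 22.5, "rack-A2": 29.1, "rack-B1": 31.4, "rack-B2": 24.0}
    for rack in find_hot_spots(readings):
        print(f"{rack}: {readings[rack]:.1f} C exceeds {MAX_INLET_TEMP_C:.1f} C")
```

A real monitoring solution would also track trends and raise alerts automatically, but even a check this small shows which racks need attention first.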

However, you can mitigate these risks and improve availability proactively by considering the six key strategies below.

Solutions for Data Center Downtime

Regular assessments and performance optimization services can help identify vulnerabilities and create a plan tailored to your infrastructure and budget. By implementing these strategies, you can improve availability, reduce downtime risks, and gain a competitive edge.

Firstly, monitor batteries and implement a battery maintenance program that identifies system anomalies and tracks trends toward end-of-life.

Secondly, consider monitoring software like Vertiv’s Data Center Planner to help identify battery problems before they impact operations. 

Thirdly, consider lithium-ion batteries as they are smaller, lighter, and last longer while providing the power needed for critical loads. 

Fourthly, use an integrated approach, such as Vertiv’s Liebert iCOM-S Thermal System Supervisory Control, to optimize your infrastructure so that it matches load demand.

Fifthly, keep the data center clean, perform preventative maintenance, and assess environmental threats to protect your infrastructure. 

And lastly, implement and update policies and procedures regularly to ensure everyone is aware of common threats and how to respond to system failures.
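The battery monitoring strategy above, tracking trends toward end-of-life, can be sketched in a few lines of Python. This is a minimal illustration and not how any particular monitoring product works: it assumes capacity declines roughly linearly, it treats 80 percent of rated capacity as the replacement point (an assumption you should adjust to your battery vendor's guidance), and the function names and sample readings are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired readings")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("readings must span more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def months_until_end_of_life(months, capacity_pct, eol_pct=80.0):
    """Project the months left until capacity reaches eol_pct.

    Returns None when the readings show no downward trend.
    """
    slope, intercept = fit_line(months, capacity_pct)
    if slope >= 0:
        return None
    crossing_month = (eol_pct - intercept) / slope
    return max(0.0, crossing_month - months[-1])


if __name__ == "__main__":
    # Made-up capacity test results: months in service and percent of rated capacity.
    months = [0, 6, 12, 18, 24]
    capacity = [100.0, 98.0, 96.0, 94.0, 92.0]
    remaining = months_until_end_of_life(months, capacity)
    print(f"Projected months until replacement: {remaining:.0f}")
```

A projection like this is only a planning aid. Sudden changes between readings are the kind of anomaly that a maintenance program should investigate rather than average away.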