George Clement, Senior Application Engineer from Intel, highlights a case study that shows the importance of data center management to gain greater insight into power demand, thermal efficiency, server utilization, and capacity planning in their HPC environment.
George Clement, Senior Application Engineer, Intel
Recently the Institute for Health Metrics and Evaluation (IHME) at the University of Washington, faced a challenge to instrument there HPC environment. They needed to ensure the Data Center was cooling and providing enough power for the COVID 19 work they were supporting, and ensure the racks are provisioned and maintained for optimal physical configuration.
Finally, they needed it to be simple and focused on providing their operational team actionable data at the right time.
“We learned a lot from other products about realistic maximum power consumption, but it was only relevant for historical data, it couldn’t provide us the real-time alerting. If something goes wrong in the data center, right now, other products couldn’t tell us that. [This] was easy to plug in, and easy to get the data and analysis from our machines immediately. The alerts and power limitations were set up within a day.”
— Vern Harbers, Technical Project Manager, Infrastructure, IHME, University of Washington
The Institute for Health Metrics and Evaluation (IHME), an independent global health research center at the University of Washington worked with the Intel Data Center Solutions team to help improve their COVID 19 HPC cluster’s availability and performance by targeting several key elements as they implemented there solution:
(Graph: Courtesy of Intel)
Using these techniques the IHME staff reported they gained greater insight into power demand, thermal efficiency, server utilization, and capacity planning in their HPC environment. They had the solution up and working within hours of roll out. Also, they were able to compile and aggregate actionable, real-time data from its collection of servers, quickly and consistently (using OOB interfaces).
George Clement, Senior Application Engineer, from Intel, highlights a case study that shows the importance of data center management to gain greater insight into power demand, thermal efficiency, server utilization, and capacity planning in their HPC environment.
The rapid growth of data centers is resulting in one of the most energy intensive…
The training of large generative AI models is a special case of high-performance computing (HPC)…
It takes the Earth hundreds of millions of years to create usable energy. It takes…
What keeps data center operators up at night? Among other things, worries about the safety…
The global data center outsourcing market was valued at USD 132.3 billion in 2024 and…
The rapid adoption of artificial intelligence (AI) and the computing power required to train and…