Balancing AI’s potential and pitfalls in data center operations

December 17, 2024

The number of greenfield hyperscale data centers is surging — and is expected to continue growing. Grand View Research anticipates growth of 13% each year through 2030. 

This new infrastructure buildout is fueling the most challenging workloads that the computing industry has ever undertaken, with compute requirements of large language model training growing at a clip of at least 1.5 times Moore’s Law (which observes that the number of transistors on an integrated circuit doubles every two years with minimal rise in cost).

But when we peer into underlying GPU power requirements to fuel this training, we discover platforms that gobble energy at startling rates. NVIDIA’s new Blackwell GPUs, the solution of choice for hyperscalers’ largest training deployments in 2024, consume a whopping 1200 watts per GPU, more than 70% higher than the previous generation. (That said, the company says they contribute to more efficient processes because their increased power means that fewer are required per workload.) When you look at the Grace Blackwell platform, the data is even more eye-opening, with an energy draw of 2700 W per system. 

Given that traditional data centers deliver 5 to 10 kW per server rack, it is clear that these new powerhouses require fundamental power delivery changes to maintain rack density.

What are hyperscalers doing to address this? Well, in greenfield environments, the solution starts with delivery of more power per rack with new configurations: 30 kW per rack and beyond, with some reports forecasting rack power scaling up to 200 kW. This enables providers to deploy increased compute density per rack to scale compute capability in training clusters further.
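To make the rack-density problem concrete, here is a rough back-of-the-envelope sketch using only the figures quoted above: 1200 W per Blackwell GPU, 2700 W per Grace Blackwell system, and rack budgets of 5, 10, 30, and 200 kW. The function name and the simplification that the entire rack budget goes to compute (no allowance for CPUs, networking, cooling, or power conversion losses) are assumptions made for illustration, so the results are upper bounds rather than deployable configurations.

```python
# Per-unit power draw quoted in the article, in watts.
BLACKWELL_GPU_W = 1200
GRACE_BLACKWELL_SYSTEM_W = 2700

# Rack power budgets mentioned in the article, in watts.
RACK_BUDGETS_W = [5_000, 10_000, 30_000, 200_000]


def units_per_rack(rack_budget_w: int, unit_draw_w: int) -> int:
    """Upper bound on how many units a rack budget can power.

    Assumption: the whole budget is available to the units themselves;
    real racks lose part of it to networking, cooling, and conversion.
    """
    if rack_budget_w < 0 or unit_draw_w <= 0:
        raise ValueError("rack budget must be >= 0 and unit draw must be > 0")
    return rack_budget_w // unit_draw_w


for budget_w in RACK_BUDGETS_W:
    gpus = units_per_rack(budget_w, BLACKWELL_GPU_W)
    systems = units_per_rack(budget_w, GRACE_BLACKWELL_SYSTEM_W)
    print(f"{budget_w // 1000:>3} kW rack: up to {gpus} GPUs "
          f"or {systems} Grace Blackwell systems")
```

Under these assumptions, a traditional 10 kW rack tops out at 8 GPUs or 3 Grace Blackwell systems, while a 30 kW rack supports at most 25 GPUs and a 200 kW rack at most 166. That gap is why added power per rack translates so directly into added compute density.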


This also, of course, delivers a sharp increase in heat, since nearly all of the power a rack draws is ultimately dissipated as heat. Operators are turning to direct liquid and immersion cooling solutions as the only viable alternatives to dissipate heat generated by these powerful configurations. The debate over when liquid cooling should replace air cooling technologies, at least in these hyperscale environments, is all but over. Air cooling simply cannot address the heat generation of these high-powered GPUs.

Operators are also paying special attention to the cradle-to-grave sustainability of data center environments, with more focus on embedded carbon, power consumption at use, and infrastructure circularity in alignment with corporate carbon commitments. In pursuing these efforts, 63% of respondents to ZincFive’s 2024 Data Center Energy Storage Industry Insights Report survey reported that their organizations’ sustainability programs resulted in reduced costs. The same survey found that sustainability was the second highest consideration when selecting energy storage solutions.

As operators build out high-density, high-power racks, power backup must also be rethought, given the sheer scale of power draw within the cluster and the mission criticality of training runs to the underlying business opportunity. Here, both immediate and long-term battery backup are being reconsidered.

In particular, new battery chemistries have the potential to offer something that lithium-ion or lead-acid cannot. A nickel-zinc chemistry, for instance, delivers immediate power backup that is tailored for unexpected AI training cluster outages, delivering power failover prior to server reboot or generator ignition. It also has improved power density (taking up less valuable floor space in the data center) and has no risk of thermal runaway.


As we approach the second half of the decade, the growth of hyperscalers is set to continue, driven by the expanding potential of AI. With use cases limited only by human creativity, AI adoption will inevitably grow — but faces constraints due to the capacity of data centers to scale and meet evolving power demands.

Finding safe, reliable, and sustainable infrastructure and energy solutions will fuel the next wave of innovation, whether through new technologies, strategies, or partnerships. These solutions will have to be identified and implemented across various fronts, including the sourcing, use, and storage of power. Fortunately, many of these solutions are already emerging.

Previously published by Latitude Media.

Author
Tim Hysell
Co-Founder & CEO, ZincFive
Tim has over three decades of entrepreneurial success in founding, owning, and directing profitable business operations in renewable energy, banking, manufacturing, and medical devices. His companies partnered with global giants such as Siemens, Philips, and Hewlett-Packard. Prior to owning his own businesses, Tim worked for General Electric, Hewlett-Packard, and Providence Health Systems. Tim is also a co-founder and board member of Pacific West Bank in Oregon.