Nvidia has partnered with EPRI, Prologis, and InfraPartners to study smaller-scale data centers built for distributed AI inference. The facilities, sized between 5 and 20 MW, would be sited at or near utility substations to speed deployment. The goal, the companies say, is to deliver inference processing close to where data is generated while making use of existing infrastructure and easing the load on congested transmission systems.
The partners aim to stand up five pilot sites across the United States by the end of 2026, establishing a replicable model for rapid, scalable deployment of micro data centers.
EPRI President and CEO Arshad Mansoor said AI is transforming industries and that the energy system must evolve to keep pace with rising demand. He stressed the value of using existing grid capacity to bring inference compute closer to where it is needed, arguing that fast, reliable processing benefits all stakeholders.
Demand for AI inference, the process of using trained AI models to generate outputs, is growing separately from demand for AI training. A MarketsandMarkets report projects the inference market could reach $254 billion by 2030.
That growth strains the power grid: Omdia’s Data Center Building Report forecasts a 45% compound annual growth rate (CAGR) in power capacity demand for AI inferencing from 2025 to 2030, outpacing the 30% CAGR projected for AI training over the same period.
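For scale, a 45% annual growth rate sustained over that five-year span would imply roughly a 6.4-fold increase in inferencing power capacity (1.45^5 ≈ 6.4), compared with about a 3.7-fold increase at the 30% rate projected for training.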
Alan Howard, principal analyst at Omdia, said the initiative stands out for targeting power availability rather than the latency concerns that typically drive edge deployments. He credited Nvidia with pushing AI innovation that helps the industry work around obstacles such as power constraints.
A KBV Research report notes that latency and real-time performance remain significant challenges in the inference market, since many AI applications require rapid decision-making and draw considerable power. Placing data centers closer to users can ease both constraints.
Steven Dickens, CEO and analyst at HyperFrame Research, expects inference deployments to become more geographically distributed as organizations look to run AI workloads more efficiently. He sees data center deployment models changing quickly as a result, with growth in both large-scale facilities and edge inference sites.
Inference underpins applications such as autonomous logistics, fraud detection, and digital diagnostics, workloads the study partners say demand reliable, high-speed compute. Moving inference closer to the grid edge, they argue, makes providers more responsive and operationally effective.
Marc Spieler, senior managing director for the global energy industry at Nvidia, said AI is changing how industry operates and that data center infrastructure must be rethought accordingly. Deploying accelerated computing next to available grid capacity, he said, allows AI inference to scale where power already exists.