Qualcomm Technologies, Inc. has unveiled its latest AI-optimized data center solutions with the launch of the AI200 and AI250 accelerator cards and racks. Built on the company’s NPU technology, these offerings are designed to deliver rack-scale performance and high memory capacity for generative AI inference.
The Qualcomm AI200 is specifically crafted to provide cost-effective and high-performance solutions for large language models and various AI workloads. With support for up to 768 GB of LPDDR per card, it caters to the demanding memory requirements of AI tasks while emphasizing scalability and flexibility.
The Qualcomm AI250, meanwhile, introduces a new memory architecture based on near-memory computing. This approach delivers more than a tenfold improvement in effective memory bandwidth along with significantly lower power consumption, enabling efficient AI deployment across diverse hardware configurations.
Both the AI200 and AI250 rack solutions feature direct liquid cooling, PCIe for scale-up, and Ethernet for scale-out expansion. They also integrate confidential computing capabilities to secure AI workloads, within a rack-level power envelope of 160 kW designed for modern data center environments.
Durga Malladi, Senior Vice President and General Manager at Qualcomm, emphasized the ease of integration, management, and scaling of AI models on their optimized AI inference solutions. With robust support for leading AI frameworks and simplified model deployment, the AI200 and AI250 are tailored for frictionless adoption and rapid innovation.
The comprehensive software stack accompanying these solutions supports a wide range of machine learning frameworks and inference engines, prioritizing generative AI workloads through techniques such as disaggregated serving. This approach simplifies model integration and gives developers access to a variety of AI applications, libraries, and tools for improved operational efficiency.
The AI200 is expected to be commercially available in 2026, with the AI250 following in 2027, reflecting Qualcomm’s commitment to an annual data center roadmap focused on improving performance, energy efficiency, and cost-effectiveness.