Summary:
1. NTT has launched tsuzumi 2, a lightweight large language model that runs on a single GPU, addressing the high costs and energy consumption associated with traditional large language models.
2. The deployment of tsuzumi 2 at Tokyo Online University showcases how businesses can achieve performance matching larger models at a fraction of the cost, while also addressing data privacy concerns.
3. The article covers the technical economics, data sovereignty and security considerations, multimodal capabilities, market context, and implementation factors relevant to evaluating lightweight LLM deployment.
Article:
Enterprise AI deployment often founders on the high infrastructure costs and energy consumption of sophisticated language models. In response to this tension, NTT recently introduced tsuzumi 2, a lightweight large language model that runs on a single GPU. The model is designed to help organizations sidestep a key barrier of traditional large language models, which typically require dozens or hundreds of GPUs, making AI deployment impractical for many businesses.
The launch of tsuzumi 2 has already shown promising results, with early deployments demonstrating performance that matches larger models at a fraction of the operational cost. Tokyo Online University, for example, has deployed tsuzumi 2 for course Q&A enhancement, teaching-material creation support, and personalized student guidance. Because the model runs on a single GPU, the university avoids the capital expenditure of a GPU cluster and the ongoing electricity costs of running one, while also addressing the data privacy concerns common in educational institutions.
NTT’s internal evaluation has shown that tsuzumi 2 can match or exceed leading external models despite its significantly smaller infrastructure requirements. This performance-to-resource ratio is crucial for enterprises where total cost of ownership drives decisions. The model is particularly strong in Japanese, especially in business domains that prioritize knowledge, analysis, instruction-following, and safety.
In addition to cost considerations, data sovereignty plays a significant role in driving the adoption of lightweight LLMs in regulated industries. Organizations handling confidential information are at risk when processing data through external AI services subject to foreign jurisdiction. NTT positions tsuzumi 2 as a “purely domestic model,” developed in Japan and operating on-premises or in private clouds, addressing concerns about data residency, regulatory compliance, and information security.
tsuzumi 2 also offers multimodal capabilities, supporting text, images, and voice in enterprise applications. This matters for business workflows that need AI to process multiple data types without deploying separate specialized models: manufacturing quality control, customer service operations, and document processing can all benefit from a single model that handles all three, reducing integration complexity.
NTT’s lightweight approach contrasts with hyperscaler strategies built around massive, broadly capable models, but it offers practical benefits for organizations whose requirements fit what a lightweight model can do. As enterprises navigate AI adoption, the focus is shifting toward efficient, specialized solutions that address specific business needs within cost, security, and operational constraints. The deployments at Tokyo Online University and FUJIFILM Business Innovation illustrate the practical business value that lightweight LLMs can offer in the enterprise AI landscape.