At the Apsara Conference in Hangzhou, organized by Alibaba Cloud, Chinese AI startups showcased their dedication to developing large language models.
The focus of these companies comes in the wake of OpenAI’s recent unveiling of o1, the latest LLM from the Microsoft-backed company. The model is designed to tackle complex tasks, opening the door to advances in fields such as science, coding, and mathematics.
Moonshot AI founder Yang Zhilin highlighted the significance of the o1 model during the conference, noting its potential to reshape industries and create new prospects for AI startups.
The Moonshot AI founder emphasized the importance of reinforcement learning and scalability in AI progress, citing the scaling law: the principle that larger models trained on more data deliver better performance.
He stated, “This strategy expands the boundaries of AI capabilities,” pointing out that OpenAI o1 could disrupt industries and unlock fresh opportunities for startups.
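As a rough illustration (this formulation is not from the conference), the scaling law mentioned above is often written as a pair of empirical power laws, where loss falls predictably as parameter count and training data grow; the symbols and constants below are the conventional ones from the research literature, not figures cited by the speakers:

```latex
% Illustrative power-law scaling of LLM loss (Kaplan et al.-style form).
% L = cross-entropy loss, N = model parameters, D = training tokens;
% N_c, D_c, \alpha_N, \alpha_D are empirically fitted constants.
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},
\qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}
```

In this framing, "larger models with more training data" improve performance because both curves decrease monotonically in N and D.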
OpenAI has emphasized the model’s capacity to tackle intricate problems, working through them in a manner akin to human reasoning. By refining its strategies and learning from its errors, the model improves its problem-solving abilities.
He further suggested that companies with ample computing resources can lead not only in algorithms but also in foundational AI models. This is crucial as AI engineers increasingly rely on reinforcement learning to generate new training data once existing organic sources are exhausted.
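One common pattern behind this idea of generating new data once organic sources run dry is best-of-n sampling against a reward model: sample several candidate outputs, score them, and keep only the highest-reward ones as synthetic training examples. The sketch below is a minimal, hypothetical illustration of that loop; the model and reward function are stand-ins, not any company’s actual system.

```python
# Minimal sketch of reward-filtered synthetic data generation (best-of-n).
# `generate_candidates` and `reward` are hypothetical stand-ins: real
# pipelines would sample from an LLM and score with a trained reward model.
import random

def generate_candidates(prompt, n=8):
    # Stand-in for sampling n responses from a language model.
    return [f"{prompt} -> answer #{random.randint(0, 100)}" for _ in range(n)]

def reward(candidate):
    # Stand-in reward model; real systems score correctness/helpfulness.
    return random.random()

def synthesize_dataset(prompts, keep_top=2):
    # Keep only the top-scoring candidates per prompt as new training pairs.
    dataset = []
    for prompt in prompts:
        ranked = sorted(generate_candidates(prompt), key=reward, reverse=True)
        dataset.extend((prompt, c) for c in ranked[:keep_top])
    return dataset

data = synthesize_dataset(["2+2?", "capital of France?"])
print(len(data))  # → 4 (2 prompts × keep_top=2)
```

The filtered pairs can then feed further fine-tuning, which is why compute-rich firms are well placed: each round of generation and scoring is itself expensive.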
StepFun CEO Jiang Daxin echoed these sentiments, highlighting that computational power remains a major hurdle for many startups, especially as US trade restrictions limit Chinese firms’ access to advanced semiconductors.
Jiang remarked, “The computational demands are still significant.”
An insider at Baichuan AI revealed that only a select group of Chinese AI startups, including Moonshot AI, Baichuan AI, Zhipu AI, and MiniMax, have the capacity to make substantial investments in reinforcement learning. Referred to collectively as the “AI tigers,” these companies are deeply involved in LLM development, propelling the next wave of AI innovation.
More from the Apsara Conference
At the same event, Alibaba Cloud made several announcements, including the launch of its Qwen2.5 model family, which brings advances in coding and mathematics. The models range from 0.5 billion to 72 billion parameters and support more than 29 languages, including Chinese, English, French, and Spanish.
Specialized models like Qwen2.5-Coder and Qwen2.5-Math have gained traction, with over 40 million downloads on platforms like Hugging Face and ModelScope.
Alibaba Cloud also expanded its product range by unveiling a text-to-video model within its Tongyi Wanxiang image-generation family. The model can produce videos in both realistic and animated styles, with potential applications in advertising and filmmaking.
Additionally, Alibaba Cloud introduced Qwen2-VL, the latest iteration of its vision-language model. The model can process videos longer than 20 minutes, supports video-based question answering, and is optimized for mobile devices and robotics.
For more details on the conference, visit the official website.
(Photo by: @Guy_AI_Wise via X)
Want to delve deeper into AI and big data insights from industry experts? Explore the AI & Big Data Expo happening in Amsterdam, California, and London. This comprehensive event is co-located with other leading conferences like Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.
Check out upcoming enterprise technology events and webinars hosted by TechForge here.