Alibaba leads $290 million investment to build a new kind of AI model as LLM limits emerge

Alibaba Cloud has led a 2 billion yuan investment into ShengShu, the startup behind the AI video generation tool Vidu.

The funding round aims to move beyond text-based AI toward real-world simulation tech.

That’s a different kind of AI than what underpins chatbots, and one required to advance robotics.

BEIJING — Alibaba Cloud is investing in a new type of artificial intelligence designed to better replicate the real world using a different approach from chatbots such as OpenAI’s ChatGPT.

The shift recognizes the limits of “large language models” trained primarily on text. Instead, developers are starting to focus more on “world models” built on videos and real-life physical scenarios.

To jump on the trend, Alibaba led a 2 billion yuan ($290 million) investment in ShengShu, the startup behind the AI video generation tool Vidu, the company said Friday. TAL Education and Baidu Ventures also participated in the series B funding round.

The investment comes about two months after ShengShu raised 600 million yuan from Qiming Venture Partners and other backers. The startup declined to disclose its valuation.

ShengShu said the latest funding will support the development of a “general world model” that uses AI to bridge two currently separate domains: the digital world of games and AI-generated video, and the physical world of autonomous driving and robots.

“ShengShu believes that a general world model, built on multimodal data such as vision, audio, and touch, more naturally captures how the physical world works than large language models,” the three-year-old startup said in a statement.

“We aim to connect perception and action,” Zhu Jun, founder of ShengShu, added in a statement, allowing AI systems to better model and predict real-world behavior consistently.

ShengShu’s latest Vidu Q3 Pro model, released in January, ranks among the top 10 AI models for generating videos from text and images, according to Artificial Analysis.

The company launched Vidu globally months before OpenAI made its now-shuttered Sora tool for AI video generation widely available. Chinese short-video companies Kuaishou and ByteDance have also released competing AI tools for generating videos.

World model competition

Alibaba has expanded its investments in related startups.

The Chinese tech giant and Baidu Ventures last month led a $50 million investment in Tripo AI, a platform that uses AI to quickly generate digital 3D models from photographs. Tripo said it is also moving away from techniques used by language models toward AI tools grounded in physical space and is developing its own world model.

In September, Alibaba also led a $60 million investment in PixVerse, which released an AI world model earlier this year that allows users to direct how a video unfolds while it is being generated.

Alibaba, which got its start in e-commerce, has also released free, open-source AI models for video generation and, in February, launched one for powering robots.

ShengShu said Friday it has strategic partnerships with companies developing embodied AI — systems such as humanoid robots that interact with the physical world — for use across industrial, commercial and home settings.

World models are critical for robotics because the technology needs more than LLMs to work, Kevin Kelly, co-founder of the U.S. tech magazine Wired, wrote last month on his Substack.

Ultimately, to replicate human intelligence, AI will need three things: reasoning, an understanding of the physical world and continuous learning, Kelly said. While AI for the learning category hasn’t been developed yet, LLM-powered chatbots have created the knowledge element, he said, making world models a key area requiring a breakthrough.
