Alibaba has reportedly taken a significant step into robotics. Lin Junyang, who heads the Tongyi Qianwen large language model effort, recently announced the formation of a small team focused on robotics and embodied intelligence. The move signals that Alibaba is accelerating its effort to turn its multimodal foundation models into intelligent agents capable of operating in the physical world. Lin Junyang emphasized that these agents, equipped with tool-use and memory capabilities, will achieve long-horizon reasoning through reinforcement learning, "and they absolutely must move from the virtual world to the physical world."
As global tech giants race to deploy robotics, Alibaba Cloud last month led a $140 million investment in the Chinese robotics startup X Square Robot, its first foray into embodied intelligence. Two weeks ago, at the 2025 Yunqi Conference, Alibaba CEO Wu Yongming said that global AI investment will reach $4 trillion over the next five years and that Alibaba must keep pace. Beyond the previously announced 380 billion yuan investment in cloud and AI infrastructure over three years, Alibaba plans to invest further.
As technical director of Tongyi Qianwen, Lin Junyang previously led the development of multimodal models that can process audio, image, and text inputs. The creation of the embodied intelligence team means that Alibaba is expanding from purely virtual AI into physical intelligence.