
AI Singapore (AISG) today officially released its next-generation large language model, Qwen-Sea-Lion-v4, whose base model has been switched from Meta's Llama to Alibaba's Tongyi Qianwen Qwen3-32B. The model ranked first among open-source models with fewer than 200 billion parameters on SEA-HELM, the comprehensive evaluation benchmark for Southeast Asian languages, marking a significant step forward for AI technology in the region.
AISG gives three main reasons for the switch. First, Qwen3 natively supports 119 languages and dialects and was pre-trained on 36 trillion tokens, which significantly improves performance on low-resource languages such as Indonesian and Thai. Second, the new model uses byte-pair encoding (BPE) instead of the traditional sentence segmenter, so it can process scripts written without spaces between words, such as Thai and Burmese, improving both translation accuracy and inference speed. Third, the quantized model needs only 32GB of memory to run, which suits the limited computing resources of Southeast Asian SMEs.
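The 32GB figure can be sanity-checked with simple arithmetic on weight storage. The sketch below is an illustration only, not AISG's published sizing: it assumes roughly 32 billion parameters (read from the "32B" in the model name) and counts weights alone, ignoring activations, the KV cache, and runtime overhead. The function name `weight_memory_gb` is made up for this example.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: about 32 billion parameters, taken from the "32B" model name.
PARAMS = 32e9

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(PARAMS, bits):.0f} GB")
```

Under these assumptions, 8-bit weights alone come to about 32 GB and 4-bit weights to about 16 GB, against about 64 GB at 16-bit precision. The article does not say which quantization level the 32GB figure refers to, so it may describe 8-bit weights or a 4-bit model with headroom for inference.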
On the training data side, AISG contributed 100 billion tokens of Southeast Asian language text. Southeast Asian content makes up 13% of the data, 26 times the share in Llama 2, which implies a share of roughly 0.5% there. Alibaba injected regional knowledge into the model through "advanced post-training" techniques, enabling it to accurately understand code-mixed varieties such as Singaporean English (Singlish) and Malaysian English (Manglish). Performance tests show that Qwen-Sea-Lion-v4 outperforms the original Llama baseline by an average of 8.4% on Indonesian and Vietnamese tasks and ranks first on both the document-level reasoning and cross-lingual summarization metrics.
The model is now available as a free download on Hugging Face and the AISG website, in 4-bit and 8-bit quantized versions. The Singapore government has included it in the S$70 million national multimodal development plan launched in 2023, with large-scale deployment in education, healthcare, and finance expected by 2026. The release not only advances the AI ecosystem in Southeast Asia but also offers a new technical approach for large multilingual models.