LLMs by Vietnamese Trainers Make Breakthroughs on VMLU Rankings

February 23, 2025

A report by Zalo AI on the development of Vietnamese Large Language Models (LLMs), based on the Vietnamese Multitask Language Understanding (VMLU) benchmark platform for large language models, has recorded significant progress for LLMs trained by Vietnamese teams.

On January 10, Zalo AI announced the report on the development of Vietnamese LLMs for 2024, based on the evaluation and ranking platform for Vietnamese language capabilities, VMLU.

The VMLU leaderboard ranks models by their scores across several fields: general knowledge, STEM, social sciences, and humanities.

Development of AI Community in Vietnamese LLMs

With the participation of an increasing number of organizations and individuals, Zalo AI’s report highlights the robust growth of large language models tailored for Vietnamese users. Specifically, in 2024 the VMLU leaderboard listed 45 LLMs, with submissions from over 155 organizations and individuals, 691 downloads of the evaluation dataset, and 3,729 model evaluations on the platform.

Although the results are modest compared to those of developed countries, this progress reflects ongoing efforts to keep pace with cutting-edge global technology, especially in a context where generative AI is still relatively new in Vietnam. Limitations in data, infrastructure, and resources have posed challenges, but the advancements indicate promising prospects for the field in Vietnam.

In addition to domestic research teams, many foreign entities are also optimizing LLMs for Vietnamese. International developers submitting evaluations to VMLU include the UONLP x Ontocord team – University of Oregon (USA), the DAMO Academy team – Alibaba Group (China), and the SDSRV team – Samsung.

Notably, not only enterprises but also research teams from Vietnamese universities have made their presence felt in the rankings, such as the ML4U team – HCM Bach Khoa University and the FPTU HCMC team – FPT University, signaling a positive trend for AI development in the educational sector and fostering sustainable growth.

As a leading entity in promoting the AI community’s development, the Japan Advanced Institute of Science and Technology (JAIST) has partnered with Zalo AI in building and operating VMLU. Assessing the development status of LLMs in Vietnam, Prof. Nguyễn Lê Minh, Director of the Interpretable AI Research Center at JAIST, stated: “The increase in the number of LLMs in Vietnam indicates significant interest from numerous companies and individuals in enhancing the applications of GenAI. In the future, I believe the trend will lean towards utilizing open LLMs like Llama, adapting them for specific problems and domain data.” However, according to Prof. Nguyễn Lê Minh, there will still be research groups that continue to train their own large language models.

VMLU – The Measure of Vietnamese Capability for the LLM Development Community

Launched in November 2023, the VMLU platform for evaluating the Vietnamese-language capability of LLMs provides a comprehensive dataset and assessment standards: 10,880 multiple-choice questions across 58 topics covering four fields: STEM (science, technology, engineering, and mathematics), social sciences, humanities, and general knowledge. Question difficulty increases across four levels: primary school, lower secondary, upper secondary, and professional (bachelor’s and postgraduate).

After using VMLU to measure and evaluate their models, LLM developers can submit scores and request publication on the leaderboard. This lets them compare their model’s capabilities with existing LLMs on the market and continue refining their training.
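The flow described above amounts to grading a model's multiple-choice answers per field and averaging the results into a summary score. The sketch below illustrates that aggregation in Python; the item format and function name are hypothetical for illustration, not the actual VMLU dataset schema or submission API.

```python
# Illustrative per-field scoring sketch (hypothetical data format,
# not the actual VMLU schema): grade multiple-choice predictions,
# compute accuracy per field, then average across fields.
from collections import defaultdict

def score_by_field(items):
    """items: dicts with 'field', gold 'answer', and model 'prediction'."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        total[item["field"]] += 1
        if item["prediction"] == item["answer"]:
            correct[item["field"]] += 1
    # Accuracy per field, expressed as a percentage
    per_field = {f: 100.0 * correct[f] / total[f] for f in total}
    # Simple unweighted mean across fields as a summary score
    average = sum(per_field.values()) / len(per_field)
    return per_field, average

# Toy example with two fields
items = [
    {"field": "STEM", "answer": "A", "prediction": "A"},
    {"field": "STEM", "answer": "C", "prediction": "B"},
    {"field": "humanities", "answer": "D", "prediction": "D"},
]
per_field, avg = score_by_field(items)
print(per_field, avg)  # {'STEM': 50.0, 'humanities': 100.0} 75.0
```

Whether the published leaderboard average is weighted by question count or taken per field is not specified here; the unweighted mean above is just one plausible choice.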

While the number of LLMs remains limited compared to the global landscape, Vietnamese-trained models have achieved high rankings on the VMLU leaderboard, directly competing with models from “big brands” like Llama-3-70B (Meta) and Gemini (Google). Specifically, KiLM-13b-v24.7.1 (developed by Zalo AI) has risen to fifth place among from-scratch models with an average score of 66.07, above GPT-4’s 65.53. Another Vietnamese model, ViGPT-1.6B-v1 (developed by VinBigData), is also in the Top 15 from-scratch models.

In the fine-tuned models ranking, domestic LLMs took 9 of the Top 10 positions, highlighting the current trend in Vietnamese LLM development: leveraging open LLMs and adapting them to specific problems and domain expertise. The steadily rising scores of these models demonstrate commendable efforts toward localizing LLMs for Vietnamese users.

The rapid progress of Vietnamese-trained LLMs on the VMLU leaderboard marks a significant step forward for AI development in Vietnam. Despite challenges in resources and infrastructure, the growing participation of research teams, universities, and enterprises reflects a strong commitment to advancing generative AI. As local models continue to improve and compete with global counterparts, Vietnam’s AI community is poised for even greater breakthroughs.