Can machines imagine? Critical thinking and cultural reasoning in multimodal-multilingual AI
Abstract
Effective communication across languages and cultures is essential in today’s interconnected world. Multimodal-multilingual language models (MMMLMs) aim to advance this goal by integrating text, speech, and visual understanding across diverse linguistic contexts. This study evaluates four leading MMMLMs (GIT, mPLUG, CLIP, and Whisper + GPT-4V) on cross-lingual and cross-modal tasks, including image captioning, visual question answering, speech-to-image generation, and idiomatic translation. Performance was assessed in high-resource (English, Arabic), medium-resource (Malay), and low-resource (Macedonian) settings. Results show strong performance in structured tasks but notable limitations in cultural reasoning, figurative language interpretation, and semantic grounding in low-resource environments. GIT delivered the most consistent multilingual results, while Whisper + GPT-4V excelled in fluency yet lacked cultural sensitivity. To address these gaps, the study proposes culturally informed evaluation protocols that integrate quantitative metrics such as BLEU, CIDEr, and F1 with qualitative, community-centered approaches. These include cross-cultural annotation panels, inter-rater reliability validation using Cohen’s kappa, and a novel “cultural fidelity” metric to measure alignment with culturally specific norms. The findings emphasize the need for inclusive datasets, ethical development, and interdisciplinary collaboration to ensure MMMLMs support equitable and culturally aware global communication.
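As an illustration of the inter-rater reliability check mentioned in the abstract, the following is a minimal sketch of Cohen’s kappa for two annotators, using the standard definition (observed agreement corrected for chance agreement). The function name `cohens_kappa`, the label values, and the sample ratings are hypothetical and are not taken from the study; the abstract does not describe how the authors computed the statistic.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's label
    frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty item set")

    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: sum over labels of the product of marginal proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    # Degenerate case: both raters used a single identical label throughout.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1 - expected)


# Hypothetical judgements of six model outputs by two annotators
# ("ok" = culturally appropriate, "off" = not appropriate).
annotator_1 = ["ok", "ok", "off", "ok", "off", "off"]
annotator_2 = ["ok", "off", "off", "ok", "off", "ok"]

print(round(cohens_kappa(annotator_1, annotator_2), 3))  # prints 0.333
```

In this made-up example the annotators agree on four of six items (0.667) against a chance level of 0.5, giving a kappa of about 0.33. A panel with more than two annotators would need a different statistic, such as Fleiss’ kappa.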
Keywords
Cross-cultural communication; Ethical AI; Low-resource languages; Multilingual AI; Multimodal language models
DOI: http://doi.org/10.11591/ijict.v15i2.pp823-838
Copyright (c) 2026 Mohammad Awad AlAfnan, Siti Fatimah MohdZuki, Shefa Mohammad AlAfnan

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
The International Journal of Informatics and Communication Technology (IJ-ICT)
p-ISSN 2252-8776, e-ISSN 2722-2616
This journal is published by the Intelektual Pustaka Media Utama (IPMU).