Abstract
Sentiment analysis (SA) has gained much traction In the field of artificial intelligence (AI) and natural language processing (NLP). There is growing demand to automate analysis of user sentiment towards products or services. Opinions are increasingly being shared online in the form of videos rather than text alone. This has led to SA using multiple modalities, termed Multimodal Sentiment Analysis (MSA), becoming an important research area. MSA utilises latest advancements in machine learning and deep learning at various stages including for multimodal feature extraction and fusion and sentiment polarity detection, with aims to minimize error rate and improve performance. This survey paper examines primary taxonomy and newly released multimodal fusion architectures. Recent developments in MSA architectures are divided into ten categories, namely early fusion, late fusion, hybrid fusion, model-level fusion, tensor fusion, hierarchical fusion, bi-modal fusion, attention-based fusion, quantum-based fusion and word-level fusion. A comparison of several architectural evolutions in terms of MSA fusion categories and their relative strengths and limitations are presented. Finally, a number of interdisciplinary applications and future research directions are proposed.
View more >>