Paper Title

A SURVEY OF TRANSFORMER-DRIVEN HYBRID DEEP LEARNING METHODS FOR MULTIMODAL SENTIMENT ANALYSIS IN NLP

Keywords

  • Unimodal sentiment analysis
  • Multimodal sentiment analysis
  • Fusion methodologies
  • Textual, visual, and audio/visual data

Publication Info

Volume: 12 | Issue: 4 | Pages: 37-47

Published On

September, 2025

Abstract

Sentiment analysis (SA) is a computational approach for identifying, extracting, and quantifying subjective information such as emotions, attitudes, and opinions. Traditional unimodal sentiment analysis relies on a single modality, typically text, whereas multimodal sentiment analysis (MSA) draws on additional sources such as speech, tone of voice, facial expressions, and body movements to capture a richer and more accurate representation of human emotion. This study reviews existing research on feature extraction and multimodal fusion methods, with emphasis on integrating textual, visual, and audio-visual data through transformer-based deep learning models. It traces the evolution and theoretical underpinnings of MSA and outlines its major advances, open challenges, and practical advantages. Finally, the paper identifies emerging trends and future research directions, and is intended as a reference for scholars and practitioners working on multimodal sentiment analysis.
