Paper Title

Multimodal Sentiment Analysis Using Transformer-Based Architectures: A Fusion of Text, Audio, and Visual Cues

Keywords

  • Multimodal Sentiment Analysis
  • Transformer Architectures
  • Text-Audio-Visual Fusion
  • Attention Mechanisms
  • Deep Learning

Article Type

Research Article

Publication Info

Volume: 5 | Issue: 2 | Pages: 1-7

Published On

July 2024

Abstract

Multimodal Sentiment Analysis (MSA) seeks to interpret human emotions by integrating textual, auditory, and visual data. Leveraging transformer-based architectures, this study introduces a novel framework that effectively fuses these modalities to enhance sentiment classification accuracy. The proposed model employs advanced fusion techniques and attention mechanisms to capture intricate inter-modal relationships. Evaluated on benchmark datasets such as CMU-MOSI and CMU-MOSEI, the model demonstrates superior performance compared to existing state-of-the-art methods, highlighting the efficacy of transformer-based multimodal fusion in sentiment analysis.
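
The abstract does not specify the fusion mechanism, so the following is only a minimal sketch of one common way attention can relate two modalities: scaled dot-product cross-modal attention, in which features from one modality (here, text tokens) query features from another (here, audio frames). The function names, the toy feature values, and the omission of learned projection matrices are all assumptions made for illustration; this is not the authors' model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention across modalities.

    queries: feature vectors from one modality (e.g. text tokens)
    keys, values: feature vectors from another modality (e.g. audio frames)
    Assumption: no learned projections are applied; a trained model would
    first map each input through learned query/key/value weight matrices.
    """
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(key dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the other modality's values.
        fused.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return fused


# Hypothetical toy features: 2 text tokens and 3 audio frames, dimension 2.
text = [[1.0, 0.0], [0.0, 1.0]]
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_attending_audio = cross_modal_attention(text, audio, audio)
```

Each row of `text_attending_audio` is an audio summary conditioned on one text token. A full tri-modal model would typically repeat this for other modality pairs (for example, text attending to visual features) and combine the results before classification.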