Abstract
Learning meaningful representations without large amounts of labeled data has become a cornerstone challenge in machine learning, especially in scenarios involving multimodal data and sparse annotation. This paper explores a hybrid approach that combines contrastive learning with generative self-supervised techniques for robust feature extraction in cross-modal settings under low-label regimes. Our proposed framework jointly optimizes cross-modal representation alignment and sample diversity, coupling contrastive objectives with latent reconstruction. Empirical evaluation on image-text and audio-visual datasets shows improved performance on downstream classification and transfer learning tasks. The findings support the potential of integrated self-supervision for scalable, data-efficient representation learning.
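A minimal sketch of the kind of joint objective the abstract describes: a symmetric InfoNCE-style contrastive term for cross-modal alignment combined with a latent-reconstruction term. This is not the authors' code; the class name `JointSelfSupervisedLoss`, the weighting `lambda_rec`, the temperature, and the use of an MSE reconstruction loss are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointSelfSupervisedLoss(nn.Module):
    """Contrastive alignment loss plus a weighted reconstruction loss (illustrative)."""
    def __init__(self, temperature: float = 0.07, lambda_rec: float = 1.0):
        super().__init__()
        self.temperature = temperature
        self.lambda_rec = lambda_rec

    def forward(self, z_a, z_b, recon, target):
        # z_a, z_b: (N, D) projected embeddings of paired modalities (e.g. image/text).
        # recon, target: decoder output and its reconstruction target.
        z_a = F.normalize(z_a, dim=-1)
        z_b = F.normalize(z_b, dim=-1)
        # Paired samples are positives; the rest of the batch act as negatives.
        logits = z_a @ z_b.t() / self.temperature          # (N, N) similarity matrix
        labels = torch.arange(z_a.size(0), device=z_a.device)
        contrastive = 0.5 * (F.cross_entropy(logits, labels)
                             + F.cross_entropy(logits.t(), labels))
        # Generative term: reconstruct the target from the shared latent code.
        reconstruction = F.mse_loss(recon, target)
        return contrastive + self.lambda_rec * reconstruction

# Toy usage:
# loss_fn = JointSelfSupervisedLoss()
# z_img, z_txt = torch.randn(8, 128), torch.randn(8, 128)
# recon, x = torch.randn(8, 256), torch.randn(8, 256)
# loss = loss_fn(z_img, z_txt, recon, x)
```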