M3-FRD: MULTIMODAL AND MULTILINGUAL FAKE REVIEW DETECTION: BRIDGING THE GAP IN LOW-RESOURCE AND CROSS-DOMAIN CONTEXTS USING TRANSFORMER-BASED DEEP LEARNING
Abstract
Online reviews have become something most of us trust almost without thinking. More than 80% of consumers in the U.S. let reviews influence what they buy (Hajek et al., 2023), which makes the next number unsettling: TripAdvisor flagged 1.3 million fake reviews in 2022 alone (TripAdvisor, 2023). That is one platform in one year, and the scale is hard to wrap your head around. What bothers us about current attempts to fix the problem is that almost every detection tool is built around English text and nothing else. These tools ignore images, star-rating patterns, and how a reviewer actually behaves on the platform, and they ignore the fact that millions of people leave reviews in Spanish, Arabic, Hindi, and dozens of other languages. Global commerce does not run on one language, so why would fake review detection? This paper is our attempt to take that problem seriously. We built a transformer-based architecture that jointly combines text, visual, and behavioral signals and that works even for languages with little labeled training data. We evaluated it on Yelp, Amazon, and a multilingual dataset we assembled ourselves, and the results were encouraging. Fusing these signals raised the detection F1-score by 6.3% over a text-only BERT baseline, and cross-lingual transfer cut the data requirements for underrepresented languages by around 40%. It is not a complete solution (nothing ever really is), but it is a more honest attempt at tackling the problem as it actually exists, not just the tidy, English-language version of it.
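To make the idea of fusing text, visual, and behavioral signals concrete, here is a minimal sketch. It is not the architecture evaluated in this paper: it swaps in plain feature concatenation followed by a logistic score, where the paper uses a transformer-based model. The function names (`fuse_features`, `fake_probability`), the vector sizes, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def fuse_features(text_vec: Sequence[float],
                  image_vec: Sequence[float],
                  behavior_vec: Sequence[float]) -> List[float]:
    """Concatenate per-modality feature vectors into one fused vector.

    Assumption: each modality has already been encoded into a
    fixed-length list of floats by some upstream encoder.
    """
    return list(text_vec) + list(image_vec) + list(behavior_vec)


def fake_probability(fused: Sequence[float],
                     weights: Sequence[float],
                     bias: float = 0.0) -> float:
    """Score a fused vector with a logistic (sigmoid) classifier."""
    if len(fused) != len(weights):
        raise ValueError("fused vector and weights must have equal length")
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy, made-up inputs (hypothetical values, not data from the paper).
text_vec = [0.2, -0.1]      # stand-in for a text embedding
image_vec = [0.5]           # stand-in for an image embedding
behavior_vec = [3.0, 0.0]   # stand-in for behavioral features

fused = fuse_features(text_vec, image_vec, behavior_vec)
score = fake_probability(fused, weights=[0.0] * len(fused))
```

With all-zero weights the score is 0.5 by construction. In practice the weights would be learned from labeled reviews, and a real system would replace concatenation with a learned fusion layer.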