Abstract
In immersive virtual communications, accurate facial expression mapping is pivotal for emotional presence and realism. This paper proposes a novel sender-side AI-driven facial landmark generation framework aimed at optimizing expression mapping in real-time virtual avatars. By deploying lightweight deep learning models at the sender’s device, our system ensures privacy, reduces latency, and eliminates the need for transmitting raw video. We present an end-to-end architecture incorporating CNN-based landmark detection, temporal expression encoding, and real-time avatar synchronization. Experimental results demonstrate robust expression fidelity across platforms, even under constrained computational conditions. This approach paves the way for scalable, expressive metaverse communication.
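The sender-side pipeline the abstract describes (per-frame landmark detection, temporal smoothing, and transmission of a compact landmark packet instead of raw video) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the detector is a stub standing in for the CNN model, the exponential moving average stands in for the temporal expression encoder, and all names, the landmark count, and the smoothing factor are assumptions.

```python
import json
import random

NUM_LANDMARKS = 68  # common facial landmark count (assumption; the paper's count is unspecified)

def detect_landmarks(frame):
    # Stub for the CNN-based landmark detector: returns normalized (x, y) pairs.
    # A real sender would run a lightweight on-device model here.
    rng = random.Random(len(frame))
    return [(rng.random(), rng.random()) for _ in range(NUM_LANDMARKS)]

def temporal_smooth(prev, curr, alpha=0.6):
    # Exponential moving average as a simple stand-in for temporal expression encoding,
    # damping per-frame jitter before avatar synchronization.
    if prev is None:
        return curr
    return [(alpha * cx + (1 - alpha) * px, alpha * cy + (1 - alpha) * py)
            for (px, py), (cx, cy) in zip(prev, curr)]

def encode_packet(landmarks, frame_id):
    # Serialize only the landmarks: the compact packet replaces raw-video transmission,
    # which is what preserves privacy and reduces bandwidth on the sender side.
    return json.dumps({"frame": frame_id,
                       "pts": [[round(x, 4), round(y, 4)] for x, y in landmarks]})

# Mock raw RGB frames (640x480x3 bytes each) standing in for camera input.
frames = [b"\x00" * (640 * 480 * 3) for _ in range(3)]
prev = None
for i, frame in enumerate(frames):
    landmarks = detect_landmarks(frame)
    prev = temporal_smooth(prev, landmarks)
    packet = encode_packet(prev, i)

print(len(frames[0]), len(packet))  # the landmark packet is far smaller than a raw frame
```

The size comparison in the final line illustrates the bandwidth argument: only a few hundred bytes of landmark data leave the device per frame, rather than the full video frame.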