A Comparison of Time-based Models for Multimodal Emotion Recognition  
Authors (1)
Asst. Prof. Dr. Senem TANBERK
Huawei R&D İstanbul, Türkiye
Abstract
Emotion recognition has become an important research topic in human-computer interaction. Studies that use audio and video to understand emotions have focused mainly on analyzing facial expressions and classifying six basic emotions. This study compares the performance of sequence models frequently used in the literature on multimodal emotion recognition problems. The audio and images were first processed by multi-layered CNN models, and the outputs of these models were fed into various sequence models: GRU, Transformer, LSTM, and Max Pooling. Accuracy, precision, and harmonic and macro F1 scores were calculated for all models. The multimodal CREMA-D dataset was used in the experiments. In the comparison on CREMA-D, the GRU-based architecture achieved the best harmonic F1 score, 0.640, LSTM-based …
Keywords
Paper Type Conference Paper
Paper Subtype Full-Text Published Paper (International Congress/Symposium)
Paper Classification Peer-Reviewed International Congress/Symposium in Its Field
Paper Language English
Conference Name 2023 Innovations in Intelligent Systems and Applications Conference (ASYU)
Conference Date 11-10-2023 / 11-10-2023
Country of Publication Türkiye
City of Publication
Citation Counts
