Deep Learning for Videoconferencing: A Brief Examination of Speech to Text and Speech Synthesis  
Authors (1)
Asst. Prof. Dr. Senem TANBERK
Orion TR, Türkiye
Abstract
The integration of AI-powered applications into video conferencing systems is expected to grow rapidly across industrial scenarios. In the video conferencing industry, deep learning techniques are improving the quality of communication in use cases such as resolution improvement, background noise reduction, video compression, face alignment, transcription, and speech synthesis. This paper reviews recent work on deep learning-based transcription and speech synthesis methods, classified into three categories: speech to text, text to speech, and speech to text to speech. We include experimental studies of two specific methods for speech to text and speech synthesis. Experimental results for two state-of-the-art pre-trained models on various test scenarios are also analyzed. Finally, the future development trend of AI-powered video conferencing system …
Keywords
Paper Type Conference Paper
Paper Subtype Full-Text Published Paper (International Congress/Symposium)
Paper Category Peer-Reviewed International Congress/Symposium in Its Field
Paper Language English
Conference Name 2021 6th International Conference on Computer Science and Engineering (UBMK)
Conference Date 15-09-2021 / 15-09-2021
Country of Publication Türkiye
City of Publication
UN Sustainable Development Goals
Citation Counts
Google Scholar 9
