Deep Learning for Videoconferencing: A Brief Examination of Speech to Text and Speech Synthesis

TANBERK, SENEM

Authors (1)
Asst. Prof. Dr. Senem TANBERK. Affiliation: Faculty of Engineering and Architecture, Department of Software Engineering. Contact information: Orion TR, Türkiye

Abstract

The integration of AI-powered applications into video conferencing systems is expected to expand rapidly across a range of industrial scenarios. In the video conferencing industry, deep learning techniques are improving communication quality in use cases such as resolution improvement, background noise reduction, video compression, face alignment, transcription, and speech synthesis. This paper reviews recent work on deep learning-based transcription and speech synthesis methods, classified into three categories: speech to text, text to speech, and speech to text to speech. We include experimental studies of two specific methods in speech to text and speech synthesis, and analyze the experimental results of two state-of-the-art pre-trained models on various test scenarios. Finally, the future development trend of AI-powered video conferencing system …

Paper Type	Conference Paper
Paper Subtype	Paper Published in Full Text (International Congress/Symposium)
Paper Status	Peer-Reviewed International Congress/Symposium in Its Field
Paper Language	English
Conference Name	2021 6th International Conference on Computer Science and Engineering (UBMK)
Conference Date	15-09-2021 / 15-09-2021
Country of Publication	Türkiye
City of Publication

Citation Counts
Google Scholar	9
