| Authors (1) |
Asst. Prof. Dr. Senem TANBERK
Orion TR, Türkiye |
| Abstract |
| The integration of AI-powered applications into video conferencing systems is expected to grow rapidly across a range of industrial scenarios. In the modern video conferencing industry, deep learning techniques are improving the quality of communication in use cases such as resolution improvement, background noise reduction, video compression, face alignment, transcription, and speech synthesis. This paper reviews the latest work on deep learning-based transcription and speech synthesis methods. The methods are classified into three categories: speech to text, text to speech, and speech to text to speech. We include experimental studies of two specific methods in speech to text and speech synthesis. Experimental results for two state-of-the-art pre-trained models across various test scenarios are also analyzed. Finally, the future development trend of AI-powered video conferencing system … |
| Keywords |
| Paper Type | Conference Paper |
| Paper Subtype | Full-Text Published Paper (International Congress/Symposium) |
| Paper Status | Peer-Reviewed International Congress/Symposium in Its Field |
| Paper Language | English |
| Conference Name | 2021 6th International Conference on Computer Science and Engineering (UBMK) |
| Conference Date | 15-09-2021 / 15-09-2021 |
| Country of Publication | Türkiye |
| City of Publication |