Resume Information Extraction via Post-OCR Text Processing  
Authors (1)
Asst. Prof. Dr. Senem TANBERK
Huawei R&D İstanbul, Türkiye
Abstract
Information extraction (IE), one of the main tasks of natural language processing (NLP), has recently gained importance in the processing of resumes. Studies that extract information from CV text have generally relied on sentence classification with NLP models. This study aims to extract information by classifying all of the text groups in a resume after preprocessing steps such as Optical Character Recognition (OCR) and object recognition with the YOLOv8 model. The text dataset consists of 286 resumes collected for 5 different (education, experience, talent, personal, and language) job descriptions in the IT industry. The dataset created for object recognition consists of 1198 resumes, which were collected from open-source datasets and labeled as sets of text. BERT, BERT-t, DistilBERT, RoBERTa, and XLNet were used as models. F1 score variances were used to compare the model results. In addition …
Keywords
Paper Type Proceedings/Conference Paper
Paper Subtype Paper Published in Full Text (International Congress/Symposium)
Paper Category Peer-Reviewed International Congress/Symposium in Its Field
Paper Language English
Conference Name 2023 Innovations in Intelligent Systems and Applications Conference (ASYU)
Conference Date 11-10-2023 / 11-10-2023
Country of Publication Türkiye
City of Publication
UN Sustainable Development Goals
Citation Counts
Google Scholar 5
