TY  - THES
T1  - GenreNet: a web-based book genre classification through dual color space input with Vision Transformers
A1  - Gabiana, Melben Kian R.
A2  - Yusiong, John Paul T.
LA  - English
UL  - https://tuklas.up.edu.ph/Record/UP-8027390931312009977
AB  - Book covers help readers navigate the vast range of book types that exist. This research introduces a new approach to book genre classification using Vision Transformers (ViT), dual color space inputs, and Transformer models. Book genre classification is a multimodal problem that uses both images and text as inputs. Previous studies have shown that models that take both text and images as input yield better results. However, the performance of image-based classification, especially the classification of book covers, remains worse than that of text-based classification. This study proposes GenreNet, a web-based application that aims to address this gap with a multimodal model combining a Vision Transformer, which classifies book genres from book covers using dual color space images, and a BERT model, which classifies genres from book titles. The model, fine-tuned on the Book-Cover 30 dataset, outperforms state-of-the-art models, achieving 65.14% top-1 accuracy and 84.74% top-3 accuracy. The results show that the proposed model is effective in classifying book genres from book covers using dual color space inputs and Transformer models.
CN  - LG 993.5 2025 C66 G26
KW  - Machine learning.
ER  -