TY - THES
T1 - CoVirNet: a multimodal web-based book genre classification system using a dual-color-space input CNN-ViT architecture
A1 - Acompañado, Emiline Barcent Jloise S.
A2 - Yusiong, John Paul T.
LA - English
UL - https://tuklas.up.edu.ph/Record/UP-8027390931312009872
AB - Book genres are not always clearly defined, and classifying them based solely on visual or textual patterns can be unreliable. While recent models attempt to improve genre classification by combining both cues, key limitations remain in how input representations and model architectures are designed. This paper proposes CoVirNet, a multimodal classification system that predicts a book's genre using its cover image and title. The model processes two color space representations of the image through a hybrid architecture, where a Convolutional Neural Network (CNN) and a Vision Transformer (ViT) operate in parallel, while the book title is processed using a BERT-based encoder. Evaluated on the BookCover30 dataset, CoVirNet outperforms both the state-of-the-art results and its baseline variants, achieving 65.23% Top-1 accuracy and 84.70% Top-3 accuracy. These results underscore the benefits of color space fusion, architectural hybridization, and deep text modeling in improving multimodal book genre classification.
CN - LG 993.5 2025 C66 A36
KW - Machine learning.
ER -