A Comparative Study of Text Classification Performance Using NBC and KNN with N-Gram Features in E-Government Services
DOI:
https://doi.org/10.32664/icobits.v1.136
Keywords:
Text Classification, N-Gram Features, Naïve Bayes Classifier (NBC), K-Nearest Neighbor (KNN), E-Government Feedback Analysis
Abstract
Sambat Online is a platform developed by the Malang City Government to collect feedback and suggestions from the public as part of its efforts to improve government service quality. For citizen feedback to reach the relevant departments, it must be categorized accurately, which requires a reliable text classification method. This study investigates the use of N-Gram features with two widely used text classification algorithms: the Naïve Bayes Classifier (NBC) and K-Nearest Neighbor (KNN). It analyzes and compares the performance of different N-Gram models (unigram, bigram, and trigram), both individually and in combination, for classifying textual data obtained from the Sambat Online system. Experimental results show that incorporating N-Gram features significantly improves the accuracy of both NBC and KNN. Of the two classifiers, NBC performed consistently better across the feature combinations tested. The highest classification accuracy was obtained when all three N-Gram types were combined: 98.67% for NBC and 97.17% for KNN. These findings indicate that N-Gram features can effectively enhance text classification performance in e-government applications, particularly when used with NBC.
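The general idea of combining unigram, bigram, and trigram features and passing them to NBC and KNN can be illustrated with a minimal sketch. It is not the implementation used in the study: the whitespace tokenizer, the Laplace-smoothed multinomial NBC, the cosine-similarity KNN with k = 3, the function names, and the four English sample complaints are all assumptions made for illustration only.

```python
import math
from collections import Counter

def ngram_features(text, n_values=(1, 2, 3)):
    """Count word n-grams for every n in n_values (combined feature set)."""
    tokens = text.lower().split()
    feats = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats

def train_nbc(docs, labels, n_values=(1, 2, 3)):
    """Collect class frequencies, per-class n-gram counts, and the vocabulary."""
    class_counts = Counter(labels)
    feat_counts = {c: Counter() for c in class_counts}
    for text, label in zip(docs, labels):
        feat_counts[label].update(ngram_features(text, n_values))
    vocab = set().union(*feat_counts.values())
    return class_counts, feat_counts, vocab

def predict_nbc(model, text, n_values=(1, 2, 3)):
    """Multinomial Naive Bayes with Laplace (add-one) smoothing, in log space."""
    class_counts, feat_counts, vocab = model
    total_docs = sum(class_counts.values())
    feats = ngram_features(text, n_values)
    best_label, best_score = None, -math.inf
    for c in class_counts:
        score = math.log(class_counts[c] / total_docs)  # log prior
        denom = sum(feat_counts[c].values()) + len(vocab)
        for f, k in feats.items():
            if f in vocab:  # n-grams never seen in training are skipped
                score += k * math.log((feat_counts[c][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = c, score
    return best_label

def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(v * b[f] for f, v in a.items() if f in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_knn(docs, labels, text, k=3, n_values=(1, 2, 3)):
    """Majority vote among the k training documents most similar to text."""
    query = ngram_features(text, n_values)
    scored = sorted(
        ((cosine(query, ngram_features(d, n_values)), lab)
         for d, lab in zip(docs, labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(lab for _, lab in scored[:k])
    return votes.most_common(1)[0][0]

# Hypothetical sample complaints; not taken from the Sambat Online data.
docs = [
    "street light broken near the market",
    "street light not working at night",
    "garbage not collected this week",
    "garbage pile near the river",
]
labels = ["lighting", "lighting", "sanitation", "sanitation"]

model = train_nbc(docs, labels)
print(predict_nbc(model, "street light broken at night"))
print(predict_knn(docs, labels, "garbage pile not collected"))
```

Passing `n_values=(1,)`, `(2,)`, or `(3,)` restricts the sketch to a single N-Gram model, which mirrors the individual-versus-combined comparison described above.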
License
Copyright (c) 2025 ICoBITS

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.





