A Comparative Study of Text Classification Performance Using NBC and KNN with N-Gram Features in E-Government Services

Authors

DOI:

https://doi.org/10.32664/icobits.v1.136

Keywords:

Text Classification, N-Gram Features, Naïve Bayes Classifier (NBC), K-Nearest Neighbor (KNN), E-Government Feedback Analysis

Abstract

Sambat Online is a platform developed by the Malang City Government to collect feedback and suggestions from the public as part of its efforts to improve the quality of government services. Routing each piece of citizen feedback to the relevant department requires an accurate text classification method. This study investigates the use of N-Gram features with two widely used text classification algorithms: the Naïve Bayes Classifier (NBC) and K-Nearest Neighbor (KNN). It analyzes and compares the performance of three N-Gram models (unigram, bigram, and trigram), both individually and in combination, on textual data obtained from the Sambat Online system. Experimental results show that incorporating N-Gram features significantly improves the accuracy of both classifiers, and that NBC performs consistently better than KNN across multiple feature combinations. The highest accuracy was obtained when all three N-Gram types were combined: 98.67% for NBC and 97.17% for KNN. These findings indicate that N-Gram features can effectively improve text classification performance in e-government applications, particularly when used with NBC.
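The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word-level tokenization, the add-one smoothing, the cosine similarity measure, the value k=3, the English example sentences, and the category labels "roads" and "waste" are all assumptions made for illustration, and the toy data is invented rather than taken from Sambat Online.

```python
import math
from collections import Counter, defaultdict

def ngram_features(text, n_values=(1, 2, 3)):
    """Count word n-grams for each n (1 = unigram, 2 = bigram, 3 = trigram)."""
    tokens = text.lower().split()
    feats = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed variant)."""
    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for feats, label in zip(docs, labels):
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total = sum(self.feat_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for feat, count in feats.items():
                if feat not in self.vocab:
                    continue  # ignore n-grams never seen in training
                p = (self.feat_counts[label][feat] + 1) / (total + len(self.vocab))
                score += count * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_predict(train_docs, train_labels, feats, k=3):
    """Majority vote among the k training documents most similar to feats."""
    ranked = sorted(zip(train_docs, train_labels),
                    key=lambda pair: cosine(feats, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Invented toy data; the labels are hypothetical department categories.
train = [
    ("the road near the market is full of holes", "roads"),
    ("street lights on the main road are broken", "roads"),
    ("the road bridge needs repair", "roads"),
    ("garbage has not been collected for a week", "waste"),
    ("the garbage bins are overflowing", "waste"),
    ("waste is piling up near the river", "waste"),
]
train_docs = [ngram_features(text) for text, _ in train]
train_labels = [label for _, label in train]
nb = NaiveBayes().fit(train_docs, train_labels)

query = ngram_features("the road is full of holes")
print("NBC:", nb.predict(query))
print("KNN:", knn_predict(train_docs, train_labels, query))
```

Passing `n_values=(1,)`, `(2,)`, or `(1, 2)` to `ngram_features` gives the individual and partially combined feature sets, which is one way the configurations compared in the study could be reproduced on a labelled dataset.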

Published

21-01-2026