Improving SpaCy-based NER Accuracy using Conditional Random Fields (CRF) as Advanced Features for Javanese Legends

Authors

  • Kevin Dwi Mahendra
  • Danang Arbian Sulistyo, Institut Teknologi dan Bisnis Asia Malang

DOI:

https://doi.org/10.32664/icobits.v1.57

Keywords:

Named Entity Recognition (NER), Conditional Random Fields (CRF), SpaCy, Javanese Legends

Abstract

Named Entity Recognition (NER) is a crucial task for information extraction, particularly for preserving the rich cultural data within Javanese legends. However, standard NER frameworks like SpaCy can face limitations when processing languages with unique linguistic characteristics. This research addresses this gap by exploring the effectiveness of integrating Conditional Random Fields (CRF) as an advanced feature extraction layer to enhance a SpaCy-based NER model. The proposed hybrid model leverages CRF's strength in sequence labeling to improve the contextual understanding of entities within Javanese narratives. Experimental results demonstrate a significant performance increase, with the model achieving a precision of 0.8923, recall of 0.8678, and an overall F1-score of 0.8803. This study confirms that augmenting SpaCy with a CRF layer provides a robust solution for improving NER accuracy on Javanese texts. Future work could involve incorporating more complex contextual embeddings or applying this model to other genres of traditional Indonesian literature to further validate its effectiveness and adaptability.
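
As a rough illustration of the sequence-labeling idea mentioned in the abstract, the sketch below builds per-token feature dictionaries of the kind a CRF tagger commonly consumes (the token itself, its shape, and its immediate neighbors). It is an assumption-laden example, not the authors' feature set: the function names `token_features` and `sentence_features`, the chosen features, and the sample sentence are all hypothetical, and the abstract does not specify how the CRF output is combined with SpaCy.

```python
# Illustrative sketch only: hand-crafted token features of the kind a CRF
# sequence labeler typically consumes. This is NOT the paper's feature set.


def token_features(tokens, i):
    """Return a feature dict for tokens[i], including neighbor context."""
    word = tokens[i]
    feats = {
        "lower": word.lower(),
        "is_title": word.istitle(),   # capitalized words often start names
        "is_upper": word.isupper(),
        "is_digit": word.isdigit(),
        "prefix2": word[:2].lower(),
        "suffix3": word[-3:].lower(),
    }
    # Context from the previous token, or a beginning-of-sentence flag.
    if i > 0:
        prev = tokens[i - 1]
        feats["prev_lower"] = prev.lower()
        feats["prev_is_title"] = prev.istitle()
    else:
        feats["BOS"] = True
    # Context from the next token, or an end-of-sentence flag.
    if i < len(tokens) - 1:
        nxt = tokens[i + 1]
        feats["next_lower"] = nxt.lower()
        feats["next_is_title"] = nxt.istitle()
    else:
        feats["EOS"] = True
    return feats


def sentence_features(tokens):
    """Build one feature dict per token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Hypothetical example sentence (not taken from the paper's corpus).
sentence = ["Jaka", "Tarub", "manggon", "ing", "desa", "Tarub"]
features = sentence_features(sentence)
```

Feature dictionaries in this shape are what CRF toolkits usually accept as input, one list per sentence paired with a list of entity labels. How such features or CRF predictions would be passed into a SpaCy pipeline is a design choice the abstract leaves open.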

Published

14-01-2026