Multilabel Aspect-Based Emotion Analysis on IKD Application Reviews: The Effect of Focal Loss and Threshold Tuning Using IndoBERT
DOI:
https://doi.org/10.70340/jirsi.v5i2.448
Keywords:
Aspect-Based Emotion Analysis, IndoBERT, Focal Loss, Threshold Tuning, Multilabel Classification
Abstract
User reviews of the Identitas Kependudukan Digital (IKD) application express a range of emotions toward different service aspects. Beyond indicating satisfaction with the service, they capture user experiences, complaints, expectations, and public perceptions of system quality. This study develops a multi-label Aspect-Based Emotion Analysis (ABEA) model using an end-to-end IndoBERT architecture to identify user emotions for each service aspect of the IKD application. It also analyzes the effect of Focal Loss and threshold tuning on classification performance under highly imbalanced label distributions. A total of 13,197 user reviews posted on the Google Play Store between June 2024 and November 2025 were collected by web scraping, then cleaned and filtered to 6,891 entries. Service aspects were identified empirically using BERTopic. Labeling was carried out by three human annotators and two AI annotators, with final labels determined by majority vote. The model was developed across six experimental scenarios that varied preprocessing, Focal Loss, threshold tuning, and data split ratios. Evaluation used macro F1, micro F1, precision, recall, and Hamming loss. BERTopic achieved a coherence score of 0.6196 and a topic diversity of 0.92 with five representative aspects. The best model used Focal Loss, a threshold of 0.4, and a 60:20:20 split ratio, achieving a macro F1 of 0.3916 (a 24.1% increase over the baseline), a micro F1 of 0.9134, and a recall of 0.9423. The selected model was integrated into a web-based system built with the Flask framework to visualize the classification results.
Anger dominated reviews concerning the Login & Akses Akun and Scan Barcode ke Dukcapil aspects, whereas the Dokumen & Layanan Digital aspect recorded the highest level of joy. The combination of Focal Loss and threshold tuning proved effective in handling imbalanced label distributions in Indonesian multi-label ABEA classification.
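As a rough illustration of the two techniques named in the abstract, the sketch below computes a binary focal loss over multi-label targets and picks a decision threshold by macro F1 on held-out predictions. It is a NumPy-only sketch under stated assumptions, not the authors' IndoBERT training code: the values of `gamma` and `alpha`, the candidate threshold grid, the selection criterion, and the toy arrays are all hypothetical, since the abstract reports only the chosen threshold of 0.4.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Mean focal loss over all (sample, label) pairs.

    gamma and alpha are assumed defaults, not values from the paper.
    """
    p = sigmoid(logits)
    # p_t is the probability the model assigns to the true state of each label.
    p_t = np.where(targets == 1, p, 1.0 - p)
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    eps = 1e-12  # guards against log(0)
    # (1 - p_t)^gamma down-weights pairs the model already classifies well.
    loss = -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t + eps)
    return float(loss.mean())

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1; labels with no support score 0."""
    tp = np.sum((y_true == 1) & (y_pred == 1), axis=0)
    fp = np.sum((y_true == 0) & (y_pred == 1), axis=0)
    fn = np.sum((y_true == 1) & (y_pred == 0), axis=0)
    denom = 2 * tp + fp + fn
    f1 = np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)
    return float(f1.mean())

def tune_threshold(probs, y_true, candidates=(0.3, 0.4, 0.5, 0.6)):
    """Return the candidate threshold with the highest macro F1, plus all scores."""
    scores = {t: macro_f1(y_true, (probs >= t).astype(int)) for t in candidates}
    best = max(scores, key=scores.get)
    return best, scores

# Toy validation data (hypothetical): 4 reviews, 2 labels.
y_val = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
p_val = np.array([[0.45, 0.10], [0.80, 0.42], [0.20, 0.90], [0.10, 0.35]])

best_t, all_scores = tune_threshold(p_val, y_val)
print("best threshold:", best_t)
print("macro F1 per threshold:", all_scores)

# Focal loss on toy logits for the same targets.
toy_logits = np.log(p_val / (1.0 - p_val))
print("focal loss:", binary_focal_loss(toy_logits, y_val))
```

On rare labels, positives often receive probabilities just under 0.5, so lowering the threshold tends to raise recall, which macro F1 rewards because every label counts equally. The threshold should be selected on a validation split and then applied unchanged to the test split.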
License
Copyright (c) 2026 Viviana Purba, Eka Dyar Wahyuni, Tri Luhur Indayanti Sugata

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.







