Application of Natural Language Processing Models (LDA-BOW & Word2Vec) and Sankey Diagrams in Supply Chain Analysis in the Field of Taxation
DOI:
https://doi.org/10.52869/ad83d368
Keywords:
natural language processing, supply chain analysis, machine learning, Sankey diagram, tax analysis
Abstract
The growth of digital innovation and big data has increased the need to apply machine learning across many fields. This study integrates existing tax analysis practices with machine learning techniques. It aims to improve the efficiency and accuracy of supply chain analysis in the tax domain by applying Natural Language Processing (NLP) models alongside Sankey diagram visualizations. The NLP models are Latent Dirichlet Allocation on a bag-of-words representation (LDA-BOW) and the Word2Vec algorithm, which identify and extract transactions through topic modeling and semantic similarity. The models are implemented within the CRISP-DM methodological framework. In this application, 6.8 million transactions of PKP (Pengusaha Kena Pajak, VAT-registered taxable entrepreneurs) in the pharmaceutical sector for 2022 were classified at a rate of 73.7%, with a 19% improvement in accuracy after Word2Vec was integrated. Sankey diagrams are used to visualize the flow of transactions intuitively, so that users can pinpoint the points in the supply chain where tax-related risks or discrepancies are higher. For the supply chain analysis, the authors adopt the Supply Chain Operations Reference (SCOR) model, focusing on the reliability and cost aspects, which align closely with tax compliance evaluation. The findings are expected to yield a prototype application that streamlines the audit process for tax authorities and to contribute to the text-mining literature in the field of taxation.
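As a rough illustration of how classified transactions could feed a Sankey diagram, the sketch below aggregates transaction records into the node and link lists that such a diagram is built from. It is not the authors' implementation: the supply chain roles, the amounts, and the function names `build_sankey_links` and `node_imbalance` are hypothetical, chosen only for this example.

```python
from collections import defaultdict

# Hypothetical transactions: (seller role, buyer role, amount).
# These records are invented for illustration and are not data from the study.
transactions = [
    ("Manufacturer", "Distributor", 500.0),
    ("Manufacturer", "Distributor", 250.0),
    ("Distributor", "Pharmacy", 400.0),
    ("Distributor", "Hospital", 300.0),
    ("Manufacturer", "Hospital", 50.0),
]


def build_sankey_links(records):
    """Aggregate (source, target, amount) records into Sankey node and link lists."""
    # Sum the amounts of all transactions that share the same source and target.
    totals = defaultdict(float)
    for source, target, amount in records:
        totals[(source, target)] += amount

    # Give each node label an integer index in order of first appearance.
    labels = []
    index = {}
    for source, target in totals:
        for name in (source, target):
            if name not in index:
                index[name] = len(labels)
                labels.append(name)

    links = {
        "source": [index[s] for s, _ in totals],
        "target": [index[t] for _, t in totals],
        "value": list(totals.values()),
    }
    return labels, links


def node_imbalance(labels, links):
    """Return inflow minus outflow per node (an assumed, simplistic risk signal)."""
    balance = {name: 0.0 for name in labels}
    for s, t, v in zip(links["source"], links["target"], links["value"]):
        balance[labels[s]] -= v  # value leaving the source node
        balance[labels[t]] += v  # value arriving at the target node
    return balance


labels, links = build_sankey_links(transactions)
balance = node_imbalance(labels, links)
```

The `labels` list and the `source`, `target`, and `value` lists follow the layout that Sankey plotting tools commonly take, for example the `node.label` and `link.source`, `link.target`, `link.value` fields of Plotly's `go.Sankey`. The inflow-minus-outflow figure for intermediate nodes is only one possible way to surface a discrepancy and is an assumption of this sketch.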
License
Copyright (c) 2026 Scientax: Jurnal Kajian Ilmiah Perpajakan Indonesia

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.







