A Comparative Study of Dimensionality Reduction Techniques to Enhance Trace Clustering Performances

Yang, Hanna
Song, Minseok
Issue Date
Graduate School of UNIST, Master thesis
Process mining aims to extract useful information from event logs. Organizations such as high-tech companies, hospitals, and municipalities have recently adopted process mining techniques to improve their processes. Real-life process logs from such organizations are usually very large and complicated, since they generally contain numerous activities executed by many employees. Furthermore, many real-life process logs yield spaghetti-like process models because of the complexity of the underlying processes. Traditional process mining techniques have difficulty discovering and analyzing real-life process logs that come from less structured processes. To overcome these weaknesses, trace clustering has been developed. Trace clustering splits an event log into several subsets, each containing homogeneous cases. Although trace clustering is useful for handling complex process logs, it is time-consuming and computationally expensive because of the large number of features generated from complex logs. In this thesis, we applied dimensionality reduction techniques as a preprocessing step for trace clustering in order to reduce the number of features. To validate our approach, we conducted experiments to discover relationships between dimensionality reduction techniques and clustering algorithms, and we performed a case study involving the patient treatment processes of a hospital. Among the many dimensionality reduction techniques available, we used three: singular value decomposition (SVD), random projection, and principal component analysis (PCA). The results show that trace clustering with dimensionality reduction techniques produces higher average fitness values. Furthermore, the processing time of trace clustering is effectively reduced with dimensionality reduction techniques.
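The preprocessing idea described above can be illustrated with a minimal sketch. It is not the thesis implementation: the activity-count trace profile, the toy traces, the target dimension of 2, and all function names are assumptions made for illustration, and only NumPy is used. Each trace becomes a row of activity counts, and the three named techniques (SVD, PCA, random projection) each map that row to fewer features before any clustering algorithm is run.

```python
import numpy as np

def activity_profile(traces):
    """Build a count matrix: one row per trace, one column per activity (assumed profile)."""
    activities = sorted({a for t in traces for a in t})
    index = {a: j for j, a in enumerate(activities)}
    X = np.zeros((len(traces), len(activities)))
    for i, t in enumerate(traces):
        for a in t:
            X[i, index[a]] += 1
    return X, activities

def reduce_svd(X, k):
    """Truncated SVD: coordinates of each row along the top-k right singular vectors."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]

def reduce_pca(X, k):
    """PCA: the same projection, applied after centering every feature column."""
    return reduce_svd(X - X.mean(axis=0), k)

def reduce_random_projection(X, k, seed=0):
    """Gaussian random projection onto k dimensions (seed fixed for repeatability)."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], k)) / np.sqrt(k)
    return X @ R

# Hypothetical toy log loosely inspired by a patient treatment process.
traces = [
    ["register", "triage", "x-ray", "discharge"],
    ["register", "triage", "x-ray", "x-ray", "discharge"],
    ["register", "blood test", "surgery", "discharge"],
    ["register", "blood test", "blood test", "surgery", "discharge"],
]
X, activities = activity_profile(traces)
Z_svd = reduce_svd(X, 2)
Z_pca = reduce_pca(X, 2)
Z_rp = reduce_random_projection(X, 2)
print(X.shape, Z_svd.shape, Z_pca.shape, Z_rp.shape)
```

In this sketch the reduced matrices (`Z_svd`, `Z_pca`, `Z_rp`) would be passed to the clustering algorithm in place of `X`. The saving matters when the profile has thousands of columns rather than the six used here.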
Moreover, we measured the similarity between clustering results to observe how much the results change when dimensionality reduction techniques are applied. The similarity varies depending on the clustering algorithm used.
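One way to compare two clustering results over the same cases can be sketched as follows. The abstract does not name the similarity measure used in the thesis, so the Rand index here is a stand-in chosen for illustration, and the label lists are hypothetical. The index is the fraction of case pairs on which the two clusterings agree, that is, pairs placed together in both or apart in both.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of case pairs on which two clusterings agree (1.0 = identical partitions)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same cases")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # fewer than two cases: nothing to disagree on
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Hypothetical cluster labels for six cases, without and with dimensionality reduction.
baseline = [0, 0, 1, 1, 2, 2]
reduced = [1, 1, 0, 0, 2, 0]
print(rand_index(baseline, reduced))
```

The measure depends only on which cases are grouped together, not on the cluster label values, so relabeled but otherwise identical clusterings score 1.0.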
Technology Management/ Information System/ Entrepreneurship

