A Comparative Study of Dimensionality Reduction Techniques to Enhance Trace Clustering Performances

Author(s)
Yang, Hanna
Advisor
Song, Minseok
Issued Date
2012-08
URI
https://scholarworks.unist.ac.kr/handle/201301/82715 http://unist.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000001396432
Abstract
Process mining aims to extract useful information from event logs. Organizations such as high-tech companies, hospitals, and municipalities have recently adopted process mining techniques to improve their processes. Real-life process logs from such organizations are usually very large and complicated, since they generally contain numerous activities executed by many employees. Furthermore, many real-life process logs yield spaghetti-like process models because of the complexity of the underlying processes. Traditional process mining techniques have difficulty discovering and analyzing models from real-life logs that come from less structured processes. To overcome these weaknesses, trace clustering has been developed. Trace clustering splits an event log into several subsets, each containing homogeneous cases. Although trace clustering is useful for handling complex process logs, it is time-consuming and computationally expensive because of the large number of features generated from complex logs.
In this thesis, we applied dimensionality reduction techniques as a preprocessing step for trace clustering in order to reduce the number of features. To validate our approach, we conducted experiments to examine the relationships between dimensionality reduction techniques and clustering algorithms, and we performed a case study involving the patient treatment processes of a hospital. Among the many dimensionality reduction techniques available, we used three: singular value decomposition (SVD), random projection, and principal components analysis (PCA).
The results show that trace clustering with dimensionality reduction techniques produces higher average fitness values. Furthermore, the processing time of trace clustering is effectively reduced with dimensionality reduction techniques. We also measured the similarity between clustering results to observe how much the results change when dimensionality reduction techniques are applied. This similarity varies depending on the clustering algorithm used.
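As a rough illustration of the preprocessing idea described in the abstract, the sketch below builds an activity-count profile for each trace and reduces it with a Gaussian random projection before any clustering would take place. It is a minimal sketch under stated assumptions, not the thesis's implementation: the toy log, the activity-count profile, the target dimension, and the function names `activity_profile` and `random_projection` are all hypothetical. Only random projection is shown because it needs nothing beyond the Python standard library; SVD and PCA would normally rely on a linear algebra library.

```python
import random
from collections import Counter


def activity_profile(traces):
    """Turn each trace (a list of activity names) into a vector of activity counts."""
    activities = sorted({a for trace in traces for a in trace})
    vectors = []
    for trace in traces:
        counts = Counter(trace)
        vectors.append([float(counts[a]) for a in activities])
    return activities, vectors


def random_projection(vectors, k, seed=0):
    """Project d-dimensional vectors onto k dimensions with a Gaussian random matrix."""
    rng = random.Random(seed)
    d = len(vectors[0])
    scale = 1.0 / k ** 0.5  # keeps expected squared lengths comparable
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]
    return [
        [sum(v[i] * matrix[i][j] for i in range(d)) for j in range(k)]
        for v in vectors
    ]


# Hypothetical toy event log: one list of activities per case.
log = [
    ["register", "triage", "x-ray", "discharge"],
    ["register", "triage", "blood test", "consult", "discharge"],
    ["register", "triage", "x-ray", "discharge"],
    ["register", "consult", "blood test", "consult", "discharge"],
]

activities, profile = activity_profile(log)
reduced = random_projection(profile, k=3)
print(len(profile[0]), "->", len(reduced[0]))  # 6 features reduced to 3
```

The reduced vectors would then be passed to a clustering algorithm in place of the full profile. Identical traces still map to identical reduced vectors, so cases that were indistinguishable before the projection remain so afterwards.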
Publisher
Ulsan National Institute of Science and Technology (UNIST)
Degree
Master
Major
Graduate School of UNIST (by Program, 2012-2013) Technology Management/Information System/Entrepreneurship

