Related Researcher

Comuzzi, Marco
Intelligent Enterprise Lab.

A Language to Model and Simulate Data Quality Issues in Process Mining

Author(s)
Comuzzi, Marco; Ko, Jonghyeon; Maggi, Fabrizio Maria
Issued Date
2025-06
DOI
10.1145/3743144
URI
https://scholarworks.unist.ac.kr/handle/201301/87522
Citation
JOURNAL OF DATA AND INFORMATION QUALITY, v.17, no.2, pp.1 - 36
Abstract
Real-life business process event logs may suffer from significant data quality problems negatively influencing process mining analysis. Over time, a range of approaches has been developed to detect and repair these quality problems. Validation of these approaches tends to be challenging due to the lack of a ground truth. Moreover, the identification and definition of event log quality problems have been tackled mainly through a pattern-based approach, with systematic and extensible methods currently lacking. In this article, we present FLAWD, a formal language for describing event log data quality issues that enables solutions addressing the shortcomings of process mining data quality research identified above. FLAWD can be used to formally describe and possibly reason over event log data quality errors, as well as to guide the development of tools for controlled and sophisticated “polluting” of event logs through which benchmark datasets may be systematically created. We present the abstract syntax grammar of FLAWD and an open-source software tool based on it that allows for the insertion of all so-called event log imperfection patterns in a stochastic manner. We show how FLAWD has been used in our research to generate benchmark datasets and how it can be used to formally describe and replicate a range of errors found in real-life event logs.
Publisher
ASSOC COMPUTING MACHINERY
ISSN
1936-1955
