A Language to Model and Simulate Data Quality Issues in Process Mining

Comuzzi, Marco; Ko, Jonghyeon; Maggi, Fabrizio Maria

doi:10.1145/3743144

Scholarworks@UNIST

Comuzzi, Marco: Intelligent Enterprise Lab.

Full metadata record

DC Field	Value	Language
dc.citation.endPage	36	-
dc.citation.number	2	-
dc.citation.startPage	1	-
dc.citation.title	JOURNAL OF DATA AND INFORMATION QUALITY	-
dc.citation.volume	17	-
dc.contributor.author	Comuzzi, Marco	-
dc.contributor.author	Ko, Jonghyeon	-
dc.contributor.author	Maggi, Fabrizio Maria	-
dc.date.accessioned	2025-07-24T10:00:00Z	-
dc.date.available	2025-07-24T10:00:00Z	-
dc.date.created	2025-07-24	-
dc.date.issued	2025-06	-
dc.description.abstract	Real-life business process event logs may suffer from significant data quality problems negatively influencing process mining analysis. Over time, a range of approaches has been developed to detect and repair these quality problems. Validation of these approaches tends to be challenging due to the lack of a ground truth. Moreover, the identification and definition of event log quality problems have been tackled mainly through a pattern-based approach, with systematic and extensible methods currently lacking. In this article, we present FLAWD, a formal language for describing event log data quality issues that enables solutions addressing the shortcomings of process mining data quality research identified above. FLAWD can be used to formally describe and possibly reason over event log data quality errors, as well as to guide the development of tools for controlled and sophisticated “polluting” of event logs through which benchmark datasets may be systematically created. We present the abstract syntax grammar of FLAWD and an open-source software tool based on it that allows for the insertion of all so-called event log imperfection patterns in a stochastic manner. We show how FLAWD has been used in our research to generate benchmark datasets and how it can be used to formally describe and replicate a range of errors found in real-life event logs.	-
dc.identifier.bibliographicCitation	JOURNAL OF DATA AND INFORMATION QUALITY, v.17, no.2, pp.1 - 36	-
dc.identifier.doi	10.1145/3743144	-
dc.identifier.issn	1936-1955	-
dc.identifier.scopusid	2-s2.0-105009653864	-
dc.identifier.uri	https://scholarworks.unist.ac.kr/handle/201301/87522	-
dc.identifier.wosid	001532034900001	-
dc.language	English	-
dc.publisher	ASSOC COMPUTING MACHINERY	-
dc.title	A Language to Model and Simulate Data Quality Issues in Process Mining	-
dc.type	Article	-
dc.description.isOpenAccess	TRUE	-
dc.type.docType	Article	-
dc.description.journalRegisteredClass	scopus	-
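
The abstract's idea of controlled, stochastic "polluting" of an event log, with the injected errors recorded as ground truth, can be illustrated with a minimal Python sketch. This is not FLAWD syntax and not the authors' tool: the function names `pollute_log` and `repair_with_ground_truth`, the event fields (`case`, `activity`, `timestamp`), and the single error type (a distorted activity label) are all assumptions made here for illustration.

```python
import random


def pollute_log(events, rate, seed=0):
    """Inject distorted activity labels into a copy of an event log.

    Hypothetical helper, not part of FLAWD. `events` is a list of dicts
    with the assumed keys 'case', 'activity', and 'timestamp'. Each event
    is polluted independently with probability `rate`.
    Returns (polluted_events, ground_truth), where ground_truth lists
    (event_index, field_name, original_value) for every injected error.
    """
    rng = random.Random(seed)  # seeded so a benchmark can be regenerated
    polluted, ground_truth = [], []
    for idx, event in enumerate(events):
        new_event = dict(event)  # copy, so the clean log is left untouched
        if rng.random() < rate:
            # Assumed error type: lowercase the label and append "_"
            new_event["activity"] = event["activity"].lower() + "_"
            ground_truth.append((idx, "activity", event["activity"]))
        polluted.append(new_event)
    return polluted, ground_truth


def repair_with_ground_truth(polluted, ground_truth):
    """Undo the injected errors using the recorded ground truth."""
    restored = [dict(event) for event in polluted]
    for idx, field, original in ground_truth:
        restored[idx][field] = original
    return restored
```

Because every injected error is logged, the output of a detection or repair approach run on the polluted log could be compared against `ground_truth`, which is the kind of validation the abstract says is hard to do on real-life logs.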
