Extending and Implementing Process Mining Techniques: Improving Data Quality - Prevention and Mitigation
Team
- E/18/010, Abeywickrama A.K.D.A.S., email
- E/18/156, Jayathilake W.A.T.N., email
- E/18/329, Sewwandi D.W.S.N., email
Supervisors
- Prof. Roshan G. Ragel, email
- Dr. Asitha Bandaranayake, email
- Dr. Damayanthi Herath, email
- Prof. Arthur ter Hofstede, email
- Dr. Chathura Ekanayake, email
Table of contents
- Abstract
- Background
- Related works
- Methodology
- Experiment Setup and Implementation
- Results and Analysis
- Conclusion
- Publications
- Links
Abstract
Process mining has become a valuable tool for analyzing business processes. However, the quality of the data used, specifically event logs, significantly impacts the accuracy of the results. This project addresses the challenge of process-data quality by focusing on prevention and mitigation strategies. We leverage the Odigos framework to identify root causes of data quality issues within event logs. Additionally, we contribute to the PraeclarusPDQ framework by developing a software plug-in to analyze these root causes for specific event log imperfection patterns.
Background
Process mining uses event logs, which record process execution information, to discover, analyze, and improve business processes. The quality of these event logs is crucial for obtaining reliable results. Poor data quality, such as missing information or inconsistencies, can lead to inaccurate process models and misleading insights. Event log imperfection patterns categorize common data quality issues encountered in practice, and the issues they describe significantly hinder the effectiveness of process mining.
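As a small illustration of the "missing information" problem described above, the sketch below represents an event log as a list of records and reports which events lack a required attribute. The attribute names (`case_id`, `activity`, `timestamp`) and the function `find_incomplete_events` are assumptions made for this example; they are not taken from the Odigos or PraeclarusPDQ frameworks, and real logs may use different keys.

```python
from typing import Dict, List, Optional

# Hypothetical attribute names; real event logs may use other keys.
REQUIRED_FIELDS = ("case_id", "activity", "timestamp")

Event = Dict[str, Optional[str]]


def find_incomplete_events(log: List[Event]) -> Dict[int, List[str]]:
    """Map the index of each incomplete event to the names of its missing fields."""
    problems: Dict[int, List[str]] = {}
    for index, event in enumerate(log):
        # A field counts as missing when the key is absent, None, or blank.
        missing = [
            field for field in REQUIRED_FIELDS
            if not (event.get(field) or "").strip()
        ]
        if missing:
            problems[index] = missing
    return problems


# Illustrative data only, not taken from a real process.
sample_log: List[Event] = [
    {"case_id": "c1", "activity": "Register", "timestamp": "2024-01-05T09:00:00"},
    {"case_id": "c1", "activity": "", "timestamp": "2024-01-05T09:30:00"},
    {"case_id": None, "activity": "Approve", "timestamp": "2024-01-05T10:00:00"},
    {"case_id": "c2", "activity": "Register"},
]

incomplete = find_incomplete_events(sample_log)
print(incomplete)
```

A check like this only reports that a value is missing; explaining why it is missing is the root-cause question the project addresses.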
Related works
Suriadi Suriadi, Robert Andrews, Arthur H. M. ter Hofstede, and Moe Thandar Wynn identified a set of event log imperfection patterns in “Event log imperfection patterns for process mining” (SAtHW17). Solutions exist for some of these patterns, for example the one presented in “A contextual approach to detecting synonymous and polluted activity labels in process event logs” (StHWS19), but a standardized approach is lacking. The PraeclarusPDQ framework, currently under development, aims to capture various solutions for these imperfection patterns and other process-data quality issues.
Methodology
This project falls under Project Type III: Process-Data Quality: Prevention and Mitigation. We will use the Odigos framework, proposed by Fahame Emamjome, Robert Andrews, Arthur H. M. ter Hofstede, and Hajo A. Reijers in “Alohomora: Unlocking data quality causes through event log context” (EAtHR20) and further explored in “Root-cause analysis of process-data quality problems” (AEtHR22). This framework provides a structured approach for identifying the root causes of process-data quality problems. We will develop a software plug-in for the PraeclarusPDQ framework specifically designed to analyze root causes of chosen event log imperfection patterns.
Experiment Setup and Implementation
- We will utilize sample event logs containing various event log imperfection patterns
- The plug-in will be tested on its ability to identify root causes associated with these patterns within the event logs
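The testing idea above can be sketched as follows: seed a sample log with a known imperfection and check that a detector flags exactly the seeded spot. This is a minimal illustration, not the project's plug-in, and it uses no PraeclarusPDQ API. The symptom chosen here (several events of one case sharing an identical timestamp, which the imperfection-pattern literature associates with form-based event capture) and the names `shared_timestamp_groups`, `seeded_log`, and `expected` are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Dict[str, str]


def shared_timestamp_groups(log: List[Event]) -> Dict[Tuple[str, str], List[str]]:
    """Return the (case_id, timestamp) pairs shared by two or more events."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for event in log:
        groups[(event["case_id"], event["timestamp"])].append(event["activity"])
    # Keep only the pairs that occur more than once within a case.
    return {key: acts for key, acts in groups.items() if len(acts) > 1}


# Illustrative log: case c1 is seeded with three events at the same instant.
seeded_log: List[Event] = [
    {"case_id": "c1", "activity": "Enter name", "timestamp": "2024-01-05T09:00:00"},
    {"case_id": "c1", "activity": "Enter address", "timestamp": "2024-01-05T09:00:00"},
    {"case_id": "c1", "activity": "Enter phone", "timestamp": "2024-01-05T09:00:00"},
    {"case_id": "c1", "activity": "Approve", "timestamp": "2024-01-05T11:15:00"},
    {"case_id": "c2", "activity": "Enter name", "timestamp": "2024-01-05T09:00:00"},
]

# Ground truth for the seeded imperfection.
expected = {("c1", "2024-01-05T09:00:00")}

flagged = shared_timestamp_groups(seeded_log)
detected = set(flagged)
print("all seeded imperfections detected:", detected == expected)
```

Comparing `detected` with `expected` gives a simple pass/fail signal per sample log. Evaluating root-cause identification would additionally need logs whose causes are known in advance.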
Results and Analysis
Conclusion
This project contributes to improving process-data quality for process mining by leveraging the Odigos framework to identify root causes and developing a PraeclarusPDQ plug-in for targeted analysis of event log imperfection patterns. The project findings will provide valuable insights for preventing and mitigating process-data quality issues, ultimately leading to more reliable process mining results.
Publications
- Semester 7 report
- Semester 7 slides
- Suriadi Suriadi, Robert Andrews, Arthur H. M. ter Hofstede, and Moe Thandar Wynn. “Event log imperfection patterns for process mining: Towards a systematic approach to cleaning event logs.” Information Systems, 64:132–150 (2017). SAtHW17.
- Sareh Sadeghianasl, Arthur H. M. ter Hofstede, Moe Thandar Wynn, and Suriadi Suriadi. “A contextual approach to detecting synonymous and polluted activity labels in process event logs.” (2019). StHWS19.
- Fahame Emamjome, Robert Andrews, Arthur H. M. ter Hofstede, and Hajo A. Reijers. “Alohomora: Unlocking data quality causes through event log context.” (2020). EAtHR20.
- Robert Andrews, Fahame Emamjome, Arthur H. M. ter Hofstede, and Hajo A. Reijers. “Root-cause analysis of process-data quality problems.” Journal of Business Analytics, 5(1):51–75 (2022). AEtHR22.