Extending and Implementing Process Mining Techniques Improving Data Quality - Prevention and Mitigation

Team

Supervisors

Table of contents

  1. Abstract
  2. Background
  3. Methodology
  4. Experiment Setup and Implementation
  5. Results and Analysis
  6. Conclusion
  7. Publications
  8. Links

Abstract

Process mining has become a valuable tool for analyzing business processes. However, the quality of the data used, specifically event logs, significantly impacts the accuracy of the results. This project addresses the challenge of process-data quality by focusing on prevention and mitigation strategies. We leverage the Odigos framework to identify root causes of data quality issues within event logs. Additionally, we contribute to the PraeclarusPDQ framework by developing a software plug-in to analyze these root causes for specific event log imperfection patterns.

Background

Process mining uses event logs, which record process execution information, to discover, analyze, and improve business processes. The quality of these event logs is crucial for obtaining reliable results. Poor data quality, such as missing information or inconsistencies, can lead to inaccurate process models and misleading insights. Event log imperfection patterns categorize common data quality issues encountered in practice. The issues these patterns describe significantly hinder the effectiveness of process mining.
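To make "missing information or inconsistencies" concrete, the following is a minimal sketch of two basic checks on a toy event log. The log contents, the attribute names (`case`, `activity`, `timestamp`), and the helper functions `find_missing` and `find_out_of_order` are assumptions made for illustration; they are not part of Odigos or PraeclarusPDQ.

```python
from datetime import datetime

# Hypothetical toy event log: one dict per event (case id, activity label, timestamp).
EVENT_LOG = [
    {"case": "c1", "activity": "Register", "timestamp": "2024-01-01T09:00:00"},
    {"case": "c1", "activity": "Triage", "timestamp": "2024-01-01T08:30:00"},  # earlier than previous event
    {"case": "c1", "activity": None, "timestamp": "2024-01-01T10:00:00"},      # missing activity label
    {"case": "c2", "activity": "Register", "timestamp": None},                 # missing timestamp
    {"case": "c2", "activity": "Discharge", "timestamp": "2024-01-02T11:00:00"},
]

# Attributes assumed to be mandatory for every event in this sketch.
REQUIRED = ("case", "activity", "timestamp")


def find_missing(log):
    """Return (event index, attribute) pairs for every empty required attribute."""
    return [
        (i, attr)
        for i, event in enumerate(log)
        for attr in REQUIRED
        if not event.get(attr)
    ]


def find_out_of_order(log):
    """Return indices of events recorded with a timestamp earlier than the
    previously recorded (timestamped) event of the same case."""
    last_seen = {}
    flagged = []
    for i, event in enumerate(log):
        if not event.get("case") or not event.get("timestamp"):
            continue  # cannot compare events that lack a case id or timestamp
        ts = datetime.fromisoformat(event["timestamp"])
        prev = last_seen.get(event["case"])
        if prev is not None and ts < prev:
            flagged.append(i)
        last_seen[event["case"]] = ts
    return flagged


if __name__ == "__main__":
    print("missing values:", find_missing(EVENT_LOG))
    print("out-of-order events:", find_out_of_order(EVENT_LOG))
```

Checks like these only detect symptoms. Whether an out-of-order timestamp comes from a data entry habit, a system default, or a batch import is the kind of root-cause question the project addresses.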

Suriadi Suriadi, Robert Andrews, Arthur H. M. ter Hofstede, and Moe Thandar Wynn identified a set of event log imperfection patterns in “Event log imperfection patterns for process mining” (SAtHW17). Solutions exist for some patterns, for example the one presented in “A contextual approach to detecting synonymous and polluted activity labels in process event logs” (StHWS19), but a standardized approach is lacking. The PraeclarusPDQ framework, currently under development, aims to capture various solutions for these imperfection patterns and other process-data quality issues.
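As a rough illustration of what detecting a label-related imperfection can look like, the sketch below flags activity labels that differ only in case, spacing, or small typos. It relies on plain string similarity from Python's standard `difflib` module, which is a deliberately simpler technique than the contextual approach cited above; the function names, the sample labels, and the 0.85 threshold are assumptions chosen for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalise(label):
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def similar_label_pairs(labels, threshold=0.85):
    """Return pairs of distinct labels whose normalised forms are nearly identical.

    The threshold is an arbitrary value for this sketch and would need tuning
    on real event logs.
    """
    distinct = sorted(set(labels))
    pairs = []
    for a, b in combinations(distinct, 2):
        score = SequenceMatcher(None, normalise(a), normalise(b)).ratio()
        if score >= threshold:
            pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    # Hypothetical activity labels as they might appear in an event log.
    sample = ["Check Invoice", "check  invoice", "Chek Invoice", "Approve Payment"]
    for a, b in similar_label_pairs(sample):
        print(f"possible duplicate labels: {a!r} / {b!r}")
```

Surface similarity cannot find labels that mean the same thing but are spelled differently, which is one reason context-aware detection methods are of interest.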

Methodology

This project falls under Project Type III: Process-Data Quality: Prevention and Mitigation. We will use the Odigos framework, proposed by Fahame Emamjome, Robert Andrews, Arthur H. M. ter Hofstede, and Hajo A. Reijers in “Alohomora: Unlocking data quality causes through event log context” (EAtHR20) and further explored in “Root-cause analysis of process-data quality problems” (AEtHR22). This framework provides a structured approach for identifying the root causes of process-data quality problems. We will develop a software plug-in for the PraeclarusPDQ framework specifically designed to analyze the root causes of chosen event log imperfection patterns.

Experiment Setup and Implementation

Results and Analysis

Conclusion

This project contributes to improving process-data quality for process mining by leveraging the Odigos framework to identify root causes and developing a PraeclarusPDQ plug-in for targeted analysis of event log imperfection patterns. The project findings will provide valuable insights for preventing and mitigating process-data quality issues, ultimately leading to more reliable process mining results.

Publications

  1. Semester 7 report
  2. Semester 7 slides
  3. Suriadi Suriadi, Robert Andrews, Arthur H. M. ter Hofstede, and Moe Thandar Wynn. “Event log imperfection patterns for process mining: Towards a systematic approach to cleaning event logs.” Information Systems, 64:132–150 (2017). SAtHW17.

  4. Sareh Sadeghianasl, Arthur H. M. ter Hofstede, Moe Thandar Wynn, and Suriadi Suriadi. “A contextual approach to detecting synonymous and polluted activity labels in process event logs.” (2019). StHWS19.

  5. Fahame Emamjome, Robert Andrews, Arthur H. M. ter Hofstede, and Hajo A. Reijers. “Alohomora: Unlocking data quality causes through event log context.” (2020). EAtHR20.

  6. Robert Andrews, Fahame Emamjome, Arthur H. M. ter Hofstede, and Hajo A. Reijers. “Root-cause analysis of process-data quality problems.” Journal of Business Analytics, 5(5):51–75 (2022). AEtHR22.