Institute of Computer Science PhD student Ilya Verenich defended his thesis on February 11. Here is an overview of his work. 

Context. Modern enterprise systems collect large amounts of data about the execution of the business processes they support. The widespread availability of such data in companies, combined with advances in machine learning, has led to the emergence of data-driven and predictive approaches to monitoring the performance of business processes. By using such predictive process monitoring approaches, potential performance issues can be anticipated and addressed before they actually occur.

Motivation. Most existing approaches for predictive monitoring prioritize accuracy of predictions over explainability (or interpretability). Yet in practice, explainability is a critical property of predictive methods since it is not enough to accurately predict that a running process instance will end up in an undesired outcome. It is also important for people to understand why this prediction is made and what can be done to prevent this undesired outcome.

Contribution. This thesis proposes two methods to build predictive models to monitor business processes in an explainable manner. This is achieved by decomposing a prediction into its elementary components. For example, to explain that the remaining execution time of a running process instance is predicted to be 20 hours, we decompose this prediction into the predicted execution time of each activity that is expected to be executed. We evaluate the proposed methods against each other and against methods from the related work, using several business processes from multiple domains. The evaluation shows a fundamental trade-off between explainability and accuracy of predictions.
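
The decomposition idea can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the class and function names, the activity names, and the per-activity hours are all hypothetical, chosen only so that the parts add up to the 20 hours used in the example above. The sketch assumes that some upstream model has already predicted which activities remain and how long each will take.

```python
from dataclasses import dataclass


@dataclass
class ActivityEstimate:
    """Hypothetical container: one upcoming activity and its predicted duration."""
    name: str
    hours: float


def explain_remaining_time(estimates):
    """Sum per-activity predictions and report each activity's share of the total."""
    total = sum(e.hours for e in estimates)
    breakdown = [
        (e.name, e.hours, e.hours / total if total else 0.0)
        for e in estimates
    ]
    return total, breakdown


# Illustrative values only; they are not results from the thesis.
estimates = [
    ActivityEstimate("Check application", 2.0),
    ActivityEstimate("Assess risk", 12.0),
    ActivityEstimate("Notify customer", 6.0),
]

total, breakdown = explain_remaining_time(estimates)
print(f"Predicted remaining time: {total} hours")
for name, hours, share in breakdown:
    print(f"  {name}: {hours} hours ({share:.0%} of the total)")
```

Reporting the breakdown alongside the total is what makes the prediction explainable in this sense: a user can see which upcoming activity contributes most to the predicted remaining time and act on it.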

How can people use it? The research contributions of the thesis have been integrated into an open-source web-based process analytics platform called Apromore. The platform can be used to train predictive models using the methods described in this thesis, as well as third-party methods. These models are then used at runtime to make predictions for ongoing process instances, and the predictions are visualized in a dashboard.

Ilya Verenich’s thesis can be found here.