Plain-language description

Various post-hoc interpretability methods exist to evaluate the results of machine learning classification and prediction tasks. To better understand the performance and reliability of such methods, which is particularly necessary in high-risk applications, Turbé et al. have developed a framework for the quantitative comparison of post-hoc interpretability approaches in time-series classification. Thanks to the University of Geneva collaborators Hugues Turbé, Mina Bjelogrcic, and Christian Lovis.

You can read more here: nus-cde
and here: unige

Abstract


Post-hoc interpretability methods are critical tools for explaining neural-network results. Several post-hoc methods have emerged in recent years, but they produce different results when applied to a given task, raising the question of which method is the most suitable for providing accurate post-hoc interpretability. To understand the performance of each method, quantitative evaluation of interpretability methods is essential; however, currently available frameworks have several drawbacks that hinder the adoption of post-hoc interpretability methods, especially in high-risk sectors. In this work we propose a framework with quantitative metrics to assess the performance of existing post-hoc interpretability methods, particularly in time-series classification. We show that the framework addresses several drawbacks identified in the literature, namely the dependence on human judgement, the need for retraining, and the shift in the data distribution that arises when samples are occluded. We also design a synthetic dataset with known discriminative features and tunable complexity. The proposed methodology and quantitative metrics can be used to understand the reliability of interpretability results obtained in practical applications. In turn, they can be embedded within operational workflows in critical fields that require accurate interpretability results, for example, for regulatory policies.
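
To make the occlusion idea in the abstract concrete, the sketch below scores an attribution map for a time-series classifier by masking the time steps it marks as most relevant and measuring the drop in the model's confidence. This is only an illustration of the general perturbation-based evaluation principle, not the metrics defined by Turbé et al.; the function names (`occlusion_faithfulness`, `predict_proba`), the permutation-based masking scheme, and the toy classifier are all assumptions made for the example.

```python
# Hedged sketch of an occlusion-based faithfulness check for a time-series
# classifier. Illustrative only; not the paper's metric.
import numpy as np


def occlusion_faithfulness(predict_proba, x, attribution, target_class,
                           mask_fraction=0.2, rng=None):
    """Mask the top `mask_fraction` most relevant time steps of `x`
    (shape: [timesteps, channels]) and return the drop in the predicted
    probability of `target_class`.

    Occluded values are replaced with values permuted from elsewhere in
    the same series (rather than a constant such as zero) to limit the
    distribution shift that plain occlusion introduces.
    """
    rng = np.random.default_rng(rng)
    t = x.shape[0]
    k = max(1, int(mask_fraction * t))

    # Time steps ranked most relevant by the interpretability method.
    relevance_per_step = np.abs(attribution).sum(axis=-1)
    top_steps = np.argsort(relevance_per_step)[-k:]

    # Replace the selected steps with randomly permuted values from x.
    x_occluded = x.copy()
    donor_steps = rng.choice(t, size=k, replace=True)
    x_occluded[top_steps] = x[donor_steps]

    p_before = predict_proba(x)[target_class]
    p_after = predict_proba(x_occluded)[target_class]
    return p_before - p_after


if __name__ == "__main__":
    # Toy "classifier": a logistic score on the mean of the first channel.
    def predict_proba(x):
        p1 = 1.0 / (1.0 + np.exp(-x[:, 0].mean()))
        return np.array([1.0 - p1, p1])

    # Synthetic series with a known discriminative window, echoing the
    # idea of a dataset whose relevant features are known by construction.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(100, 1))
    x[40:60] += 3.0                      # the truly discriminative window
    attribution = np.zeros_like(x)
    attribution[40:60] = 1.0             # a "good" attribution map

    drop = occlusion_faithfulness(predict_proba, x, attribution, target_class=1)
    print("confidence drop after masking top-relevance steps:", drop)
```

Under this kind of scheme, an attribution map that highlights truly discriminative time steps produces a large confidence drop when those steps are occluded, which gives a quantitative signal without retraining the model or relying on human judgement.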