Offered Theses
Necessity and Sufficiency in Explainable AI Methods
- Type: Master Thesis Business Information Systems
- Status: offered
- Tutor:
Abstract
The literature on artificial intelligence (AI) explanations comprises two primary explanation methods: attribution-based and counterfactual-based. The two approaches optimize different criteria for good explanations, namely necessity and sufficiency: counterfactual methods elicit necessary features, whereas feature-attribution methods focus on sufficient feature values. Mothilal et al. (2021) propose a framework that unifies both methods and evaluates the different approaches with respect to these two criteria. Research into metrics for evaluating explanations is relevant because, unlike most prediction and classification tasks, there is no ground truth against which the correctness or quality of an explanation can be judged. Mothilal et al. (2021) rely on three datasets from the credit-scoring domain and a case study on hospital admission to test their framework. We intend to build on this study and examine whether the results presented by Mothilal et al. (2021) transfer to other datasets and explanation techniques.
This Master's thesis project builds on the study by Mothilal et al. (2021) by reviewing recent attribution-based and counterfactual-based explanation methods from the literature, applying them to a new selection of datasets from the medical domain, and evaluating whether these more recent approaches to AI explainability better fulfill the criteria of necessity and sufficiency.
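To make the two criteria concrete, the following is a minimal sketch, not the framework of Mothilal et al. (2021) and not their DiCE library, of how necessity and sufficiency can be probed for a single feature of a scikit-learn classifier. Necessity is approximated by how often resampling that feature alone flips the prediction; sufficiency by how often fixing that feature while resampling all other features preserves the prediction. The dataset choice, function names, and Monte Carlo sampling scheme are illustrative assumptions only.

```python
# Illustrative sketch of necessity and sufficiency probes for one feature.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

def necessity(model, X, x, j, n_samples=200):
    """How often does resampling feature j alone flip the prediction for x?"""
    base = model.predict(x.reshape(1, -1))[0]
    perturbed = np.tile(x, (n_samples, 1))
    perturbed[:, j] = rng.choice(X[:, j], size=n_samples)  # vary only feature j
    return np.mean(model.predict(perturbed) != base)

def sufficiency(model, X, x, j, n_samples=200):
    """How often does keeping feature j fixed preserve the prediction
    while all other features are resampled from the data?"""
    base = model.predict(x.reshape(1, -1))[0]
    perturbed = X[rng.integers(0, len(X), size=n_samples)].copy()
    perturbed[:, j] = x[j]  # hold feature j at its original value
    return np.mean(model.predict(perturbed) == base)

x = X[0]
for j in range(3):
    print(f"feature {j}: necessity={necessity(model, X, x, j):.2f}, "
          f"sufficiency={sufficiency(model, X, x, j):.2f}")
```

In the thesis itself, such simplified probes would be replaced by the outputs of the explanation methods under study, for example attribution scores such as SHAP values and counterfactual examples such as those generated by DiCE, evaluated against the necessity and sufficiency criteria.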
Reference:
- Mothilal, R. K., Mahajan, D., Tan, C., & Sharma, A. (2021, July). Towards unifying feature attribution and counterfactual explanations: Different means to the same end. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (pp. 652–663).