EdgeSHAPer: Bond-Centric Shapley Value-Based Explanation Method for Graph Neural Networks

Mastropietro, Andrea, Pasculli, Giuseppe, Feldmann, Christian, Rodriguez Perez, Raquel and Bajorath, Juergen (2022) EdgeSHAPer: Bond-Centric Shapley Value-Based Explanation Method for Graph Neural Networks. iScience. ISSN 2589-0042

Official URL: https://www.cell.com/iscience/

Abstract

Graph neural networks (GNNs) are becoming increasingly popular for many deep machine learning (ML) applications in science. By recursively propagating neural signals along the edges of an input graph, GNNs integrate node feature information with graph structure. One of their attractions is the ability to learn object representations from graphs, which alleviates the need for feature engineering. However, as is the case for other deep neural networks, GNN models are complex and have a notorious black-box character, which works against their acceptance for experimental design in interdisciplinary research settings. Hence, with the advent of deep learning in many scientific areas, increasing attention is also being paid to approaches for explaining ML models and their predictions. For GNNs, however, only a few approaches are currently available to rationalize model decisions. In this work, we introduce EdgeSHAPer, a generally applicable method for explaining any GNN-based model. The approach is specifically devised to assess edge importance for predictions, which is its characteristic feature. EdgeSHAPer makes use of the Shapley value concept from game theory to quantify feature importance for individual predictions. In our proof-of-concept study, it is applied to compound activity prediction, a central task in computational medicinal chemistry and drug discovery. For chemical predictions, EdgeSHAPer’s edge centricity is particularly relevant because edges represent bonds in molecular graphs. In combination with feature mapping, we show that EdgeSHAPer produces meaningful explanations for accurate compound activity predictions, demonstrating that GNN decisions are often centered on bond information. Compared to a popular node-centric method and the only other currently available edge-centric explanation method, EdgeSHAPer achieves higher resolution in differentiating features that determine predictions and identifies minimal pertinent positive feature sets.
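
The abstract describes quantifying edge importance with Shapley values. As a rough illustration of that general idea, the sketch below estimates per-edge Shapley values by sampling random edge orderings and averaging each edge's marginal contribution to a prediction score. It is not the authors' implementation: the function `estimate_edge_shapley`, the stand-in scoring function `toy_model`, the edge list, and all parameter values are hypothetical and chosen only for illustration. In a real setting the scoring function would be a trained GNN evaluated on the graph restricted to the given edges.

```python
import random
from typing import Callable, Dict, FrozenSet, Sequence, Tuple

Edge = Tuple[int, int]


def estimate_edge_shapley(
    edges: Sequence[Edge],
    value_fn: Callable[[FrozenSet[Edge]], float],
    n_samples: int = 500,
    seed: int = 0,
) -> Dict[Edge, float]:
    """Permutation-sampling (Monte Carlo) estimate of per-edge Shapley values."""
    rng = random.Random(seed)
    totals = {edge: 0.0 for edge in edges}
    for _ in range(n_samples):
        # Draw a random order in which edges are added to the graph.
        order = list(edges)
        rng.shuffle(order)
        coalition = set()
        previous = value_fn(frozenset(coalition))
        for edge in order:
            coalition.add(edge)
            current = value_fn(frozenset(coalition))
            # Marginal contribution of this edge given the edges added before it.
            totals[edge] += current - previous
            previous = current
    return {edge: total / n_samples for edge, total in totals.items()}


def toy_model(present: FrozenSet[Edge]) -> float:
    """Hypothetical stand-in for a GNN output on the graph with only `present` edges."""
    score = 0.0
    # Two bonds that matter only when both are present (an interaction).
    if (0, 1) in present and (1, 2) in present:
        score += 1.0
    # One bond that contributes on its own.
    if (2, 3) in present:
        score += 0.5
    return score


# Hypothetical five-atom chain; edge (3, 4) never affects the score.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
phi = estimate_edge_shapley(edges, toy_model)
for edge in edges:
    print(edge, round(phi[edge], 3))
```

In this toy setup the irrelevant edge (3, 4) receives a value of zero, the independent edge (2, 3) receives 0.5, and the two interacting edges share the remaining score, each with an estimate close to 0.5. The estimates always sum to the difference between the score of the full graph and the score of the edgeless graph (here 1.5), because every sampled ordering telescopes to that difference.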

Item Type:	Article
Date Deposited:	17 Sep 2022 00:45
Last Modified:	17 Sep 2022 00:46
URI:	https://oak.novartis.com/id/eprint/47826
