Explaining Protein-Protein Interaction Predictions with Genetic Programming

Abstract

Explainability is crucial to support the adoption of machine learning as a tool for scientific discovery. In the biomedical domain, ontologies and knowledge graphs offer a unique opportunity to explore domain knowledge, but most knowledge graph-based approaches employ graph embeddings, which are not explainable. However, when the prediction target is a relation between two entities represented in the graph, as in protein-protein interaction prediction, semantic similarity presents itself as a natural explanatory mechanism.

This work uses genetic programming over a set of semantic similarity values, each describing a semantic aspect represented in the knowledge graph, to generate global and interpretable explanations for protein-protein interaction prediction. Our experiments reveal that genetic programming algorithms coupled with semantic similarity produce global models relevant to understanding the biological phenomena.
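
To make the idea of a global, interpretable model concrete, the sketch below shows the kind of expression tree that genetic programming could evolve over per-aspect semantic similarity values. It is an illustration under stated assumptions, not the authors' implementation: the aspect names (BP, CC, MF), the operator set, the example tree, the similarity values, and the 0.5 decision threshold are all hypothetical, and the evolutionary search itself is omitted.

```python
import operator

# Hypothetical operator set that an evolved expression may use.
OPS = {"add": operator.add, "mul": operator.mul, "max": max, "min": min}

def evaluate(tree, sims):
    """Evaluate an expression tree on one protein pair.

    A leaf is an aspect name (looked up in `sims`) or a numeric constant;
    an internal node is a tuple (op_name, left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return sims[tree]
    if isinstance(tree, (int, float)):
        return float(tree)
    op, left, right = tree
    return OPS[op](evaluate(left, sims), evaluate(right, sims))

def to_text(tree):
    """Render the tree as a readable formula, which serves as the explanation."""
    if isinstance(tree, (str, int, float)):
        return str(tree)
    op, left, right = tree
    return f"{op}({to_text(left)}, {to_text(right)})"

def predict(tree, sims, threshold=0.5):
    """Predict an interaction when the combined similarity reaches the threshold.

    The threshold value is an assumption made for this sketch.
    """
    return evaluate(tree, sims) >= threshold

# Hypothetical evolved model and made-up similarity values for one protein pair.
model = ("max", "BP", ("mul", "CC", "MF"))
pair = {"BP": 0.4, "CC": 0.9, "MF": 0.8}

score = evaluate(model, pair)
formula = to_text(model)
interacts = predict(model, pair)
```

Because the same formula is applied to every protein pair, reading it shows which semantic aspects drive the predictions overall, which is the sense in which such a model is global.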