Evidence-driven image interpretation by combining implicit and explicit knowledge in a Bayesian network
Computer vision techniques have made considerable progress in recognizing object categories by learning models that typically rely on a set of discriminative features. However, a drawback of these models is that, in contrast to human perception, which makes extensive use of logic-based rules, they fail to benefit from knowledge that is provided explicitly. In this manuscript we propose a framework that performs knowledge-assisted analysis of visual content. We use ontologies to model domain knowledge and a set of conditional probabilities to model the application context. A Bayesian network (BN) then integrates statistical and explicit knowledge and performs hypothesis testing using evidence-driven probabilistic inference. Additionally, we propose a Focus of Attention (FoA) mechanism based on the mutual information between concepts. This mechanism selects the most prominent hypotheses to be verified by the BN, thus removing the need to exhaustively test all possible combinations in the hypothesis set. We experimentally evaluate our framework on content from three domains and for three tasks, namely image categorization, localized region labeling, and weak annotation of video shot keyframes. The obtained results demonstrate improved performance compared to a set of baseline concept classifiers that are unaware of any context or domain knowledge. Finally, we demonstrate the ability of the proposed FoA mechanism to significantly reduce the computational cost of visual inference while obtaining results comparable to the exhaustive case.
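The mutual-information-based hypothesis selection underlying such an FoA mechanism can be sketched as follows. This is an illustrative toy example, not the paper's implementation: the concept names, co-occurrence tables, and the choice to keep the top-k candidates are assumptions made for the sketch.

```python
import math
from itertools import product

def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a 2x2 joint probability
    table, where joint[x][y] = P(X=x, Y=y) for x, y in {0, 1}."""
    px = [sum(joint[x]) for x in (0, 1)]                 # marginal P(X)
    py = [joint[0][y] + joint[1][y] for y in (0, 1)]     # marginal P(Y)
    mi = 0.0
    for x, y in product((0, 1), repeat=2):
        p = joint[x][y]
        if p > 0:
            mi += p * math.log2(p / (px[x] * py[y]))
    return mi

# Hypothetical joint presence/absence statistics of candidate concepts
# with an observed evidence concept (e.g. "sky"); values are invented
# purely for illustration.
candidates = {
    "sea":      [[0.50, 0.10], [0.05, 0.35]],  # strongly dependent
    "sand":     [[0.45, 0.15], [0.15, 0.25]],  # moderately dependent
    "building": [[0.30, 0.30], [0.20, 0.20]],  # independent (MI = 0)
}

def focus_of_attention(cands, k=2):
    """Keep only the k hypotheses with the highest mutual information
    to the evidence concept; only these are passed to the BN."""
    ranked = sorted(cands, key=lambda c: mutual_information(cands[c]),
                    reverse=True)
    return ranked[:k]

print(focus_of_attention(candidates))
```

With these illustrative numbers, "sea" and "sand" are retained while the independent "building" hypothesis is pruned, so the BN only evaluates the combinations involving the retained concepts instead of the full hypothesis set.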