Saliency Prediction

Figure: Predictions of a low-level model (simple intensity-contrast features; blue) and a high-level model (DeepGaze II; red) of human fixations (Kümmerer et al., 2017).
Humans do not perceive their entire field of view at uniform resolution. Instead, a small central region of the retina, the fovea, provides the highest acuity, while peripheral vision is much coarser. As a consequence, humans make eye movements to sample their visual environment at locations that appear relevant. Learning which image properties are associated with human gaze placement matters both for understanding how biological systems explore their environment and for computer vision applications. By building fixation-prediction models that exploit recent advances in deep learning, we aim to understand which image features drive fixation placement.

Unifying Saliency Metrics

There are many saliency models that try to predict where people look. To understand what actually drives fixations, it is important to compare the predictive power of these models. Benchmarking saliency models is difficult, however, because many competing and often disagreeing metrics are in use: a model that scores well on some metrics can perform poorly on others. This hinders progress, since it makes assessing the state of the art nearly impossible. We work on understanding what causes the differences between existing metrics, and we show that most inconsistencies in saliency benchmarking can be avoided.
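One source of such disagreement can be sketched with two widely used metrics, NSS (Normalized Scanpath Saliency) and AUC. The toy implementations and the random data below are illustrative assumptions, not the benchmark code: they show that a strictly monotone transformation of the same saliency map (here, squaring) leaves AUC unchanged while altering NSS, so the "best" map depends on the metric.

```python
import numpy as np

def nss(saliency, fixations):
    """NSS: mean of the z-scored saliency map at the fixated pixels."""
    z = (saliency - saliency.mean()) / saliency.std()
    return z[fixations].mean()

def auc(saliency, fixations):
    """AUC: probability that a fixated pixel outscores a non-fixated one
    (rank-based; ties count half)."""
    pos = saliency[fixations]
    mask = np.ones(saliency.shape, dtype=bool)
    mask[fixations] = False
    neg = saliency[mask]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

# Toy example: a random saliency map and three fixated (row, col) positions.
rng = np.random.default_rng(0)
saliency = rng.random((20, 20))
fix = (np.array([2, 5, 7]), np.array([3, 10, 15]))

# Squaring is monotone on non-negative values, so the ranking of pixels
# (and hence AUC) is unchanged, but the z-scores (and hence NSS) are not.
print("AUC:", auc(saliency, fix), auc(saliency**2, fix))
print("NSS:", nss(saliency, fix), nss(saliency**2, fix))
```

Because each metric implicitly favors a different transformation of the model's prediction, comparing models through a single saliency map confounds the model with the map, which is the separation our benchmarking work makes explicit.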

Key Papers

M. Kümmerer, T. S. Wallis, L. A. Gatys, and M. Bethge
Understanding Low- and High-Level Contributions to Fixation Prediction
The IEEE International Conference on Computer Vision (ICCV), 2017

M. Kümmerer, T. S. A. Wallis, and M. Bethge
Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics
The European Conference on Computer Vision (ECCV), 2018