Institute of Information and Communication Technologies


Ι.Π.ΤΗΛ. Publications

Deep cross-layer activation features for visual recognition

Convolutional Neural Networks (CNNs), which now dominate image analysis tasks, are feed-forward models that capture increasingly complex data structures and patterns in successive hidden layers. However, the common practice of using only the activation features from the last network layer inevitably creates a visual recognition bottleneck, because discriminative features for objects of varying complexity are not necessarily found in the same layer. To address this, a novel frequency-domain analysis of feature maps, both within a single layer and across different layers, is proposed. In this way, the proposed method exploits the knowledge stored in the CNN more efficiently and helps identify the most discriminative features for each individual object type. Experimental results on a large-scale real-world Closed-Circuit Television (CCTV) surveillance dataset and on the PASCAL VOC 2012 dataset demonstrate the efficiency of the proposed approach.
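
The abstract does not state which transform or pooling scheme is used, so the following is only a rough illustration of what a frequency-domain descriptor built from feature maps of several layers could look like. It assumes a 2D discrete Fourier transform, keeps the magnitudes of a few low-frequency coefficients per feature map, and concatenates them across layers. All function names and the parameter `k` are hypothetical and are not taken from the paper.

```python
import cmath
import math


def dft2_magnitude(fmap):
    """Magnitude of the naive 2D DFT of one feature map (a list of rows)."""
    h, w = len(fmap), len(fmap[0])
    mags = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * math.pi * (u * y / h + v * x / w)
                    acc += fmap[y][x] * cmath.exp(1j * angle)
            mags[u][v] = abs(acc)
    return mags


def low_freq_descriptor(fmap, k=2):
    """Top-left k x k spectrum magnitudes, divided by the map size.

    Assumption: k is not larger than the height or width of the map.
    """
    h, w = len(fmap), len(fmap[0])
    mags = dft2_magnitude(fmap)
    return [mags[u][v] / (h * w) for u in range(k) for v in range(k)]


def cross_layer_descriptor(layers, k=2):
    """Concatenate per-channel spectral descriptors over several layers.

    layers: list of layers, each a list of channels, each channel a 2D list.
    Layers may have different spatial sizes and channel counts.
    """
    feats = []
    for channels in layers:
        for fmap in channels:
            feats.extend(low_freq_descriptor(fmap, k))
    return feats


if __name__ == "__main__":
    # Toy activations standing in for two CNN layers (hypothetical values).
    shallow = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]  # 2 channels, 4x4
    deep = [[[1.0, 2.0], [3.0, 4.0]] for _ in range(3)]          # 3 channels, 2x2
    descriptor = cross_layer_descriptor([shallow, deep], k=2)
    print(len(descriptor))
```

Because every feature map contributes a fixed number of coefficients regardless of its spatial size, layers of different resolution can be combined into a single vector, which could then be passed to any standard classifier. The naive DFT here is chosen for self-containment only; an FFT routine would be the practical choice for real feature maps.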

