Vasileios Mezaris
Electrical and Computer Engineer, Ph.D.
Accelerated Kernel Subclass Discriminant Analysis and SVM combination: An efficient dimensionality reduction and classification method for very high-dimensional data. AKSDA is a new GPU-accelerated, state-of-the-art C++ library for supervised dimensionality reduction and classification, using multiple kernels. It greatly reduces the dimensionality of the input data while increasing its linear separability. Used in conjunction with linear SVMs, it achieves state-of-the-art classification results, consistently higher than those of Kernel SVM approaches, at orders-of-magnitude shorter training times. AKSDA builds on our previous MSDA/GSDA/KMSDA methods.
Image aesthetic quality assessment tools. This is a Matlab implementation of the feature extraction process for our Image Aesthetic Quality assessment method. Each image is represented according to a set of photographic rules, and five feature vectors are extracted, describing the image's simplicity, colorfulness, sharpness, pattern and composition.
Real-time video shot and scene segmentation. Software for the automatic temporal segmentation of videos into shots and scenes. The released software detects both abrupt and gradual shot transitions with high accuracy, by jointly examining global and local image descriptors, and then groups the shots into scenes. For the CPU version of the software, the whole video analysis process runs more than seven times faster than real-time processing on an i7 PC. A GPU version is also available.
Mixture Subclass Discriminant Analysis (MSDA) & Generalized Subclass Discriminant Analysis / Kernel Mixture Subclass Discriminant Analysis (GSDA/KMSDA) software. MSDA is a dimensionality reduction method that alleviates two shortcomings of Subclass Discriminant Analysis (SDA). In short, MSDA modifies the objective function of SDA and uses a partitioning procedure to help with the discrimination of data with Gaussian homoscedastic subclass structure. KMSDA is the kernel extension of MSDA; GSDA is an accelerated version of KMSDA. We provide, for non-commercial use, an MSDA and a GSDA/KMSDA implementation in Matlab code.
GPU-accelerated LIBSVM. We have developed an open source package for GPU-assisted Support Vector Machine (SVM) training, based on the LIBSVM package. Kernel SVMs have gained wide acceptance in many fields of science, due to their accuracy. However, depending on the amount and nature of the training data, as well as on the cross-validation approach that is followed for optimizing the SVM parameters, the training of SVMs in practice can often become prohibitively slow. The modifications we made to the well-known LIBSVM package revolve around porting the computation of the kernel matrix elements to the GPU, so as to significantly decrease the processing time for SVM training without altering the classification results compared to the original LIBSVM.
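To illustrate why the kernel matrix is a natural target for GPU porting, the following is a minimal CPU-only sketch in plain Python. It is not code from the package. It assumes the RBF kernel as an example, and the function name `rbf_kernel_matrix` and the sample data are hypothetical. The point it shows is that each matrix element depends only on one pair of training samples, so the elements can be computed independently of each other.

```python
import math


def rbf_kernel_matrix(samples, gamma):
    """Return K with K[i][j] = exp(-gamma * ||x_i - x_j||^2).

    Hypothetical helper for illustration only; the RBF kernel is an
    assumed example, not necessarily the kernel used in a given setup.
    """
    n = len(samples)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Each element needs only samples i and j, which is what
            # makes this computation easy to parallelize.
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            value = math.exp(-gamma * sq_dist)
            kernel[i][j] = value
            kernel[j][i] = value  # the kernel matrix is symmetric
    return kernel


# Made-up toy data: three 2-dimensional training samples.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = rbf_kernel_matrix(samples, gamma=0.5)
```

For n training samples this loop evaluates on the order of n squared kernel values, and cross-validation repeats the training many times, which is consistent with the observation above that training can become prohibitively slow on large datasets.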
Scene segmentation evaluation utility. We have developed a unidimensional measure for evaluating the goodness of the temporal segmentation of video into scenes. The developed measure is called Differential Edit Distance (DED). DED satisfies the metric properties, and is shown to be fast and effective in evaluating the results of scene segmentation methods and also in helping to optimize the parameters of such methods.
Concept detection scores for IACC.3. We provide a complete set of concept detection scores, using our state-of-the-art concept detectors, for the videos of the IACC.3 dataset used in the TRECVID AVS Task from 2016 onward (600 hours of Internet Archive videos). Concept detection scores for 1345 concepts (1000 ImageNet concepts and 345 TRECVID SIN concepts) have been generated using two different methods.
Concept detection scores for MED16train. We provide concept detection scores for the MED16train dataset, which is used in the TRECVID Multimedia Event Detection (MED) task. Scores for two concept sets (487 sports-related concepts of the YouTube Sports-1M Dataset, and 345 TRECVID SIN concepts) have been generated for video keyframes, at a temporal sampling rate of 2 keyframes per second.
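As a rough illustration of what a temporal sampling rate of 2 keyframes per second means, the sketch below lists sampling times for a video of a given duration. The function name `keyframe_timestamps` is hypothetical, and the sketch assumes that sampling starts at time 0 and stops before the end of the video; the exact keyframe positions used for the released scores are not stated here.

```python
def keyframe_timestamps(duration_s, rate_per_s=2):
    """Return sampling times in seconds at a fixed rate.

    Hypothetical helper: assumes sampling starts at 0 s and excludes
    the end point of the video.
    """
    count = int(duration_s * rate_per_s)
    return [i / rate_per_s for i in range(count)]


# A 3-second clip sampled at 2 keyframes per second yields 6 keyframes.
times = keyframe_timestamps(3.0)
```

Under these assumptions, a one-minute video contributes 120 keyframes, each of which would carry one score per concept.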
The CERTH-ITI-VAQ700 dataset. We provide a comprehensive video dataset for the problem of aesthetic quality assessment of user-generated video. The dataset includes i) 700 videos in .mp4 format, ii) video features that we extracted, suitable for aesthetic quality assessment, and iii) ground-truth aesthetics annotations (both the individual annotations and the consensus results), generated by 5 annotators per video.
The 2015 and 2014 Synchronization of Multi-User Event Media datasets. The datasets, ground-truth time-synchronization results and corresponding evaluation script that were created and used in the 2015 and 2014 editions of the Synchronization of Multi-User Event Media (SEM) task of the MediaEval benchmarking activity are available for download. The datasets comprise multiple galleries of images, videos and audio recordings per event, for several real-life events attended by multiple individuals.
The 2014 Social Event Detection dataset. The dataset, challenge definitions, ground truth challenge results and corresponding evaluation script that were created and used in the 2014 edition of the Social Event Detection (SED) task of the MediaEval benchmarking activity are available for download. The SED2014 dataset has two parts, the first containing 362,578 images belonging to 17,834 events, and the second containing 110,541 images.
The CERTH Image Blur Dataset. The dataset consists of 2450 digital images (1219 undistorted, 631 naturally-blurred and 600 artificially-blurred images), divided into a training set and an evaluation set. This dataset is used for the evaluation of image quality assessment methods (specifically, methods detecting blurring in images). An important feature of it is that it contains naturally-blurred images as well, rather than only artificially-blurred ones.
The 2012 Social Event Detection dataset. The dataset (167,332 images), challenge definitions, ground truth challenge results and evaluation script that were created and used in the 2012 edition of the Social Event Detection (SED) task of the MediaEval international benchmarking activity are available for download. The SED task requires participants to discover social events and detect related media items in a collection of images that are accompanied by metadata typically found on the social web (including time stamps, tags and, for a small subset of the images, geotags). Finding the events, in this task, means finding a set of photo clusters, each cluster comprising only photos associated with a single event (thus, each cluster defining a retrieved event). The dataset and the related materials are publicly available for non-commercial use.
SCEF image dataset for spatial context evaluation. A dataset used in the evaluation of object-level spatial context techniques is available for download. It comprises 922 outdoor images of various semantic categories, annotated at the region level with 10 different concepts. The materials available for download include a) the images, b) the image segmentation masks, c) region-level manual image annotations, d) a set of extracted low-level features, e) a set of computed fuzzy directional spatial relations, and f) a set of region classification results based solely on visual information. The dataset is publicly available for non-commercial use.
