Vasileios Mezaris
Electrical and Computer Engineer, Ph.D.

Support Vector Machine with Gaussian Sample Uncertainty (SVM-GSU). This is a novel maximum margin classifier that deals with uncertainty in data input; i.e., it allows each training example to be modeled as a multi-dimensional Gaussian distribution described by its mean vector and covariance matrix (the latter modeling the uncertainty), and defines a cost function that exploits the covariance information. Experimental results verify the effectiveness of this approach in various learning problems.
- Related publications [pami17]
- Software package [download source code]
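
As a rough illustration of how a cost function can exploit the covariance information, the sketch below computes the expected hinge loss of a linear classifier when a training example is Gaussian rather than a single point. It assumes a linear decision function w·x + b and uses the standard closed form for the mean of a rectified Gaussian. The function and argument names are hypothetical; this is not code from the software package and may differ from its actual cost function.

```python
import math

def expected_hinge_loss(w, b, mean, cov, label):
    """Expected hinge loss E[max(0, 1 - y (w.x + b))] for x ~ N(mean, cov).

    Hypothetical helper for illustration only; label y must be +1 or -1.
    """
    n = len(w)
    # The quantity 1 - y (w.x + b) is Gaussian with mean d and std. dev. s.
    d = 1.0 - label * (sum(wi * mi for wi, mi in zip(w, mean)) + b)
    var = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    s = math.sqrt(max(var, 0.0))
    if s == 0.0:
        # No uncertainty: fall back to the ordinary hinge loss.
        return max(0.0, d)
    z = d / s
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal PDF
    return d * cdf + s * pdf

# A point that is safely classified (ordinary hinge loss 0) still incurs
# a positive expected loss once its uncertainty is taken into account.
w, b = [1.0, 0.0], 0.0
certain = expected_hinge_loss(w, b, [2.0, 0.0], [[0.0, 0.0], [0.0, 0.0]], +1)
uncertain = expected_hinge_loss(w, b, [2.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], +1)
```

With zero covariance the function reduces to the usual hinge loss, so examples whose uncertainty is unknown can be handled in the same way as in a standard SVM.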

Incremental Accelerated Kernel Discriminant Analysis. This is a novel incremental dimensionality reduction (DR) technique that offers excellent numerical stability and is specifically designed for incremental learning problems. Coupled with a linear SVM (LSVM) classifier, it offers state-of-the-art classification accuracy and a substantial training-time speedup over batch AKDA (the current state of the art), as well as over traditional LSVM and kernel SVM (KSVM) methods.
- Related publications [mm17a]
- Software package [download software (~40MB zip file)]

InVID Verification Plugin. This web-browser plugin was developed as part of the InVID EU project to help journalists verify videos on social networks. It lets users quickly fragment videos from various platforms (Facebook, Instagram, YouTube, Twitter, Dailymotion) and extract keyframes, perform reverse image search on the Google, Baidu or Yandex search engines, collect contextual information for Facebook and YouTube videos, enhance and explore keyframes through a magnifying lens, perform advanced queries on Twitter, and apply forensic filters to still images.
- Related publications [mm17b]
- Software package [download software]

Accelerated Kernel Subclass Discriminant Analysis and SVM combination: An efficient dimensionality reduction and classification method for very high-dimensional data. AKSDA is a new GPU-accelerated, state-of-the-art C++ library for supervised dimensionality reduction and classification, using multiple kernels. It greatly reduces the dimensionality of the input data while at the same time increasing their linear separability. Used in conjunction with linear SVMs, it achieves state-of-the-art classification results, consistently higher than those of kernel SVM approaches, with training times that are orders of magnitude shorter. AKSDA builds on our previous MSDA/GSDA/KMSDA methods.
- Related publications [mm15] [mm16]
(see also [tnnls13], [spl11] for more theoretical foundations)
- Software package [download software]

Image aesthetic quality assessment tools. This is a Matlab implementation of the feature extraction process for our Image Aesthetic Quality assessment method. Each image is represented according to a set of photographic rules, and five feature vectors are extracted, describing the image's simplicity, colorfulness, sharpness, pattern and composition.
- Related publications [icip15]
- Software package [download software]

Real-time video shot and scene segmentation. Software for the automatic temporal segmentation of videos into shots and scenes. The released software detects both abrupt and gradual shot transitions with high accuracy by jointly examining global and local image descriptors, and then groups the shots into scenes. With the CPU version of the software, the whole video analysis process runs more than seven times faster than real time on an i7 PC. A GPU version is also available.
- Related publications [csvt11] [icassp14]
- Software package (v1.4.4, updated 10/4/2017) [download software]
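
To give a feel for what detecting an abrupt shot transition involves, the sketch below thresholds the dissimilarity between consecutive frames. It is a deliberately simplified stand-in, not the released software's method: it uses only a global descriptor (a normalized histogram per frame), ignores local descriptors and gradual transitions, and the function name and threshold value are assumptions made for illustration.

```python
def detect_abrupt_cuts(histograms, threshold=0.5):
    """Return indices i where a cut is declared between frame i-1 and frame i.

    `histograms` is a list of per-frame histograms, each normalized to sum to 1.
    Hypothetical helper; the threshold value is an arbitrary illustrative choice.
    """
    cuts = []
    for i in range(1, len(histograms)):
        prev, curr = histograms[i - 1], histograms[i]
        # Half the L1 distance between two normalized histograms lies in [0, 1].
        dissimilarity = 0.5 * sum(abs(p - c) for p, c in zip(prev, curr))
        if dissimilarity > threshold:
            cuts.append(i)
    return cuts

# Toy two-bin histograms: frames 0-1 look alike, frames 2-3 look alike,
# and the content changes sharply between frame 1 and frame 2.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
cuts = detect_abrupt_cuts(frames)  # [2]
```

A single global threshold like this one is easily fooled by camera flashes or fast motion, which is one reason practical detectors combine several descriptors.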

Mixture Subclass Discriminant Analysis (MSDA) & Generalized Subclass Discriminant Analysis / Kernel Mixture Subclass Discriminant Analysis (GSDA/KMSDA) software. MSDA is a dimensionality reduction method that alleviates two shortcomings of Subclass Discriminant Analysis (SDA). In short, MSDA modifies the objective function of SDA and uses a partitioning procedure to help discriminate data with a Gaussian homoscedastic subclass structure. KMSDA is the kernel extension of MSDA; GSDA is a faster version of KMSDA. We provide, for non-commercial use, Matlab implementations of MSDA and of GSDA/KMSDA.
- Related publications [tnnls13] [spl11]
- MSDA software package [download source code]
- GSDA/KMSDA software package [download source code]

GPU-accelerated LIBSVM. We have developed an open source package for GPU-assisted training of Support Vector Machines (SVMs), based on the LIBSVM package. Kernel SVMs have gained wide acceptance in many fields of science due to their accuracy. However, depending on the amount and nature of the training data, as well as on the cross-validation approach followed for optimizing the SVM parameters, training SVMs can in practice become prohibitively slow. Our modifications to the well-known LIBSVM package revolve around porting the computation of the kernel matrix elements to the GPU, so as to significantly decrease the processing time for SVM training without altering the classification results compared to the original LIBSVM.
- Related publication [wiamis11]
- GPU-LIBSVM package (updated Oct. 2013) [download source code]
- Short video (demo) [watch video]
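
For readers unfamiliar with the kernel matrix mentioned above, the sketch below computes it on the CPU for the RBF kernel, exp(-gamma * ||u - v||^2), which is LIBSVM's default kernel type. Every entry depends on only one pair of training samples, so the entries can be computed independently of one another, which is what makes this step a natural candidate for GPU execution. This is a plain Python reference written for illustration, not code from the GPU-LIBSVM package; the function name is hypothetical.

```python
import math

def rbf_kernel_matrix(samples, gamma):
    """Return the n x n matrix K with K[i][j] = exp(-gamma * ||x_i - x_j||^2).

    Hypothetical CPU reference for illustration only.
    """
    n = len(samples)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # The matrix is symmetric, so only the upper triangle is computed.
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            kernel[i][j] = kernel[j][i] = math.exp(-gamma * sq_dist)
    return kernel

# Three 2-D training samples; the cost grows quadratically with their number.
K = rbf_kernel_matrix([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]], gamma=0.5)
```

Because the number of entries grows with the square of the number of training samples, and cross-validation repeats the training many times, this computation tends to dominate the running time on large datasets.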

Scene segmentation evaluation utility. We have developed a unidimensional measure for evaluating the quality of the temporal segmentation of video into scenes. The measure is called Differential Edit Distance (DED). DED satisfies the metric properties, and is shown to be fast and effective both in evaluating the results of scene segmentation methods and in helping to optimize the parameters of such methods.
- Related publication [csvt12]
- Scene segmentation evaluation utility [download software]

Annotated dataset for sub-shot segmentation evaluation. We provide a video dataset with ground-truth sub-shot segmentation for 33 single-shot videos, for the purpose of evaluating algorithms that perform temporal segmentation of video into sub-shots. The ground-truth segmentation was created by human annotators. Overall, our dataset contains 674 sub-shot transitions.
- Related publication [mmm18]
- Dataset [download dataset]

Concept detection scores for IACC.3. We provide a complete set of concept detection scores, generated with our state-of-the-art concept detectors, for the videos of the IACC.3 dataset used in the TRECVID AVS Task from 2016 onward (600 hr of internet archive videos). Concept detection scores for 1345 concepts (1000 ImageNet concepts and 345 TRECVID SIN concepts) have been generated using two different methods.
- Related publications [mm16] [mmm17]
- Dataset [download dataset]

Concept detection scores for MED16train. We provide concept detection scores for the MED16train dataset, which is used in the TRECVID Multimedia Event Detection (MED) task. Scores for two concept sets (487 sports-related concepts of the YouTube Sports-1M Dataset, and 345 TRECVID SIN concepts) have been generated for video keyframes, at a temporal sampling rate of 2 keyframes per second.
- Related publications [trecvid16] [mmm17]
- Dataset [download dataset]

The CERTH-ITI-VAQ700 dataset. We provide a comprehensive video dataset for the problem of aesthetic quality assessment of user-generated video. The dataset includes i) 700 videos in .mp4 format, ii) video features that we extracted and that are suitable for aesthetic quality assessment, and iii) ground-truth aesthetics annotations (both the annotations of each individual annotator and the consensus results), generated by 5 annotators per video.
- Related publication [icip16]
- Dataset [download dataset]

The 2015 and 2014 Synchronization of Multi-User Event Media datasets. The datasets, ground-truth time-synchronization results and corresponding evaluation script that were created and used in the 2015 and 2014 editions of the Synchronization of Multi-User Event Media (SEM) task of the MediaEval benchmarking activity are available for download. The datasets comprise multiple galleries of images, videos and audio recordings per event, for several real-life events attended by multiple individuals.
- Related publications [mediaeval14] [mediaeval15]
- SEM2014 dataset [download dataset]
- SEM2015 dataset [download dataset]

The 2014 Social Event Detection dataset. The dataset, challenge definitions, ground truth challenge results and corresponding evaluation script that were created and used in the 2014 edition of the Social Event Detection (SED) task of the MediaEval benchmarking activity are available for download. The SED2014 dataset has two parts, the first containing 362,578 images belonging to 17,834 events, and the second containing 110,541 images.
- Related publication [mediaeval14]
- SED2014 dataset [download dataset]

The CERTH Image Blur Dataset. The dataset consists of 2450 digital images (1219 undistorted, 631 naturally-blurred and 600 artificially-blurred images), divided into a training set and an evaluation set. This dataset is used for the evaluation of image quality assessment methods (specifically, methods detecting blurring in images). An important feature of the dataset is that it also contains naturally-blurred images, rather than only artificially-blurred ones.
- Related publication [icip14]
- Dataset [download dataset]

The 2012 Social Event Detection dataset. The dataset (167,332 images), challenge definitions, ground truth challenge results and evaluation script that were created and used in the 2012 edition of the Social Event Detection (SED) task of the MediaEval international benchmarking activity are available for download. The SED task requires participants to discover social events and detect related media items in a collection of images accompanied by metadata typically found on the social web (including time-stamps, tags, and, for a small subset of the images, geotags). Finding the events, in this task, means finding a set of photo clusters, each cluster comprising only photos associated with a single event (thus, each cluster defining a retrieved event). The dataset and the related materials are publicly available for non-commercial use.
- Related publication [mediaeval12]
- SED2012 dataset [download dataset]

SCEF image dataset for spatial context evaluation. A dataset used in the evaluation of object-level spatial context techniques is available for download. It comprises 922 outdoor images of various semantic categories, annotated at the region level with 10 different concepts. The materials available for download include a) the images, b) the image segmentation masks, c) region-level manual image annotations, d) a set of extracted low-level features, e) a set of computed fuzzy directional spatial relations, and f) a set of region classification results based solely on visual information. The dataset is publicly available for non-commercial use.
- Related publications [wiamis09] [cviu11]
- Image dataset [download dataset]

© 2015 Vasileios Mezaris