Conference

  1. A. Goulas, D. Galanopoulos, I. Patras, V. Mezaris, "MLLM Frame Subset Ensembling for Audio-Visual Video QA and MLLM-based Reranking for Ad-hoc Video Search in TRECVID 2025," in Proceedings of the TRECVID 2025 Workshop, December 2025.
  2. D. Galanopoulos, A. Goulas, V. Mezaris, "Cross-modal Image Recommendation for News Articles by Multimodal Foundation Models-based Retrieval-Reranking," in Proceedings of the 2025 Multimedia Evaluation Workshop (MediaEval'25), Dublin, Ireland, October 2025.
  3. A. Goulas, V. Mezaris, I. Patras, "VidCtx: Context-aware Video Question Answering with Image Models," in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME 2025), Nantes, France, June-July 2025. https://arxiv.org/abs/2412.17415
  4. D. Galanopoulos, A. Goulas, A. Leventakis, I. Patras, V. Mezaris, "An LLM Framework for Long-form Video Retrieval and Audio-Visual Question Answering Using Qwen2/2.5," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Nashville, TN, USA, June 2025, pp. 3730-3739, doi: 10.1109/CVPRW67362.2025.00358