Feature Extraction and Similarity Search in Large Databases (EMPA)

Extraction, analysis and preprocessing of feature vectors, together with nearest neighbor search in (high-dimensional) feature spaces, are the key components of a system that supports sophisticated image retrieval from a large collection of images. These areas traditionally belong to different fields. Image processing and graphics have considered feature extraction mainly for the purpose of compression and transmission, while database research has considered indexing schemes for attribute-based retrieval in relatively low-dimensional spaces. In this proposal for a joint research project, two groups from these two fields (the Database Group around Hans-J. Schek and the Computer Graphics Group around M. Gross) work together to solve the underlying fundamental problems: the evaluation and improvement of existing low-level signal analysis methods (so-called feature extraction methods); the influence of the dimensionality of features on indexing and retrieval cost and on the quality of the result; the development of suitable query languages and querying metaphors that are intuitive for the user; and the adaptation of the so-called relevance feedback technique, commonly used in information retrieval, to refine the result of a previous query.
As a main challenge, we want to explore behavior and algorithms while increasing the number and the variety of features far beyond the usual limits in order to better describe individual notions of similarity. Doing so, however, runs into what is known as the "dimensionality curse": for example, the notion of a nearest neighbor becomes meaningless if the dimensionality is high enough. Nevertheless, given the results from our previous theoretical and practical work with several hundred features, it seems that we will be able to support similarity search at a large scale and for a variety of notions of similarity. We are confident that we can solve this problem once we have jointly investigated the fundamental questions of signal analysis and high-dimensional feature vector organization in the proposed effort.
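
The effect behind the "dimensionality curse" can be illustrated with a small experiment. The sketch below is not part of the proposal; it assumes uniformly distributed random vectors in the unit cube as a stand-in for real image features, and the function name relative_contrast is hypothetical. It measures how much farther the farthest point is from a query than the nearest one, relative to the nearest distance. Under these assumptions, this relative contrast shrinks as the dimensionality grows, which is one way in which a "nearest" neighbor loses its meaning.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for one random query point
    against n_points uniform random points in the unit cube [0, 1]^dim.

    Assumption: uniform random data, not real image feature vectors.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the relative contrast for increasing dimensionality.
    for dim in (2, 10, 100, 500):
        print(dim, round(relative_contrast(dim), 3))
```

Real feature vectors are rarely uniformly distributed, so the effect on actual image collections may be weaker or stronger than in this synthetic setting.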


24 months


SFr. 140.000.- by Dr. K. Simon (EMPA)


Prof. M. Gross
Prof. B. Schiele
Dr. K. Simon

Related Research Area

Multimedia Information Management

Contact: Prof. Dr. H.-J. Schek
