The goal of this research project is to design and evaluate a strategy that exploits multimodal information to improve the exploration of image collections by users. Thanks to advances in computer and communication technologies, large amounts of images are widely available in many contexts. The main problem is how to provide efficient and effective access to them. Exploration has been shown to be an effective approach to accessing image collections, with improved performance compared to traditional access mechanisms such as keyword-based search. However, exploration has not yet been widely adopted as an image retrieval mechanism in large-scale systems. This can be attributed to (1) a lack of semantic meaningfulness in both the image representation and its visualization, and (2) limited scalability to large image collections. Considerable work has addressed each of these issues, but this proposal focuses on the first one.
In most cases, images do not come alone: they are accompanied by additional information that can be exploited at each stage of building an image collection exploration system. In this work, the combination of visual and complementary non-visual information is called multimodal information. The main research question addressed in this project is: how can multimodal information be exploited to improve the image collection exploration process? To address this question, this thesis proposes a strategy that involves multimodal information at each stage of the construction of an image collection exploration system. The strategy will be based on techniques from machine learning, information retrieval, and information visualization. An experimental evaluation of the effectiveness and efficiency of the proposed strategy will be performed, focusing on a biomedical image exploration problem.