Next: Bibliography Up: Enhancement of Textual Images Previous: Combining visual and textual


Discussion and Conclusion

With a corpus of only 600 images, our method must be tested on a larger data set in order to refine the results. Other visual attributes, such as texture or shape, could be used. Many criteria and parameters remain to be studied to improve the visual description, such as the influence of the size and shape of the areas of interest. The number of pixels in each local image is a parameter that could be optimized. In this first implementation, it is fixed a priori at 1/4 of the number of pixels of the global image. It would be interesting to compare the performance of the system using smaller, more focused local images, of about 1/16 of the number of pixels of the global image. Indeed, the more the visual features are focused on the relevant areas of the image, and thus the less background noise they include, the more precise the classification should be.
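To make the 1/4 versus 1/16 comparison concrete, the following minimal sketch computes the dimensions of a local image covering a given fraction of the pixels of the global image. The function names, the preservation of the aspect ratio, and the centred placement are assumptions made for illustration only; the placement of the areas of interest in our system is not specified here.

```python
import math


def local_window_size(width, height, fraction):
    """Return (w, h) of a window holding about `fraction` of the pixels.

    Assumption: the window keeps the aspect ratio of the global image,
    so each side is scaled by sqrt(fraction).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    scale = math.sqrt(fraction)
    return max(1, round(width * scale)), max(1, round(height * scale))


def centered_window(width, height, fraction):
    """Return a (left, top, right, bottom) box centred in the image.

    Assumption: centring is only a placeholder for whatever procedure
    actually selects the area of interest.
    """
    w, h = local_window_size(width, height, fraction)
    left = (width - w) // 2
    top = (height - h) // 2
    return left, top, left + w, top + h
```

Under these assumptions, a fraction of 1/4 halves each side of the image, while a fraction of 1/16 divides each side by four.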

Moreover, describing images by local areas of interest may be more beneficial for some types of images than for others. An automatic method for determining whether an image is of such a type would increase the performance of the system. A possible extension of this study would thus consist of adaptively computing the size of the local images according to a measure of the edge density of the image or an entropy criterion. Indeed, the more information the image is expected to contain, the more numerous, but the smaller, the local images can be.
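One possible form of the entropy criterion is sketched below: the Shannon entropy of the grey-level histogram decides between the two local image sizes discussed above. This is only an illustration under stated assumptions. The function names, the flat list of grey levels as input, and the threshold of 4 bits are hypothetical and have not been evaluated.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of grey levels."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    n = len(pixels)
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return max(0.0, entropy)  # avoid returning -0.0 for a constant image


def choose_fraction(pixels, threshold_bits=4.0):
    """Pick the local image size from the entropy of the global image.

    Assumption: images above the (hypothetical) threshold are treated as
    information-rich and get smaller local images (1/16 of the pixels);
    the others keep the default of 1/4.
    """
    return 1 / 16 if shannon_entropy(pixels) > threshold_bits else 1 / 4
```

A measure of edge density could replace `shannon_entropy` in the same decision rule without changing the rest of the sketch.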

Figure 4: The system is expected to be an efficient filter for image search results.

We presented a simple system for unifying textual and visual information. We showed that visual information reduces the error of thesaurus-free textual information by about 50%, which is very promising given the simplicity of the method. Our system can be added as a fast visual filter (see Figure 4) on the results of an image query to a search engine (such as Google) made with a small number of keywords, and thus without the use of a thesaurus (otherwise no image is found by the search engine).

We could reverse the experiment by comparing the textual indices against visual classes. This would make it possible to correct poor textual indexing using the visual content. For example, if an image of a statistical plot of the working population were automatically labelled with `woman' and `worker', a comparison with the visual classes representing `woman' would reveal the indexing error. The word `woman' could then be removed automatically from the keyword set of this image.
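The correction step could look like the following minimal sketch. It assumes that a distance between the image and the visual class of each keyword has already been computed by some means. The function name, the dictionary of distances, and the threshold are all hypothetical.

```python
def prune_keywords(keywords, visual_distance, max_distance):
    """Drop keywords whose visual class is too far from the image.

    keywords        -- keywords attached to the image by textual indexing
    visual_distance -- hypothetical mapping: keyword -> distance between
                       the image and the visual class of that keyword
    max_distance    -- hypothetical rejection threshold

    Keywords with no known visual class are kept unchanged.
    """
    return [
        k for k in keywords
        if k not in visual_distance or visual_distance[k] <= max_distance
    ]


# Illustrative values only: the plot image is far from the `woman' class.
labels = prune_keywords(
    ["woman", "worker"],
    {"woman": 0.9, "worker": 0.2},
    max_distance=0.5,
)
```

With these illustrative distances, only `worker' remains in the keyword set.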


Tollari Sabrina 2003-08-28