DIMATEX DATA: Fast Image Auto-annotation with Visual Vector Approximation Clusters

Welcome to the README of DIMATEX DATA

25 July 2005

This web page contains the data used and produced by the image auto-annotation model DIMATEX published in: 
H. Glotin, S. Tollari, Fast Image Auto-annotation with Visual Vector Approximation Clusters, 
International Workshop on Content-Based Multimedia Indexing (CBMI2005), 
Riga, Latvia, June 2005 [PDF] [POSTER] [BIB]

# Corpus Copyright:
COREL images in the demo are copyrighted by Wang J. of Penn State University.
Some pre-processed image signals can be downloaded from Kobus Barnard's web page:
http://vision.cs.arizona.edu/kobus/research/data/jmlr_2003/

Below we provide our normalized lexicon and the auto-annotations from the DIMATEX model.
You are free to use the files below for academic purposes;
if you do, please cite our CBMI 2005 paper [BIB] in your publications.

# Lexicon normalization:
The file "vocabulary_adaptation.txt" contains our normalized COREL lexicon of 267 words,
manually generated at the LSIS lab from the 325 original words.
We mostly normalized plurals and generalized some specific words, such as F-16 to plane.

# Auto-annotations from DIMATEX model:
The files TEST_LIST_WORD_REF_EST_FLAB and TEST_LIST_WORD_REF_EST_FLABT provide
the auto-annotations from the DIMATEX model.

# Demonstration for FLABT DIMATEX experiments:
[DEMO] Login: DIMATEX, Password: DIMATEX

FILES and their FORMAT:

image_name_fig6_CBMI2005.txt
NOVEL_LIST_WORD_REF
TEST_LIST_WORD_REF_EST_FLAB
TEST_LIST_WORD_REF_EST_FLABT
TRAIN_LIST_WORD_REF
vocabulary_adaptation.txt
words_13911

# File vocabulary_adaptation.txt
Gives the vocabulary adaptation: the 267 words manually generated at LSIS from the 325 original words.
We mostly normalized plurals and generalized some specific words, such as F-16 to plane.
File format:
newword1:oldword1
newword2:oldword2
...
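
The format above can be read with a few lines of Python. This is only a minimal sketch: the function name is ours, the sample lines are hypothetical (the real file may use different spellings), and we assume exactly one newword:oldword pair per line with ":" as the only separator.

```python
def parse_vocabulary_adaptation(lines):
    """Map each original word to its normalized word.

    Assumes one 'newword:oldword' pair per line.
    """
    old_to_new = {}
    for raw in lines:
        line = raw.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        new_word, old_word = line.split(":", 1)
        old_to_new[old_word.strip()] = new_word.strip()
    return old_to_new


# Hypothetical sample lines, for illustration only.
sample = ["plane:f-16", "plane:jet", "bird:birds"]
mapping = parse_vocabulary_adaptation(sample)
print(mapping["f-16"])  # plane
```

With the real file, pass an open file handle instead of the sample list, for example parse_vocabulary_adaptation(open("vocabulary_adaptation.txt")).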

# File words_13911
The normalized lexicon used in the CBMI paper.
The line number is the word number.
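
A possible way to load this lexicon in Python is sketched below. We assume that line numbering starts at 1 (which would be consistent with 0 meaning "no word" in the list files) and that each line holds a single word; the function names and the sample words are hypothetical.

```python
def load_lexicon(lines):
    """Return {word_number: word}, where word_number is the line number.

    Assumption: lines are numbered from 1, one word per line.
    """
    lexicon = {}
    for number, raw in enumerate(lines, start=1):
        word = raw.strip()
        if word:
            lexicon[number] = word
    return lexicon


def word_for(lexicon, number):
    """Return the word for a word number, or None for 0 or unknown numbers."""
    return lexicon.get(number)


# Hypothetical sample content, for illustration only.
lexicon = load_lexicon(["sky\n", "water\n", "plane\n"])
print(word_for(lexicon, 3))  # plane
```

With the real file, use load_lexicon(open("words_13911")).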

# File TRAIN_LIST_WORD_REF
Columns
1 	COREL number of the train set image.
2:6 	Word numbers (in our lexicon) of the COREL reference words for each image. 0 means no word.
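
The column layout above can be parsed as in the following sketch. We assume that columns are separated by whitespace and that word numbers are integers; the function name and the sample line are hypothetical.

```python
def parse_ref_line(line):
    """Split one line into (image_number, reference_word_numbers).

    Assumptions: whitespace-separated columns; column 1 is the image
    number, columns 2 to 6 are word numbers, and 0 means "no word".
    """
    fields = line.split()
    image_number = fields[0]
    ref_words = [int(f) for f in fields[1:6] if int(f) != 0]
    return image_number, ref_words


# Hypothetical sample line, for illustration only.
image_number, ref_words = parse_ref_line("1000 12 45 0 0 0")
print(image_number, ref_words)  # 1000 [12, 45]
```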

# File TEST_LIST_WORD_REF_EST_FLAB/TEST_LIST_WORD_REF_EST_FLABT
Columns	
1 	Test set image COREL number.
2:6 	Word numbers (in our lexicon) of the COREL reference words for each image. 0 means no word.
7:16 	The first 10 estimated words, sorted from highest to lowest probability (Method = E1 FLAB, see CBMI05).
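
As an illustration of how these columns might be used, the sketch below parses one line and counts which fraction of the reference words appears among the first k estimated words. This simple score is our own example and is not necessarily the measure reported in the paper. We assume whitespace-separated integer columns; the function names and the sample line are hypothetical.

```python
def parse_test_line(line):
    """Split one line into (image_number, reference_words, estimated_words).

    Assumptions: whitespace-separated columns; columns 2 to 6 are
    reference word numbers (0 = no word), columns 7 to 16 are the
    estimated word numbers in decreasing order of probability.
    """
    fields = line.split()
    image_number = fields[0]
    refs = [int(f) for f in fields[1:6] if int(f) != 0]
    estimates = [int(f) for f in fields[6:16]]
    return image_number, refs, estimates


def fraction_found(refs, estimates, k):
    """Fraction of reference words present in the first k estimates."""
    if not refs:
        return 0.0
    top_k = set(estimates[:k])
    return sum(1 for word in refs if word in top_k) / len(refs)


# Hypothetical sample line, for illustration only.
line = "2001 3 7 0 0 0 7 9 3 1 2 4 5 6 8 10"
image_number, refs, estimates = parse_test_line(line)
print(fraction_found(refs, estimates, 1))  # 0.5
print(fraction_found(refs, estimates, 3))  # 1.0
```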

# File NOVEL_LIST_WORD_REF (other images from Kobus Barnard set)
Columns
1 	Novel image number.
2:6 	Reference 

# File image_name_fig6_CBMI2005.txt
Lists the numbers of the images shown in Figure 6 of the paper.

## For any questions: http://webia.lip6.fr/~tollaris/, sabrina.tollari (at) lip6.fr, or http://glotin.univ-tln.fr, glotin (at) univ-tln.fr