The UCLA dataset has been the benchmark for dynamic texture (DT) recognition for several years, even though a much larger and more diverse database (the DynTex database) exists. The UCLA dataset has remained the benchmark for the following reasons:
Its DT sequences have already been pre-processed from their raw form: each sequence is cropped to show its representative dynamics, in the absence of any static or dynamic background.
Only a single DT is present in each DT sequence.
In each DT sequence, no panning or zooming is performed.
Ground truth labels of the DT sequences are provided.
Although some researchers have applied their recognition algorithms to the DynTex dataset, it is difficult to use in its present form because it lacks the above four properties. Therefore, we compiled a new benchmark dataset, called DynTex++. For more details, refer to the ECCV 2010 paper cited below. Please email me with any questions/comments regarding the dataset and I will get back to you as soon as I can. To compare against the DL-PEGASOS algorithm from that paper, use this code.
NOTE: If you use the DynTex++ dataset, please cite this work:
Bernard Ghanem and Narendra Ahuja, "Maximum Margin Distance Learning for Dynamic Texture Recognition", European Conference on Computer Vision (ECCV 2010).
The goal here is to organize the raw data in the DynTex dataset to provide a richer benchmark, much as the UCLA dataset currently does. The original database is already publicly available (~2GB of data); however, only the raw AVI videos are provided. We proceeded to filter, pre-process, and label these DT sequences. While DynTex contains a total of 656 video sequences, DynTex++ uses only 345 of them. We eliminated sequences that contained more than one DT, contained a dynamic background, included panning/zooming, or did not depict much motion. The remaining sequences were hand-labeled as one of 36 classes. Because they were not uniformly distributed among these 36 classes, we pre-processed them so that each class contains the same number of subsequences. The pre-processing steps were:
Each sequence was spatially down-sampled by a factor of 0.75 and converted to grayscale intensity.
Since it is infeasible to manually crop these sequences, we randomly selected a large set (1000) of subsequences of fixed size (50x50x50) and assigned each a relevance score representing how much motion it contains, computed as the average optical flow energy in the subsequence. This eliminates static background subsequences from consideration and retains the more relevant DT subsequences.
After sorting the subsequences by relevance score, we selected the highest-scoring 100 in each class (chosen uniformly from the video sequences constituting that class), resulting in a database of 36 x 100 = 3600 subsequences.
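The extraction-and-selection procedure above can be sketched in Python with NumPy. This is a minimal illustration, not the released code: mean squared frame difference stands in for the true average optical flow energy used as the relevance score, and all function names are hypothetical.

```python
import numpy as np

def relevance_score(subseq):
    # subseq: (T, H, W) grayscale volume. DynTex++ scores each subsequence
    # by its average optical flow energy; mean squared frame difference is
    # a cheap stand-in with the same effect: static crops score near zero.
    diffs = np.diff(subseq.astype(np.float64), axis=0)
    return float(np.mean(diffs ** 2))

def sample_subsequences(video, n_samples=1000, size=50, rng=None):
    # Randomly crop n_samples fixed-size (size x size x size) subsequences
    # from a (T, H, W) grayscale video volume.
    rng = np.random.default_rng(rng)
    T, H, W = video.shape
    crops = []
    for _ in range(n_samples):
        t = rng.integers(0, T - size + 1)
        y = rng.integers(0, H - size + 1)
        x = rng.integers(0, W - size + 1)
        crops.append(video[t:t + size, y:y + size, x:x + size])
    return crops

def select_top(subseqs, k=100):
    # Sort by relevance (most dynamic first) and keep the top k crops.
    return sorted(subseqs, key=relevance_score, reverse=True)[:k]
```

In the actual pipeline the top 100 subsequences per class are drawn uniformly across the videos constituting that class; the sketch above shows the per-video scoring and ranking step only.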
The 36 ground-truth classes of DynTex++ were determined manually, by visually clustering the 345 DynTex AVI files into meaningful categories of spatiotemporal variation.
The DynTex++ dataset can be downloaded here (~570MB). In order to keep a similar structure to the UCLA benchmark, the DynTex++ directory contains the following MATLAB files:
dyntex++_classes.mat: this mat file contains information about the breakdown of the classes with respect to the original DynTex dataset. It contains class_inventory, a cell array that identifies the DynTex AVIs from which each class is formed.
dyntex++_info.mat: this mat file contains information about the 3600 subsequences that form the DynTex++ dataset. class_index_list is the list of ground truth labels (36 classes) for the subsequences. Similar to the UCLA benchmark, imagemaster contains information about each subsequence, including the class it belongs to, how it was extracted from the original AVI (i.e. left corner and the AVI name), and the path to the subsequence saved in the imgdb directory.
imgdb [directory]: this folder contains the grayscale versions of the subsequences forming the DynTex++ dataset.
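For users working outside MATLAB, the label list in dyntex++_info.mat can be read with SciPy. The field name class_index_list comes from the file description above; the loader function itself is a hypothetical convenience wrapper, not part of the dataset distribution.

```python
from scipy.io import loadmat

def load_dyntexpp_labels(path):
    # Read dyntex++_info.mat and return the ground-truth class labels
    # (values 1..36) for the 3600 subsequences as a flat array.
    # loadmat wraps MATLAB vectors in 2-D arrays, so squeeze to 1-D.
    mat = loadmat(path)
    return mat["class_index_list"].squeeze()
```

A balanced split (e.g. 50% train / 50% test per class, as is common on this benchmark) can then be built directly from the returned label array.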