A Compression Based Distance Measure for Texture - Support Site.

NOTE: All files are freely available but password protected. Simply contact Dr. Eamonn Keogh for access.

Best Student Paper SDM 2010

The Paper.

Bilson J. L. Campana and Eamonn J. Keogh. A Compression Based Distance Measure for Texture. In SIAM International Conference on Data Mining 2010. Journal Version, SDM Pdf, PPT Presentation

Short Abstract.

The analysis of texture is an important subroutine in diverse application areas. Almost all existing texture similarity measures require the careful setting of many parameters, which makes it exceptionally difficult to avoid overfitting. In this work, we introduce a compression based method for texture measures and construct an efficient and robust parameter-free texture similarity measure, CK-1. We demonstrate the utility of our measure with an extensive empirical evaluation on real-world case studies.

Figure 1 - The Insect and Heraldic shields datasets clustered with the CK-1 distance measure (average linkage clustering). While the images are shown in color for clarity, our distance measure had access only to the grayscale versions of the images.

We present a simple experiment where human intuition can directly judge the effectiveness of the CK-1 measure. We clustered two sets of images, both of which have previously been used to test the utility of color and shape distance measures [12]. The two datasets are: Heraldic shields extracted from historical manuscripts from the 14th to 16th century, and Insects extracted from various amateur entomologists' websites (used with permission). In both cases we selected 12 images which could be objectively or subjectively sorted into six pairs. Figure 1 shows the results.
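As a rough illustration of the average linkage clustering mentioned in the Figure 1 caption, the sketch below merges clusters by their mean pairwise distance. It is written in Python rather than the MATLAB of the released source, the function name `average_linkage` is hypothetical, and the distance matrix holds made-up placeholder values, not CK-1 distances.

```python
def average_linkage(dist):
    """Agglomerative clustering with average linkage.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Returns the merges in order as (members_a, members_b, distance).
    """
    clusters = {i: [i] for i in range(len(dist))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for pos, a in enumerate(keys):
            for b in keys[pos + 1:]:
                # Average linkage: mean distance over all cross-cluster pairs.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        # Fold cluster b into cluster a.
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


# Placeholder distances for four images that form two obvious pairs.
example = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
for left, right, d in average_linkage(example):
    print(left, right, round(d, 3))
```

With twelve images that sort into six pairs, a good distance measure should produce the six pair merges before any other merge.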

SOURCE FILES AND DATA SETS

Source Code Instructions.

Listed below are reproductions of the experiments and figures reported in our paper. Each section holds the relevant datasets and corresponding functions to execute the experiments (included in the main source code archive). Each function runs a single experiment on the dataset using the method stated in the function name.

For example: reproCairoFamilyRI_MPEG will reproduce the results for the CAIRO dataset, with family classifications, using the rotation invariant MPEG measure. All functions require a single parameter: the path to the root folder named 'Data' in each dataset archive.
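The naming convention can be read mechanically. The following Python sketch is not part of the released MATLAB source; `parse_repro_name` is a hypothetical helper, and the method suffixes are taken only from the function lists on this page.

```python
import re

# Method suffixes that appear in the function lists on this page.
# 'RI_MPEG' is listed before 'MPEG' so the longer suffix is matched first.
METHODS = ("RI_MPEG", "MPEG", "Gabor", "Texton")


def parse_repro_name(name):
    """Split a name such as 'reproCairoFamilyRI_MPEG' into its parts."""
    if not name.startswith("repro"):
        raise ValueError(f"not a repro function name: {name!r}")
    body = name[len("repro"):]
    for method in METHODS:
        if body.endswith(method):
            body = body[: -len(method)]
            break
    else:
        raise ValueError(f"unknown method suffix in {name!r}")
    # Remaining CamelCase words: dataset first, optional variant after it.
    words = re.findall(r"[A-Z][a-z]*", body)
    if not words:
        raise ValueError(f"no dataset name in {name!r}")
    return {
        "dataset": words[0],
        "variant": "".join(words[1:]) or None,
        "method": method,
        "rotation_invariant": method.startswith("RI_"),
    }


print(parse_repro_name("reproCairoFamilyRI_MPEG"))
```

A name with no variant, such as reproBrodatzGabor, yields a variant of `None`.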

The source code is packaged into a single archive. Installation is simple:

  • Unzip archive to MasterSourceDirectory.
  • In MATLAB, cd to MasterSourceDirectory.
  • Run function installLabwork().

Source.

Source code V1.02. Zip, 7z

Brodatz Textures (Positives) experiments.

This dataset consists of a diverse set of 112 images of man-made and natural textures (grass, straw, cloth, etc.), digitized from a reference photographic album for artists and designers. While not a particularly interesting dataset, it is, by a huge margin, the most studied dataset in texture research. Unfortunately, there are many variants of it. Our version was obtained mostly from a publicly available online image database [1]. This set was missing slate 14, which we added directly from an original copy of the text held at our campus library [2]. For our experiments, we treat each image as a separate class and divide it into sixteen non-overlapping images of uniform size. We negate the film negatives in order to produce visually intuitive images. We also provide the uncropped, negated source images for further experimental comparisons.
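The preprocessing described above (negation, then division into sixteen equal tiles) can be sketched as follows. This is an illustrative Python sketch, not the authors' MATLAB code. It assumes 8-bit grayscale pixels stored as nested lists and a 4x4 grid; the page states only that there are sixteen non-overlapping tiles, so the grid shape is an assumption.

```python
def negate(image, max_value=255):
    """Invert pixel intensities (assumes 8-bit grayscale by default)."""
    return [[max_value - p for p in row] for row in image]


def split_into_tiles(image, grid=4):
    """Cut an image into grid*grid non-overlapping tiles of equal size.

    Rows or columns left over when the size is not divisible by grid
    are dropped (an assumption made for this sketch).
    """
    height, width = len(image), len(image[0])
    tile_h, tile_w = height // grid, width // grid
    tiles = []
    for r in range(grid):
        for c in range(grid):
            rows = image[r * tile_h:(r + 1) * tile_h]
            tiles.append([row[c * tile_w:(c + 1) * tile_w] for row in rows])
    return tiles


# Tiny 8x8 stand-in for a scanned texture; pixel value = row * 8 + column.
scan = [[r * 8 + c for c in range(8)] for r in range(8)]
tiles = split_into_tiles(negate(scan))
print(len(tiles), "tiles of size", len(tiles[0]), "x", len(tiles[0][0]))
```

Each of the sixteen tiles from one source image would then carry that image's class label.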

Brodatz Texture dataset. Link

Uncropped Brodatz Texture dataset. Link

  • reproBrodatzMPEG
  • reproBrodatzGabor
  • reproBrodatzTexton

CAIRO experiments.

This dataset consists of 100 images of 10 species of tropical wood provided by the Center for Artificial Intelligence and Robotics [3]. Each species is represented by 10 photographs taken at a microscopic level. The images are also evenly split into two families of wood, Leguminosae and Dipterocarpaceae. The dataset is classified in two approaches: a two-class problem across family designations and a ten-class problem across species classifications.

CAIRO dataset. (Requesting distribution permissions)

  • reproCairoFamilyMPEG
  • reproCairoFamilyRI_MPEG
  • reproCairoFamilyGabor
  • reproCairoFamilyTexton
  • reproCairoSpeciesMPEG
  • reproCairoSpeciesRI_MPEG
  • reproCairoSpeciesGabor
  • reproCairoSpeciesTexton

Camouflage experiments.

This dataset consists of seventy images of nine varieties of modern US military camouflage. The images were created by photographing military t-shirts and fabrics at random orientations.

Camouflage dataset. Link

  • reproCamoMPEG
  • reproCamoGabor
  • reproCamoTexton

KTH-TIPS experiments.

The KTH-TIPS [4] (Textures under varying Illumination, Pose, and Scale) texture database extends the CUReT database [5] by adding variation in scale and by photographing multiple samples in each class. The dataset consists of 810 images from ten classes.

KTH-TIPS dataset. Link

  • reproKthtipsMPEG
  • reproKthtipsRI_MPEG
  • reproKthtipsGabor
  • reproKthtipsTexton

Moth experiments.

This collection consists of images of 774 live moths, each belonging to one of 35 different species found in the British Isles [6]. It is important to note that unlike most collections, which feature dead moths, carefully posed and photographed in ideal conditions in a lab, this dataset contains images of living moths photographed outdoors in a variety of conditions over a year. We consider three variants of this dataset: the original data, in which the moth occupies about 10% of the image area; a center-cropped version, in which an approximate bounding box was placed around the moth; and a cleaned version, in which the background was deleted with a semi-automatic technique.

Moth dataset. (Requesting distribution permissions)

  • reproMothCleanedMPEG
  • reproMothCleanedGabor
  • reproMothCleanedTexton
  • reproMothCroppedMPEG
  • reproMothCroppedGabor
  • reproMothCroppedTexton
  • reproMothOrignalMPEG
  • reproMothOrignalGabor
  • reproMothOrignalTexton

Nematode experiments.

As noted in the introduction, nematodes are a diverse phylum of "wormlike" animals, with great commercial and medical importance. The Department of Nematology at UCR, one of the leading institutions in nematode research, has recently tasked us with creating a distance measure to help them sort through the largest archive of high-quality nematode images in the world [7]. For these experiments we consider a collection of fifty images of five species. Each nematode sample originally exists as a stack of images displaying over 100 focal planes of the organism. We prune the data by selecting only the focal plane image with the highest variance in each sample stack (i.e., the most focused image).
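The pruning step can be sketched in a few lines. The Python below is illustrative rather than the authors' MATLAB code; it assumes each focal plane is a grayscale image stored as nested lists and that "variance" means the variance of pixel intensities over the whole plane. The name `most_focused_plane` is hypothetical.

```python
from statistics import pvariance


def most_focused_plane(stack):
    """Return the index of the focal plane with the highest pixel variance."""
    def plane_variance(plane):
        # Flatten the plane and take the population variance of intensities.
        return pvariance([p for row in plane for p in row])

    return max(range(len(stack)), key=lambda i: plane_variance(stack[i]))


# Three toy 2x2 "focal planes": flat, high contrast, low contrast.
stack = [
    [[10, 10], [10, 10]],
    [[0, 20], [20, 0]],
    [[8, 12], [12, 8]],
]
print(most_focused_plane(stack))
```

Out-of-focus planes tend to be blurred and therefore low in contrast, which is why intensity variance serves as a simple focus proxy here.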

Nematode dataset. Link

  • reproNematodeMPEG
  • reproNematodeGabor
  • reproNematodeTexton

Spider experiments.

This dataset consists of images of the Australasian ground spiders of the family Trochanteriidae. This is a diverse family (121 species in fourteen genera) with high inter- and intra-specific variation, so it represents a very difficult classification problem. Although some species in this family are relatively common, almost 80 per cent were represented by fewer than ten individuals (of either sex); more than 50 per cent had fewer than five. Thirteen species had twenty or more individuals. The original images were grey scaled, cropped square, enhanced (for contrast/brightness) and resized by the original authors [8]; we did no further pre-processing.

Spider dataset. Link

  • reproSpidersFullMPEG
  • reproSpidersFullRI_MPEG
  • reproSpidersFullGabor
  • reproSpidersFullTexton
  • reproSpidersSubsetMPEG
  • reproSpidersSubsetRI_MPEG
  • reproSpidersSubsetGabor
  • reproSpidersSubsetTexton

Tire track experiments.

This dataset consists of an idealized collection of tire imprints left on paper. Three well-worn tires had paint applied to them and were rolled over paper. Each tire was painted and rolled sixteen times, in varying directions and over different painted sections of the tire. Discontinuities in the painted tracks resulting from dry or insufficient paint resemble the interruptions in earth tracks caused by a denser arrangement of materials in the ground and uneven weight distribution across the tire.

Tire track dataset. Link

  • reproTireMPEG
  • reproTireRI_MPEG
  • reproTireGabor
  • reproTireTexton

UIUCTex experiments.

The University of Illinois at Urbana-Champaign Texture database features twenty-five texture classes with forty samples each [9]. All images are gray-scaled and are of size 640x480 pixels. Images were captured at varying orientations, illuminations, and locations on the sample texture.

UIUCTex dataset. Link

  • reproUiuctexMPEG
  • reproUiuctexRI_MPEG
  • reproUiuctexGabor
  • reproUiuctexTexton

VisTex experiments.

The MIT Vision Texture database consists of 167 images from nineteen classes [10]. Unlike many other texture datasets, the VisTex dataset does not hold rigid rules for orientation or lighting, but rather provides images from real-world conditions (such as flowers within a field or the water texture from an inland location).

VisTex dataset. Link

  • reproVistexMPEG
  • reproVistexRI_MPEG
  • reproVistexGabor
  • reproVistexTexton

VVT experiments.

This dataset consists of 839 samples of wood lumber used originally for color based inspection and grading for industrial usage [11]. Square tessellations of about 2.5x2.5cm of every image are annotated to be either sound or one of about 40 types of wood defect (dry knot, small knot, bark pocket, core stripe, etc.). The annotated data is parsed and each tessellated region is cropped and labeled as either sound or defective. A subset consisting of 100 images from each class is then used for classification tests.
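The labeling and subsetting steps can be sketched as below. This Python sketch is not the authors' MATLAB code. It assumes that every annotation other than "sound" maps to the defective class and that the 100 images per class are drawn at random; the page does not say how the subset was chosen. The names `binary_label` and `balanced_subset` are hypothetical.

```python
import random


def binary_label(annotation):
    """Collapse a defect annotation into 'sound' or 'defective'."""
    return "sound" if annotation == "sound" else "defective"


def balanced_subset(annotated_tiles, per_class=100, seed=0):
    """Pick per_class tiles from each binary class.

    annotated_tiles is a list of (tile_id, annotation) pairs.
    Random selection with a fixed seed is an assumption of this sketch.
    """
    rng = random.Random(seed)
    by_class = {}
    for tile_id, annotation in annotated_tiles:
        by_class.setdefault(binary_label(annotation), []).append(tile_id)
    subset = []
    for label in sorted(by_class):
        ids = by_class[label]
        if len(ids) < per_class:
            raise ValueError(f"only {len(ids)} tiles labeled {label!r}")
        subset.extend((tile_id, label) for tile_id in rng.sample(ids, per_class))
    return subset


# Made-up annotations for illustration only.
tiles = [(i, "sound") for i in range(150)]
tiles += [(150 + i, "dry knot" if i % 2 else "bark pocket") for i in range(120)]
chosen = balanced_subset(tiles)
print(len(chosen))
```

Balancing the two classes keeps the abundant sound regions from dominating the classification test.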

VVT dataset. Link

  • reproVvtMPEG
  • reproVvtRI_MPEG
  • reproVvtGabor
  • reproVvtTexton

FIGURES AND OTHERS

Figure 4.

Shield dataset. Link

Butterfly dataset. Link

  • figure4

Figure 8.

Comparison images. Link

  • figure8

Results file.

Unannotated Excel results and notes of the experiments run. Link

REFERENCES

  1. T. Randen, Brodatz Textures Image Database, http://www.ux.uis.no/~tranden/brodatz.html.
  2. P. Brodatz, Textures: A Photographic Album for Artists and Designers, New York: Dover, 1966.
  3. Center for Artificial Intelligence and Robotics, http://www.cairo-aisb.com/.
  4. A. Bratko, G. Cormack, B. Filipic, T. Lynam, B. Zupan, Spam Filtering Using Statistical Data Compression Models, Journal of Machine Learning Research 7, Dec. 2006.
  5. K. J. Dana, B. van Ginneken, S. K. Nayar, J. J. Koenderink, Reflectance and texture of real-world surfaces, ACM Trans. Graph. 18, 1, 1999.
  6. M. Mayo, A. Watson, Automatic species identification of live moths. In Ellis et al., editors, Proc. of the 26th SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, 195-202, 2006.
  7. P. De Ley, Assistant Professor and Assistant Nematologist at UCR, Personal communication, Feb 2009.
  8. K. Russell, H. Do, J. Huff, N. Platnick, Introducing SPIDA-web: wavelets, neural networks and Internet accessibility in an image-based automated identification system, In N. MacLeod (ed), Automated Object Identification in Systematics: Theory, Approaches, and Applications. Springer Verlag, 2007.
  9. S. Lazebnik, C. Schmid, J. Ponce, A Sparse Texture Representation Using Local Affine Regions, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1265-1278, August 2005.
  10. MIT Vision and Modeling Group, http://vismod.media.mit.edu/vismod/.
  11. O. Silven, M. Niskanen, H. Kauppinen, Wood inspection with non-supervised clustering, COST action E10 Workshop - Wood properties for industrial use, 18-22, Espoo, Finland, June 2000.
  12. X. Wang, L. Ye, E. J. Keogh, C. R. Shelton, Annotating historical archives of images, JCDL, 341-350, 2008.
University of California Riverside
Department of Computer Science and Engineering
Winston Chung Hall, Rm. 368
Riverside, CA, 92521, USA
