Computer Science and Engineering

Christian R. Shelton, Professor

Annotating Historical Archives of Images (2010)

by Xiaoyue Wang, Lexiang Ye, Eamonn Keogh, and Christian Shelton

Abstract: Recent programs like the Million Book Project and Google Print Library Project have archived several million books in digital format, and within a few years a significant fraction of the world’s books will be online. While the majority of the data will naturally be text, there will also be tens of millions of pages of images. Many of these images will defy automated annotation for the foreseeable future, but a considerable fraction of the images may be amenable to automatic annotation by algorithms that can link the historical image with a modern contemporary, with its attendant metatags. To perform this linking, there must be a suitable distance measure that appropriately combines the relevant features of shape, color, texture and text. However, the best combination of these features will vary from application to application and even from one manuscript to another. In this work, the authors propose a simple technique to learn the distance measure by perturbing the training set in a principled way.

Download Information

Xiaoyue Wang, Lexiang Ye, Eamonn Keogh, and Christian Shelton (2010). "Annotating Historical Archives of Images." International Journal of Digital Library Systems, 1(2), 59-80.

Bibtex citation

   author = "Xiaoyue Wang and Lexiang Ye and Eamonn Keogh and Christian Shelton",
   title = "Annotating Historical Archives of Images",
   journal = "International Journal of Digital Library Systems",
   volume = 1,
   number = 2,
   pages = "59--80",
   year = 2010,



University of California, Riverside
Chung Hall, room 327
Riverside, CA 92521
Tel: (951) 827-2554
E-mail: cshelton@cs.ucr.edu

