Annotating Historical Archives of Images (2010)

by Xiaoyue Wang, Lexiang Ye, Eamonn Keogh, and Christian Shelton


Abstract: Recent programs like the Million Book Project and Google Print Library Project have archived several million books in digital format, and within a few years a significant fraction of the world’s books will be online. While the majority of the data will naturally be text, there will also be tens of millions of pages of images. Many of these images will defy automated annotation for the foreseeable future, but a considerable fraction of the images may be amenable to automatic annotation by algorithms that can link the historical image with a modern contemporary, with its attendant metatags. To perform this linking, there must be a suitable distance measure that appropriately combines the relevant features of shape, color, texture and text. However, the best combination of these features will vary from application to application and even from one manuscript to another. In this work, the authors propose a simple technique to learn the distance measure by perturbing the training set in a principled way.

Download Information

Xiaoyue Wang, Lexiang Ye, Eamonn Keogh, and Christian Shelton (2010). "Annotating Historical Archives of Images." International Journal of Digital Library Systems, 1(2), 59-80.            

Bibtex citation

@article{WanYeKeoShe10,
   author = "Xiaoyue Wang and Lexiang Ye and Eamonn Keogh and Christian Shelton",
   title = "Annotating Historical Archives of Images",
   journal = "International Journal of Digital Library Systems",
   journalabbr = "IJDLS",
   volume = 1,
   number = 2,
   pages = "59--80",
   year = 2010,
}