Image Mining of Historical Manuscripts to Establish Provenance

[This is a supporting webpage for the paper accepted by SDM 2012. The page contains all the code and datasets used in the paper.]

[Here are the slides that explain the whole paper]



CODE AND EXECUTABLES

  • Here is the code used in the paper.
  • Samples of Initial Letters

    DATASETS: Ornamental Initial Letters and Historical Manuscripts

    • Virtual Library Humanist Program (VLHP): a digital library that contains more than 6,000 historical manuscripts online; the dataset pool continues to grow.
    • GOLD1 Dataset: contains 578 ornamental initial letters. This dataset is used in the classification experiment in Section 3.1.
    • 6405 Annotated Initial Letters: the annotated initial letter dataset pool. Warning: the size of this dataset is 324 MB.
    • Three Books: this dataset is used in Section 5.3. Warning: the size of this dataset is 262 MB.
    • Twenty Books: these historical manuscripts, which contain 6,956 pages and date from the 15th and 16th centuries, are used in Section 5.6. Warning: the size of this dataset is 1.13 GB.
    • Samples of Initial Letters

    POWERFUL CK1 DISTANCE MEASURE



    ADDITIONAL EXPERIMENTS OMITTED FROM THE SUBMITTED PAPER