
At SIGKDD 2007
The competition datasets will be kept private to offer a new service to the data mining community: true blind testing of algorithms. If you really want to test your time series classification algorithms, first test on the 20 fully public datasets, then test on these 20 datasets by submitting your predictions (see below). We will send you your results, along with the results of the baseline 1NN Euclidean distance and 1NN DTW classifiers. You can use these results to help convince reviewers that your results are not due to overfitting or tweaking.
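For readers unfamiliar with the 1NN Euclidean distance baseline mentioned above, here is a minimal sketch of the usual idea: label a test series with the class of its closest training series. This is not the organizers' code. The function names are made up for illustration, and the sketch assumes that all series in a dataset have the same length.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series (assumed)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_1nn(train_series, train_labels, query):
    """Return the label of the training series closest to `query`."""
    best = min(range(len(train_series)),
               key=lambda i: euclidean(train_series[i], query))
    return train_labels[best]


# Toy data, invented for illustration only.
train = [[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]]
labels = [1, 2]
prediction = predict_1nn(train, labels, [4.0, 4.0, 6.0])
```

A 1NN DTW baseline would follow the same pattern with the distance function swapped for dynamic time warping.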
The official deadline for entries to the Workshop and Challenge on Time Series Classification has passed; we had 12 participants.
We are giving an extended deadline to anyone else who wants to participate. They may submit their results up to midnight on the 11th of July, 2007.
These late entries will NOT be eligible for the official prizes, but if the results would have put the entry in the top 50%, they will be noted and acknowledged at the workshop.
Entry checklist:
- Did you choose a unique and descriptive name for your team, such as "laurel_hardy_mit"?
- Is your entry a single zip file (not rar or tar or anything else), with the filename being your team name, e.g. "laurel_hardy_mit.zip"?
- Within the zip file, are there 20 text files, with the extension ".txt", which are named for your team and the problem number, i.e.
- laurel_hardy_mit01.txt
- laurel_hardy_mit02.txt
- :::
- laurel_hardy_mit20.txt
- Don't forget that the first nine problems require the "0" before the digit, so laurel_hardy_mit02.txt is correct but laurel_hardy_mit2.txt is wrong.
- Did you include one more text file, called "laurel_hardy_readme.txt", that contains the following:
- A list of all members of your team (no additions or deletions after submission).
- A paragraph describing your approach, between 50 and 200 words, or a pointer to a publicly available paper.
- A “signed” affirmation that all your results were created directly by the algorithm(s), without human intervention beyond pointing the algorithm at the right data.
- Are all 20 files in the right format (see "Details on how to format your entry are here")?
- If your algorithm decides it cannot do well on some problems, or you time out on some datasets because you scale poorly in dimensionality, or for some other reason you cannot finish all datasets, you can still compete. But you must submit all 20 entry files in the right format (otherwise my scripts will crash and you will be disqualified). In such a case, the best thing to do is to put the most frequent class as your class prediction for every instance.
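Before submitting, you may want to check the naming rules in the checklist above automatically. The sketch below is one possible way to do that and is not the organizers' script. The function names are hypothetical, it assumes the files sit at the top level of the zip (not in a subfolder), and it checks names only, not the contents or format of the files. It also includes a small helper for the fallback described in the last item, which picks the most frequent class from a list of training labels.

```python
import os
import zipfile
from collections import Counter


def expected_entry_names(team, n_problems=20):
    """File names the checklist asks for: <team>01.txt ... <team>20.txt."""
    return [f"{team}{i:02d}.txt" for i in range(1, n_problems + 1)]


def check_entry(zip_path, team, readme_name):
    """Return a list of naming problems; an empty list means none were found.

    Assumption: all files are stored at the top level of the zip.
    """
    problems = []
    if os.path.basename(zip_path) != f"{team}.zip":
        problems.append(f"zip file should be named {team}.zip")
    if not zipfile.is_zipfile(zip_path):
        problems.append("file is not a valid zip archive")
        return problems
    with zipfile.ZipFile(zip_path) as zf:
        names = set(zf.namelist())
    for name in expected_entry_names(team) + [readme_name]:
        if name not in names:
            problems.append(f"missing {name}")
    return problems


def most_frequent_class(train_labels):
    """Fallback prediction: the label that occurs most often in training."""
    return Counter(train_labels).most_common(1)[0][0]
```

For the example team in the checklist, you could call `check_entry("laurel_hardy_mit.zip", "laurel_hardy_mit", "laurel_hardy_readme.txt")` and fix anything it reports before sending your entry.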
Organized by
Technical assistance from Xiaopeng Xi, Jill Brady, Xiaoyue Wang, Anthony Lam, Dragomir Yankov.