Data Mining Techniques (CS-235)

Fall 2012


General Info

Instructor: Vagelis Hristidis (aka Evangelos Christidis)

Lecture time: M/W 3:40-5 pm

Location: WCH 142

Office hours: Wednesdays 2:30-3:40 pm or by appointment


Teaching Assistant: Shiwen (Sean) Cheng, WCH 363

Office hours: Tuesdays 10-11 am, or by appointment



Grading

Assignment: 15%
Midterm: 40%
Project: 40%
Participation: 5%

Course Description

This is a graduate-level course that introduces the principles of data mining.

Project Description

Projects are done in groups of two (or three, if the project is sufficiently sophisticated and you get approval). If you cannot find a partner, please email Shiwen to be matched with one. There will be a short presentation for each project.

Late submissions that arrive before the assignments are graded will receive a 20% score reduction.


Name | Title | Presentation date
Steven Jacobs, Skyler Windh | On the places you might not go | 11/28/2012
Panruo Wu | Data Mining Crime Data for Cities to Hire or Lessen their Law Enforcement | 11/28/2012
Matthew Alpert, Calvin Phung | making predictions about businesses and reviews | 11/28/2012
Ildar Absalyamov, Li Min | sport game recommendation engine | 11/28/2012
Lingli Wang, Zeng Zhao | Find the boundary between different number of layers graphene | 11/28/2012
Chuang Yao-Jung, Zhaosheng Zhang | Recommend Advertisement to Twitter Users | 11/28/2012
Wenhong Qu, Daimeng Wang | A survey on studies of opinion mining from product reviews | 12/3/2012
Zhigang Wu, Lei Gong | Salt stress associated gene identification in Arabidopsis | 12/3/2012
Chi Gou, Shujun Xue | Public Health Data Mining on Twitter | 12/3/2012
Yipeng Qu, Chaoyi Liu | Economics from Twitter’s Mood | 12/3/2012
Azeem Aqil, Keval Vora | predict population of news and blog post | 12/3/2012
Santosh Sangalad, Suraj Thippasandra Narayana | Insect Classification | 12/3/2012
Anumeha Bhasker, Gurneet Kaur | Mood Detection In Music | 12/3/2012
Sara Nasseri, Rachid Ounit | Aviation Accident Statistics | 12/5/2012
Pavan Kumar Panjam, Mark Louton | training gesture classifiers | 12/5/2012
Deepanshu Madan, Navin Kumar | Football Statistics Data | 12/5/2012
Nicholas Rhodes, Matthew Ung | predict student performance in class | 12/5/2012
Edward Lixandru, Farzad Khorasani | factors correlated with power consumption | 12/5/2012
Fuxin Yu, Yuhang Guo | recommend potential followees to a Twitter user | 12/5/2012

Assignment Description

Assignments are individual. No groups allowed.


Tentative Lecture Schedule

(The slides below are modified versions of the book slides.)

Date | Topic | Chapters and other material
10/1-3 | Intro | Ch. 1 (slides)
10/8 | Know your data | Ch. 2 (slides)
10/10 | Weka tutorial | slides
10/15 | Know your data (cont'd) |
10/17-22 | Preprocessing | Ch. 3 (slides)
10/24 | Frequent patterns | Ch. 6 (slides)
10/29-31 | No class, instructor at conference |
11/5 | Frequent patterns (cont'd) |
11/7 | Classification | Ch. 8 (slides)
11/12 | Holiday |
11/14 | Classification (cont'd) |
11/19-21 | Clustering | Ch. 10 (slides)
11/26 | Midterm |
11/28 | Project Presentations |
12/3-5 | Project Presentations |




Textbook

Jiawei Han, Micheline Kamber and Jian Pei

Data Mining: Concepts and Techniques, 3rd ed.

The Morgan Kaufmann Series in Data Management Systems

Morgan Kaufmann Publishers, July 2011. ISBN 978-0123814791

Other Resources 



Academic Integrity:

Standards of Conduct: