UNIVERSITY of CALIFORNIA, RIVERSIDE
Department of Computer Science and Engineering

CS236: Database Management Systems
Lecture: Tuesday/Thursday 3:40-5pm, ENG II 139
Fall 2005

Instructor:
Dimitrios Gunopulos
office: ENG II 324
tel: 951-827-2479
e-mail: dg@cs.ucr.edu
Office Hours:
Tuesday/Thursday 2:00-3:30pm or by appointment.

The first part of the course will use the following textbook:

“Database Management Systems,” by Raghu Ramakrishnan and Johannes Gehrke, McGraw-Hill, 3rd edition, ISBN 0-07-246563-8.
We will first cover Chapters 3, 4, and 5 (quickly) as an introduction.

Then we will concentrate on:

Indexing (Chapters 10 and 28; see also papers on R-trees)

Join Processing and Spatial Joins (see below and section 14.4)

Interesting Book Exercises: 10.1, 10.4, 10.5, 10.9, 14.4, 14.5

Query Evaluation (Chapters 12, 13 and 15)

Transaction Management (Chapters 16, 17 and 18)

Normalization (Chapter 19)

Decision Support (Chapter 25 and paper on data cubes)

Distributed Databases (Chapter 22)

Data Mining (Chapter 26 and paper on clustering in SQL)

XML Data Management (Chapter 27)


Additional Lecture Slides:


Indexing in High Dimensional Spaces and Dimensionality Reduction Techniques: dimred.pdf
More slides on R-Trees: rtee2.pdf
Spatial Joins: spjoins.pdf


We will also use the following papers:

R-tree indices:
Antonin Guttman: R-Trees: A Dynamic Index Structure for Spatial Searching. SIGMOD Conference 1984: 47-57 rtree.pdf

N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: An Efficient and Robust Access Method For Points and Rectangles. SIGMOD Conference 1990. rstar.pdf

Leonard D. Shapiro: Join Processing in Database Systems with Large Main Memories. TODS 11(3): 239-264 join.pdf

M. Vlachos, M. Hadjieleftheriou, D. Gunopulos, E. Keogh: "Indexing Multi-Dimensional Time-Series with Support for Multiple Distance Measures", In Proc. of 9th International Conf. on Knowledge Discovery and Data Mining (SIGKDD), Washington, DC, 2003 traj.pdf

P. N. Yianilos: Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces, SODA 1993. vptree.pdf

Ming-Ling Lo, Chinya V. Ravishankar: Spatial Joins using Seeded Trees. SIGMOD Conference 1994: 209-220 seeded.trees.pdf

Ming-Ling Lo, Chinya V. Ravishankar: Spatial Hash-Joins. SIGMOD Conference 1996: 247-258 shj.pdf

Nick Koudas, Kenneth C. Sevcik: Size Separation Spatial Join. SIGMOD Conference 1997: 324-335 ssj.pdf

Data cubes:
Jim Gray, Adam Bosworth, Andrew Layman, Hamid Pirahesh: Data Cube, A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. ICDE Conference 1996, pp: 152-159. DataCube.doc

Clustering in SQL:
Dimitris Papadopoulos, Carlotta Domeniconi, Dimitrios Gunopulos, Sheng Ma: Clustering Gene Expression Data in SQL Using Locally Adaptive Metrics. 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 2003: 35-41. sql-c.ps

Data streams:
Shivnath Babu, Jennifer Widom: Continuous Queries over Data Streams. SIGMOD Record 30(3), pp: 109-120 (2001). streams.pdf

Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Jennifer Widom: Models and Issues in Data Stream Systems. PODS 2002, pp: 1-16. streams-issues.pdf

Graham Cormode, Flip Korn, S. Muthukrishnan, Divesh Srivastava: Diamond in the Rough: Finding Hierarchical Heavy Hitters in Multi-Dimensional Data. Proceedings of the 2004 ACM SIGMOD Conference on Management of Data.

Peer-to-Peer:
K. Aberer, A. Datta, M. Hauswirth, R. Schmidt, "Indexing data-oriented overlay networks", 31st International Conference on Very Large Databases (VLDB), Trondheim, 30 Aug - 2 Sep, 2005.

Evaluation:

Midterm (40%) and Project (60%)

Possible Projects:

1. Implement a P2P keyword search engine using Peerware.
2. Design and implement spatial access methods for sensor networks.
3. Design and implement optimal placement algorithms for sensor networks.
4. Design and implement visualization techniques for analyzing network data.
5. Design and implement subspace clustering algorithms with constraints.