CS236: Database Management Systems
Lecture: Tuesday/Thursday 3:40-5pm, ENG II 139
Fall 2005
Instructor: Dimitrios Gunopulos
office: ENG II 324
tel: 951-827-2479
e-mail: dg@cs.ucr.edu
Office Hours: Tuesday/Thursday 2:00-3:30pm or by appointment.
The first part of the course will use the following textbook:
“Database Management Systems,” by Raghu Ramakrishnan and Johannes Gehrke, McGraw-Hill, 3rd edition, ISBN 0-07-246563-8.
We will first cover Chapters 3, 4 and 5 (quickly) as an introduction.
Then we will concentrate on:
Indexing (Chapters 10 and 28; see also papers on R-trees)
Join Processing and Spatial Joins (see below and Section 14.4)
Interesting Book Exercises: 10.1, 10.4, 10.5, 10.9, 14.4, 14.5
Query Evaluation (Chapters 12, 13 and 15)
Transaction Management (Chapters 16, 17 and 18)
Normalization (Chapter 19)
Decision Support (Chapter 25 and paper on data cubes)
Distributed Databases (Chapter 22)
Data Mining (Chapter 26 and paper on clustering in SQL)
XML Data Management (Chapter 27)
Additional Lecture Slides:
Indexing in High Dimensional Spaces and Dimensionality Reduction Techniques: dimred.pdf
More slides on R-Trees: rtee2.pdf
Spatial Joins: spjoins.pdf
We will also use the following papers:
R-tree indices:
Antonin Guttman: R-Trees: A Dynamic Index Structure for Spatial Searching. SIGMOD Conference 1984: 47-57 rtree.pdf
N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: An Efficient and Robust Access Method For Points and Rectangles. SIGMOD Conference 1990. rstar.pdf
Leonard D. Shapiro: Join Processing in Database Systems with Large Main Memories. TODS 11(3): 239-264 join.pdf
M. Vlachos, M. Hadjieleftheriou, D. Gunopulos, E. Keogh: "Indexing Multi-Dimensional Time-Series with Support for Multiple Distance Measures", In Proc. of 9th International Conf. on Knowledge Discovery and Data Mining (SIGKDD), Washington, DC, 2003 traj.pdf
P. N. Yianilos: Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces, SODA 1993. vptree.pdf
Ming-Ling Lo, Chinya V. Ravishankar: Spatial Joins using Seeded Trees. SIGMOD Conference 1994: 209-220 seeded.trees.pdf
Ming-Ling Lo, Chinya V. Ravishankar: Spatial Hash-Joins. SIGMOD Conference 1996: 247-258 shj.pdf
Nick Koudas, Kenneth C. Sevcik: Size Separation Spatial Join. SIGMOD Conference 1997: 324-335 ssj.pdf
DataCubes:
Jim Gray, Adam Bosworth, Andrew Layman, Hamid Pirahesh: Data Cube, A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. ICDE Conference 1996, pp: 152-159. DataCube.doc, or here.
Clustering in SQL:
Dimitris Papadopoulos, Carlotta Domeniconi, Dimitrios Gunopulos, Sheng Ma: Clustering Gene Expression Data in SQL Using Locally Adaptive Metrics. 8th ACM Sigmod Workshop on Research Issues in Data Mining and Knowledge Discovery, 2003: 35-41. sql-c.ps
Data streams:
Shivnath Babu, Jennifer Widom: Continuous Queries over Data Streams. SIGMOD Record 30(3), pp: 109-120 (2001). streams.pdf
Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Jennifer Widom: Models and Issues in Data Stream Systems. PODS 2002, pp: 1-16. streams-issues.pdf
Graham Cormode, Flip Korn, S. Muthukrishnan, Divesh Srivastava: Diamond in the Rough: Finding Hierarchical Heavy Hitters in Multi-Dimensional Data. Proceedings of the 2004 ACM SIGMOD Conference on Management of Data.
Peer-to-Peer:
K. Aberer, A. Datta, M. Hauswirth, R. Schmidt, "Indexing data-oriented overlay networks", 31st International Conference on Very Large Databases (VLDB), Trondheim, 30 Aug - 2 Sep, 2005.
Evaluation:
Midterm (40%) and Project (60%)
Possible Projects:
1. Implement a P2P keyword search engine using Peerware.
2. Design and implement spatial access methods for sensor networks.
3. Design and implement optimal placement algorithms for sensor networks.
4. Design and implement visualization techniques for analyzing network data.
5. Design and implement subspace clustering algorithms with constraints.