Advanced Database Systems (COP-6727)

Spring 2008

Announcements

· Welcome to the class!

General Info

Instructor: Vagelis Hristidis

Lecture time: Wednesdays 6:25 pm-9:05 pm

Location: ECS 145

Office hours: Wednesdays 5:00 pm-6:25 pm

Grading

20% class participation (attendance required)

30% paper presentation in class

50% project

Course Description

The purpose of the course is to present and discuss the state of the art in a set of current database research topics.

Each week, 2-3 papers on a database-related topic will be presented and discussed. Some papers will be presented by the instructor and some by students. Students will be given a choice of papers and pick the one they want to present.

Students are also required to read the papers scheduled for presentation each week so that they can participate in class discussions.

Some of the topics presented will be:

The projects will be done in groups of two. Groups of three or one are allowed only for a good reason and with the instructor’s approval.
The students may propose their own projects or receive a project description from the instructor. In general, there are two types of projects:

Lectures

Date

Topic

Papers

Slides (if any)

1/9/2008

Link-based Web Search

P1:  L. Page, S. Brin, R. Motwani, T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. 1999
P2: J. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM 46(1999).

 Link-based Web Search.pdf

 1/16/2008

Databases and Information Retrieval 1

P3: Amit Singhal: Modern Information Retrieval: A Brief Overview. IEEE Bulletin 2001

P4: V. Hristidis, L. Gravano, Y. Papakonstantinou: Efficient IR-Style Keyword Search over Relational Databases. VLDB, 2003

Intro to IR and Proximity Search in Databases,

IR-Style VLDB03

 1/23/2008

 Databases and Information Retrieval 2

P5:  L. Guo, F. Shao, C. Botev, J. Shanmugasundaram: XRANK: Ranked Keyword Search over XML Documents. SIGMOD 2003

P6:  R. Goldman, N. Shivakumar, S. Venkatasubramanian, H. Garcia-Molina: Proximity Search in Databases. VLDB 1998

 

 

1/30/2008

Spatial Databases 1

P7: R. H. Güting. An introduction to spatial database systems. The VLDB Journal 3, 4 (Oct. 1994), 357-399.

P8: A. Guttman. R-trees: a dynamic index structure for spatial searching. ACM SIGMOD 1984.

R-Trees

2/6/2008

Spatial Databases 2

P9: G.R. Hjaltason and H. Samet. Distance browsing in spatial databases. In ACM Transactions on Database Systems, Vol. 24, No. 2, 1999

P10: X. Xiong, M. F. Mokbel, and W. G. Aref. SEA-CNN: Scalable Processing of Continuous K-Nearest Neighbor Queries in Spatio-temporal Databases. International Conference on Data Engineering (ICDE'05)

Slides for Distance browsing in spatial databases

 2/13/2008

 Data Warehouses

P11: S. Chaudhuri, and U. Dayal. An overview of data warehousing and OLAP technology. SIGMOD Rec. 26, 1 (Mar. 1997), 65-74.

P12: Jim Gray, Surajit Chaudhuri, Adam Bosworth, Andrew Layman, Don Reichart, Murali Venkatrao, Frank Pellow, Hamid Pirahesh: Data Cube: A Relational Aggregation Operator Generalizing Group-by, Cross-Tab, and Sub Totals. Data Min. Knowl. Discov. 1(1): 29-53 (1997)

 2/20/2008

 Ranked Queries 1

P13: Ronald Fagin, Amnon Lotem, Moni Naor: Optimal Aggregation Algorithms for Middleware. PODS 2001

P14: Nicolas Bruno, Luis Gravano, Amelie Marian: Evaluating Top-k Queries over Web-Accessible Databases. ICDE 2002

 TA

 2/27/2008

Ranked Queries 2

P15: Vagelis Hristidis, Nick Koudas, Yannis Papakonstantinou: PREFER: A System for the Efficient Execution of Multi-parametric Ranked Queries. SIGMOD, 2001

P16: Ihab F. Ilyas, Walid G. Aref, Ahmed K. Elmagarmid: Supporting Top-k Join Queries in Relational Databases. VLDB 2003

 Ihab-VLDB03

 3/5/2008

XML 1

1. Handout

2. XQuery tutorial

 XML

XQuery

3/12/2008

XML 2

P17: J. Shanmugasundaram, K. Tufte, C. Zhang, G. He, D. DeWitt, J. Naughton. Relational Databases for Querying XML Documents: Limitations and Opportunities. Very Large Data Bases 1999

P18: Roy Goldman, Jennifer Widom: DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases. VLDB 1997: 436-445

XML Indexing

 3/26/2008

Semantic Web and Web Services

P19: Introduction to the Semantic Web
P20: An Introduction to Web Services

 WS

4/2/2008

Project presentations

 

 

4/9/2008

No class, instructor out of town

 

 

4/16/2008

Project presentations

 

 

 

Other Resources

writing tips

presentation tips

 

 

Policies

Code of Academic Integrity:  

http://www.fiu.edu/~oabp/misconductweb/2codeofacainteg.htm

University Policies: academic misconduct, sexual harassment, religious holy days, and information on services for students with disabilities.

http://www.fiu.edu/provost/polman/sec2/sec2web2-44.htm

––

Paper assignments

Name   paper

Batista,Raidel p20

Bello,Jesus      p6

Bhattacharya,Abhishek           p3

Calleiro,Michael A      p6

Diaz,Lester Felipe       p12

Espinoza,Roberto A    p14

Gomez,Marco A         p4

Gonzalez,Hector         p7

Hernandez,Frank Ernesto       p10

Hu,Yuheng     p5

Leon,Nisbel     p7

Majeed,Tariq   p17

Mohammad Abdullah,Nayeem  p11

Ortega,Francisco R     p19

Varry, Sandeep p18

Bakthavatsalam,Narendran      p14

 

Projects

Project 1: 2-3 people, Abhishek Bhattacharya, 16 Apr

Use PubMed tools (http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html) to build a query interface. 

            a. retrieve top-k

            b. rank according to various factors (date, IR function)

            c. group by MeSH terms
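
As a rough starting point for items a to c, the sketch below builds an ESearch request against the NCBI E-utilities and groups results by MeSH term. It is only an illustration under stated assumptions: the function names `build_esearch_url` and `group_by_mesh` are hypothetical, the `sort` values (`relevance`, `pub_date`) should be checked against the E-utilities help page linked above, and the MeSH terms per article are assumed to have been fetched separately. No network call is made.

```python
from urllib.parse import urlencode

# ESearch endpoint of the NCBI E-utilities.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, k=10, sort="relevance"):
    """Return an ESearch URL asking PubMed for at most k IDs matching term.

    The sort value is passed through unchanged; which values PubMed accepts
    is an assumption to verify against the E-utilities documentation.
    """
    params = {"db": "pubmed", "term": term, "retmax": k, "sort": sort}
    return ESEARCH_URL + "?" + urlencode(params)


def group_by_mesh(records):
    """Group article IDs by MeSH term.

    records is an iterable of (pmid, list_of_mesh_terms) pairs; how those
    terms are retrieved is outside this sketch.
    """
    groups = {}
    for pmid, terms in records:
        for term in terms:
            groups.setdefault(term, []).append(pmid)
    return groups
```

A query interface could call `build_esearch_url("breast cancer", k=5, sort="pub_date")` for a date-ranked top-5 and then pass the fetched records to `group_by_mesh` to produce the grouped view.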

           

Project 2:        

Create a viewer for the MeSH ontology that displays it as a dynamic tree. Open question: how should cycles be handled?
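
One possible way to approach the cycle question is to stop expanding a node when it already appears on the path from the root. The sketch below shows that idea on a plain parent-to-children dictionary; the function name `render_tree`, the `max_depth` limit, and the node names in the usage note are hypothetical, and the real MeSH data format is not modeled here.

```python
def render_tree(children, root, max_depth=10):
    """Return indented text lines for the hierarchy under root.

    children maps a node name to a list of child names. A node that already
    occurs on the current root-to-node path is printed once more with a
    marker and is not expanded, so traversal always terminates.
    """
    lines = []

    def visit(node, depth, path):
        if node in path:
            lines.append("  " * depth + f"{node} (cycle, not expanded)")
            return
        lines.append("  " * depth + node)
        if depth >= max_depth:
            return
        for child in children.get(node, []):
            visit(child, depth + 1, path | {node})

    visit(root, 0, frozenset())
    return lines
```

For example, with `children = {"A": ["B", "C"], "B": ["A"], "C": []}`, calling `render_tree(children, "A")` lists `A`, its children `B` and `C`, and marks the second `A` under `B` as a cycle instead of descending again. A node reachable through two different parents is still shown in both places, which may or may not be the desired behavior for the viewer.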

           

           

Project 3: 2-3 people

Create a patent database with columns: patentid, patent name, date, title, abstract, text, status,…

Use a subset of patents (e.g., by date or topic).

Also create a references table that holds pairs of patentids.

Extra credit: a system to rank by IR, ObjectRank,…
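
A minimal sketch of such a schema, using Python's built-in `sqlite3` module, is given below. The table names `patent` and `patent_reference`, the column names `citing_id` and `cited_id`, the choice of `TEXT` for every column, and the helper `citation_counts` are all assumptions made for illustration; only the listed columns are included, and any further columns are left open.

```python
import sqlite3


def create_patent_db(path=":memory:"):
    """Create the two tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE patent (
            patentid TEXT PRIMARY KEY,
            name     TEXT,
            "date"   TEXT,  -- assumed to hold an ISO 8601 date string
            title    TEXT,
            abstract TEXT,
            "text"   TEXT,
            status   TEXT
        );
        -- Each row is one (citing, cited) pair of patentids.
        CREATE TABLE patent_reference (
            citing_id TEXT NOT NULL REFERENCES patent(patentid),
            cited_id  TEXT NOT NULL REFERENCES patent(patentid),
            PRIMARY KEY (citing_id, cited_id)
        );
    """)
    return conn


def citation_counts(conn):
    """Return {patentid: number of patents citing it}."""
    rows = conn.execute(
        "SELECT cited_id, COUNT(*) FROM patent_reference GROUP BY cited_id"
    )
    return dict(rows)
```

The reference pairs form a citation graph, so a count such as `citation_counts` is the simplest link-based signal available; a link-analysis ranking for the extra credit would operate on the same table.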

              

Project 4: Sandeep Varry, Nayeem Mohammad Abdullah, 2 Apr

Survey of systems that run on top of PubMed, e.g., GoPubMed.

              

Project 5: Tariq Majeed, 16 Apr

Survey of EMR systems.     

              

Project 6: Nisbel Leon, Hector Gonzalez, 16 Apr

Survey of Patent Search systems        

              

              

Project 7: 2-3 people, Jesus Bello, Michael Calleiro, Marco A. Gómez, 16 Apr

BCIN information extraction (IE). Study IE papers (Learning Information Extraction Rules for Semi-Structured and Free Text) and help Daniel, who has already been working on this project.

Also create an IE survey if the group has 2 people.

              

Project 8: Frank Ernesto Hernandez, Raidel Batista, 2 Apr

Create database from pdf files. Use Information Extraction (IE) techniques. Study and refer to IE papers in your report.

              

Project 9: Lester Diaz and Francisco R. Ortega, 2 Apr

Describe, document, and implement SQLite running on Android.

 

Project 10: Roberto Espinoza, Narendran Bakthavatsalam, 2 Apr

Web Search (to be refined later)