This is intended to give you a sense of what I think is important from the course so far, and what I will be thinking of when creating the exam.

Here are some disclaimers. This is not a contract. I may have inadvertently left something off this list that ends up in an exam question. I make no guarantees that the exam will be 100% limited to items listed below. Moreover, I will not be able to test all of this material given the time limitations of the exam. I will have to pick and choose some subset of it.

You are permitted one 8.5 x 11 sheet of paper with notes (both sides) for use as a reference during the exam.

Here are the specifics: Students should be able to...

Indexing: Be able to quantitatively describe advantages and disadvantages of indexing. Be able to define and quantitatively assess the merit of indexing strategies such as primary, secondary, clustering, dense, and sparse. Demonstrate how an index-sequential file works and quantify its usage costs. Be able to show detailed examples of how insertion works in B+ trees and extendable hashing when sufficient assumptions on implementation are provided. Be able to explain advantages and disadvantages of each of the above techniques, and why each might be chosen. Be able to work out approximate I/O costs for retrieving data using a particular indexing technique.
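
As a rough illustration of the kind of I/O cost estimate described above, here is a minimal Python sketch. It is not taken from the course materials. It assumes a dense, unclustered B+ tree index on a unique key, no pages cached in memory, and one page read per tree level plus one read for the data page. The function names and the sample numbers (records per page, fanout) are made up for the example.

```python
import math


def index_levels_above_leaves(leaf_pages: int, fanout: int) -> int:
    """Count the non-leaf levels needed to cover the given number of leaf pages."""
    levels = 0
    pages = leaf_pages
    while pages > 1:
        # Each non-leaf page can point to at most `fanout` pages below it.
        pages = math.ceil(pages / fanout)
        levels += 1
    return levels


def equality_lookup_ios(num_records: int, entries_per_leaf: int, fanout: int) -> int:
    """Estimate page reads for one equality lookup on a unique key (assumed model)."""
    leaf_pages = math.ceil(num_records / entries_per_leaf)
    levels = index_levels_above_leaves(leaf_pages, fanout)
    # Non-leaf pages on the root-to-leaf path + 1 leaf page + 1 data page.
    return levels + 1 + 1


def full_scan_ios(num_records: int, records_per_data_page: int) -> int:
    """Estimate page reads for scanning the whole data file without an index."""
    return math.ceil(num_records / records_per_data_page)


if __name__ == "__main__":
    # Hypothetical numbers: 1,000,000 records, 100 index entries per leaf,
    # fanout 100, and 20 records per data page.
    print(equality_lookup_ios(1_000_000, 100, 100))
    print(full_scan_ios(1_000_000, 20))
```

Under these assumptions the index lookup touches a handful of pages while the scan touches every data page. On an exam, the same comparison would use whatever page sizes and fanout the question provides.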

Query Evaluation: Be able to explain and/or demonstrate...

Query Optimization: Be able to evaluate alternative query evaluation strategies to determine which is more likely to be chosen. Be able to generate query evaluation strategies for a particular query, and indicate how to decide among them. Be able to describe how a query optimizer approaches the above problems, and show specific examples.
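
To make "deciding among strategies" concrete, the sketch below compares estimated I/O costs for a few join plans and picks the cheapest. It is only an illustration under assumed textbook-style cost formulas (page-oriented nested loops and block nested loops, with two buffer pages reserved for the inner relation and the output). The relation sizes, buffer size, and function names are hypothetical, and a real optimizer considers many more plans and statistics.

```python
import math


def page_nested_loops(outer_pages: int, inner_pages: int) -> int:
    """Read the outer once; scan the whole inner once per outer page."""
    return outer_pages + outer_pages * inner_pages


def block_nested_loops(outer_pages: int, inner_pages: int, buffer_pages: int) -> int:
    """Read the outer in chunks of (buffer_pages - 2); scan the inner once per chunk."""
    chunks = math.ceil(outer_pages / (buffer_pages - 2))
    return outer_pages + chunks * inner_pages


def cheapest_plan(r_pages: int, s_pages: int, buffer_pages: int):
    """Enumerate a few candidate plans and return (name, cost) of the cheapest."""
    plans = {
        "page NLJ, R outer": page_nested_loops(r_pages, s_pages),
        "page NLJ, S outer": page_nested_loops(s_pages, r_pages),
        "block NLJ, R outer": block_nested_loops(r_pages, s_pages, buffer_pages),
        "block NLJ, S outer": block_nested_loops(s_pages, r_pages, buffer_pages),
    }
    return min(plans.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical sizes: R has 1000 pages, S has 500 pages, 52 buffer pages.
    print(cheapest_plan(1000, 500, 52))
```

With these made-up numbers, the block nested loops plans cost far less than the page-oriented ones, and putting the smaller relation on the outside is slightly cheaper.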

Transactions: Be able to explain what transactions are, and why they matter. ACID: Be able to identify what the acronym stands for, what each of the four parts means, and why each of them is important. Be able to identify whether a particular schedule causes a conflict, and/or whether a particular schedule is serializable. Explain the role of locking in making transactions work, and how deadlocks might be handled.
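
One common way to check a schedule is to build a precedence graph and look for a cycle. The Python sketch below does this for conflict serializability specifically, which may be narrower than the notion of serializability used in a given exam question. The schedule encoding (tuples of transaction name, "R" or "W", and data item) and the function names are assumptions made for this example.

```python
def precedence_edges(schedule):
    """Return edges (Ti, Tj) where an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Two operations conflict if they come from different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Detect a cycle in a directed graph with a depth-first search."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    unvisited, in_progress, done = 0, 1, 2
    state = {node: unvisited for node in graph}

    def visit(node):
        state[node] = in_progress
        for nxt in graph[node]:
            if state[nxt] == in_progress:
                return True  # back edge: found a cycle
            if state[nxt] == unvisited and visit(nxt):
                return True
        state[node] = done
        return False

    return any(state[node] == unvisited and visit(node) for node in graph)


def is_conflict_serializable(schedule):
    """A schedule is conflict serializable iff its precedence graph is acyclic."""
    return not has_cycle(precedence_edges(schedule))


if __name__ == "__main__":
    interleaved = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
    serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
    print(is_conflict_serializable(interleaved))
    print(is_conflict_serializable(serial))
```

In the interleaved schedule, T1 reads A before T2 writes it, and T2 writes A before T1 writes it, so the graph has edges in both directions and the check fails. The serial schedule yields a single edge from T1 to T2 and passes.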

NoSQL: Compare and contrast the tradeoffs between using relational systems and NoSQL systems. Be able to describe how the major categories of NoSQL systems resemble and differ from one another. Be able to state the CAP Theorem and its implications.

Finally, note that the practice exercises in the textbook are a great study tool. You can practice all of these ideas by testing yourself against any of them; all of them have solutions online at the textbook website.