CS 334: Buffer Manager

Introduction

In this assignment, you will have to implement a prototype of a buffer manager for a database system.


Getting Started

Start by grabbing the files utils.py, bufmgr.py, and bufmgr_tester.py. Here's what you're getting:


The Buffer Manager Interface

You will have to implement a buffer manager that uses the clock replacement policy. The buffer manager interface that you will implement allows a program from higher levels of the database system to:

  1. Allocate and deallocate pages on disk.
  2. Bring a disk page to the buffer pool and pin it.
  3. Unpin a page in the buffer pool.

The list of the specific methods that you need to implement are contained in bufmgr.py.


Design Overview and Implementation Details

The buffer pool is a collection of frames (page-sized sequences of main memory bytes) that is managed by the Buffer Manager. It should be stored as a list of Frame objects. Within each frame, you should be storing a page, its page id, its filename, its pin count, and a dirty bit.

You'll need to be able to map a filename and a page id to a particular frame in your buffer pool. A Python dictionary should do this quite handily.

When a page is requested for pinning, the buffer manager should do the following: Check the buffer pool (by using the hash table) to see if it contains the requested page. If the page is not in the pool, it should be brought in as follows:

  1. Choose a frame for replacement, using the Clock replacement policy.
  2. If the frame chosen for replacement is dirty, flush it (i.e., write out the page that it contains to disk, using utils.py appropriately).
  3. Read the requested page (again, by using utils.py) into the frame chosen for replacement; the pin count and dirty bit for the frame should be initialized to 0 and false, respectively.
  4. Delete the entry for the old page from the Buffer Manager's hash table and insert an entry for the new page. Also, update the entry for this frame to reflect these changes.
  5. Pin the requested page by incrementing the pin count for this frame and return a pointer to the page to the requester.

The Clock Replacement Policy

There are many ways of deciding which page to replace when a free frame is needed. Commonly used policies in operating systems are FIFO, MRU and LRU. Even though LRU is a commonly used policy, it has a high overhead and is not the best strategy to use in a number of common cases that occur in database systems (e.g., repeated sequential scans of the same file). Instead, many database systems use the Clock algorithm which approximates LRU behavior with less overhead.

clock diagram
Clock Replacement Diagram

The above figure illustrates the execution of the clock algorithm. Conceptually, all the frames in the buffer pool are arranged in a circle around the face of a clock. Associated with each frame is a referenced flag. Each time a page in the buffer pool is unpinned, the referenced flag of the corresponding frame is set to true. Whenever we need to choose a frame for replacement, the current frame pointer, or the "clock hand" (an integer whose value is between 0 and poolSize-1), is advanced, using modular arithmetic so that it does not go past poolSize-1. This corresponds to the clock hand moving around the face of the clock in a clockwise fashion. For each frame that the clock hand goes past that is unpinned, the referenced flag is examined and then cleared. If the flag had been set, the corresponding frame has been referenced "recently" and is not replaced. On the other hand, if the referenced flag is false, the page is selected for replacement. If the selected buffer frame is dirty (i.e. it has been modified), the page currently occupying the frame is written to disk.

Note that when a frame is needed, the clock hand is advanced before the frame is considered for replacement. Otherwise, the frame most recently used will always be the first one checked, which is silly. You should initialize your clockhand appropriately so that frame 0 is the very first frame that your buffer pool uses.


Multiple parts

Part 1: Make some attempt at implementing the methods newPage and pinPage. They can be buggy. So long as you submit something that illustrates you clearly gave this an honest try, you'll get full credit for this part. Your main goal is to get your head into this so you can bring questions to class. Submit your work so far.

Part 2: Get the first test (i.e., test1) to run correctly. This means you'll need to get some version of newPage, pinPage, and unpinPage working. In order to pass test1, you don't need to get the clock algorithm working. ANY replacement policy will work. Use the first frame in the buffer pool the whole time, if you like.

Part 3: Get the rest of the methods in the buffer manager implemented, and get as many of the tests as you can to run correctly.


Handing in Your Code

If you are working in a team, only one of you should submit the code. Make sure that both of your names appear in program documentation at the top.

Zip up all of your code, and submit it via Moodle.

Good luck, have fun, and ask questions!