Minibase Assignment #3: Index Nested Loops Join
Assigned on Wednesday, 10/29.
Due on electronically on Wednesday, 11/5, by 4:45 PM.
Introduction
In this assignment, you will implement the index nested loops join
algorithm for equijoins. You will be given libraries for the
lower layers (Disk Space Manager, Buffer Manager, Heap File, and Hash
File), and some driver routines to test the code.
You should begin by reading Chapter 14 of the textbook,
particularly the
description of Index Nested Loops Join in Section 14.4.1.
Getting Started
Create a directory for the assignment, and change to that directory.
Copy all the files from /Accounts/courses/cs347/proj3/
to your current directory. The files include:
- Makefile: A Makefile to compile your project.
- index_nl.h: This file contains the specification for the
class
index_nested_loop, which you have to implement. The implementation
should
be in a file that is called index_nl.C. You are free to add
functions,
data members, or type definitions to this class.
- main.C, test_driver.C, INLJTester.C: Test programs. You
are
given four test cases in INLJTester.C. We will grade your
code
based on these four test cases plus three additional test cases.
test_driver.C and main.C contain driver code that runs
the
tests.
- correct_output: Output of a correct implementation.
What You Have to Implement
class index_nested_loop{
public:
index_nested_loop(
char* filename1, // Name of heap file for relation R
int len_in1, // # of columns in R
AttrType in1[], // Array containing field types of R
short t1_str_sizes[], // Array containing size of columns in R
int join_col_in1, // The join column of R
char* filename2, // Name of heap file for relation S
int len_in2, // # of columns in S
AttrType in2[], // Array containing field types of S
short t2_str_sizes[], // Array containing size of columns in S
int join_col_in2, // The join column of S
char* filename3, // Name of heap file for join results
IndexType in_type, // Index type for inner relation (S)
Status& s // Status of constructor
);
~index_nested_loop();
};
The index_nested_loop constructor joins two relations R and S,
represented by the heap files filename1 and filename2,
respectively, using the index nested loops join algorithm. The columns
for relations R
(S) are numbered from 0 to len_in1-1 (len_in2-1).
You should concatenate each matching pair of records and write the
concatenated records into the heap file filename3.
You will need to use the following classes, which are given:
HeapFile, Scan, StatHashFile and StatHashFileScan. You will call
StatHashFile
methods to build a static hash file index for the inner heap file S.
Then you
scan the outer heap
file R; for each tuple in the outer heap file R, retrieve the matching
tuples in S by scanning the hash file on the inner heap file S.
You can ignore the value of the input parameter in_type.
The parameter s is a return parameter. If all goes well,
the constructor will finish and return s==OK. If an error
occurs,
you should exit the constructor and return an error code in s.
The index_nested_loop destructor should destroy the static hash
index
that you created for S, in addition to the usual clean up of local and
temporary state.
Notes on Some of the Functions You Will Use
- HeapFile::insertRecord(char* recPtr, int recLen, RID&
outRid) inserts the record of length recLen pointed to
by recPtr into the heap file and passes back the newly
allocated RID of the record in outRid.
- HeapFile::getRecord(const RID& rid, char* recPtr,
int& recLen) takes in a RID, rid, and passes back
the record associated with rid in the character array pointed
to by recPtr. It
also passes back the length of the record in recLen. The
caller is responsible for allocating memory for recPtr.
- Scan::getNext(RID& rid, char* recPtr, int& recLen)
retrieves the next record in a sequential scan of a heap file. Passes
back the RID of the next record in rid, copies the actual
record into the array pointed to by recPtr, and passes back
the length of
the record in recLen. The caller is responsible for
allocating memory
for recPtr.
- StaticHashFile::insert(const void* key, const RID rid)
inserts the
pair (key,rid) into a static hash file.
- StaticHashFile::new_scan(const void* lo_key, const void*
hi_key)
creates a StaticHashFileScan object to retrieve RIDs from a
static hash file
and returns a pointer to this object. De-allocating this object is the
responsibility of the caller.
- IndexFileScan::get_next(RID& rid, void* keyptr)
retrieves the next (key,rid) pair from an index file scan.
Passes back the RID in rid and the key in the memory area
pointed to by keyptr . Allocating memory for keyptr
is the responsibility of the caller. For this assignment, the index
file scan we will be using is a StaticHashFileScan .
Error Reporting in the Code
As in previous project assignments, you should follow the
Minibase Error Protocol . The subsystem code for the
index_nested_loop class is JOINS.
Handing in Your Code
Use hsp to hand in the entire directory containing your code. Do not
submit the individual files, but rather submit the directory as a
whole. For
example, if I am sitting in the directory above proj3, I would execute
the following command:
hsp musicant CS347 proj3
Good luck!