CS 227, Winter 1997

Due Friday 1/31/97

Work in groups of two or three for this assignment.

For this assignment, you will implement several sorting algorithms and collect data on their running times. I want you to turn in your source code and a typewritten report. Your report should include the questions you are trying to answer, a discussion and justification of your data-collection strategy, your data (presented in whatever forms you feel appropriate, including tabular and/or graphical forms), and your interpretation of the data. In particular, you should discuss the extent to which your data agree with the predictions of the theory presented in Chapter 2 of our textbook.

Choose one of the following topics.

Choice 1: Sorting for small N

The question here is which sorting algorithms are best for small N's, and over what ranges of N they are best. In discussing improvements to Quicksort, Baase suggests that letting the recursion go all the way down to 1-item lists is inefficient, and that Insertion Sort, perhaps, should be used for lists smaller than size 10. Is she right? Is 10 the right cutoff? Does the same argument apply to Mergesort? Is Insertion Sort the right choice, or might one of the other N^2 sorts be better for small N?
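
To make the experiment concrete, here is a minimal sketch of one hybrid you might test: Quicksort that hands sublists at or below a cutoff to Insertion Sort. The CUTOFF constant, the helper names, and the last-element pivot are my own illustrative choices, not requirements of the assignment; the cutoff is exactly the knob your experiments would turn.

	#define CUTOFF 10	/* the knob to vary in your experiments */

	/* Sort key[lo..hi] by straight insertion. */
	static void InsertionSort( short *key, int lo, int hi )
	{
	    int i, j;
	    short tmp;

	    for (i = lo + 1; i <= hi; i++) {
	        tmp = key[i];
	        for (j = i - 1; j >= lo && key[j] > tmp; j--)
	            key[j + 1] = key[j];
	        key[j + 1] = tmp;
	    }
	}

	/* Quicksort key[lo..hi], switching to Insertion Sort for
	   sublists of CUTOFF or fewer items. */
	static void HybridQuickSort( short *key, int lo, int hi )
	{
	    int i, j;
	    short pivot, tmp;

	    if (hi - lo + 1 <= CUTOFF) {
	        InsertionSort( key, lo, hi );
	        return;
	    }
	    pivot = key[hi];	/* simple last-element pivot */
	    i = lo - 1;
	    for (j = lo; j < hi; j++) {
	        if (key[j] < pivot) {
	            i++;
	            tmp = key[i]; key[i] = key[j]; key[j] = tmp;
	        }
	    }
	    tmp = key[i + 1]; key[i + 1] = key[hi]; key[hi] = tmp;
	    HybridQuickSort( key, lo, i );
	    HybridQuickSort( key, i + 2, hi );
	}

	void Sort( short *key, int N )
	{
	    HybridQuickSort( key, 0, N - 1 );
	}

Setting CUTOFF to 1 gives you plain Quicksort, so a single implementation lets you compare cutoffs fairly.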

Don't just try to answer my list of questions (generated as it was off the top of my head late at night). Try to write code and collect data that will help you understand sorting for small values of N, and report your findings.

For this project, you should implement at least three sorting algorithms. I'd do Quicksort, Mergesort, and Insertion Sort for starters, but that's me. Make your own choices.

Choice 2: Sorting for big N

This is really an exercise in skepticism. The mathematical arguments made in Chapter 2 are convincing enough, given a bunch of assumptions (e.g., that the number of comparisons is a good measure of running time), and I would be one of the first to sing the praises of mathematical analysis. Still, when it comes to choosing an algorithm for a critical piece of software, I want to see my choice of sorting algorithm running fast in practice, when all the messy details of implementation and platform are cluttering the complexity picture. I fully expect the running times to fall in line with what the theory predicts, but let's try it out anyway.

What I want you to do if you choose to study large N's is implement at least three algorithms, including Radix Sort and one of the N lg N sorts (Quicksort, Mergesort, or Heapsort). Collect timing data in whatever fashion you deem appropriate, but make sure to include data reflecting the effects of platform choice and compiler optimization. The -O flag for the Unix C/C++ compilers turns on optimization. We have GNU C compilers on Intel systems (Pentium and 486 in CMC 306, Pentium Pro in several faculty offices), on MIPS systems (the SGIs in CMC 307), and on Motorola 68040 systems (the black NeXTs).
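
If it helps, here is a bare-bones sketch of one way to collect a single timing, using the standard clock() function from <time.h>. The TimeSort name and the random fill are my own choices, not part of the assignment. Note that clock() measures CPU time at fairly coarse resolution, so for modest N you may need to time many repetitions of the sort and divide.

	#include <stdlib.h>
	#include <time.h>

	/* Time one call to a sorting function on N random shorts.
	   Returns elapsed CPU seconds. A sketch, not a required design. */
	double TimeSort( void (*sort)(short *, int), int N )
	{
	    short *key = (short *) malloc( N * sizeof(short) );
	    int i;
	    clock_t start, stop;

	    for (i = 0; i < N; i++)
	        key[i] = (short) (rand() % 32768);

	    start = clock();
	    sort( key, N );
	    stop = clock();

	    free( key );
	    return (double) (stop - start) / CLOCKS_PER_SEC;
	}

Whether you time random, sorted, or reverse-sorted data is part of your data-collection strategy, so say so in your report.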

A few more details

Your sorting routines should sort arrays of integers. Each of your sorting functions should have an interface of the form


	  void Sort( int *key, int N ),
or
	  void Sort( short *key, int N ),
where the array key[] contains the integers to be sorted, and N is the number of integers to be sorted. I recommend short integers because they are implemented as two-byte quantities on all of our department's platforms, so you won't have to worry about whether your data are being messed up by the difference between 16-bit and 32-bit integers.
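
As one concrete instance of the recommended interface, here is a Selection Sort sketch (the function name is mine; you would presumably give each of your sorts its own name so they can live together in sorts.c):

	/* Selection Sort: repeatedly swap the smallest remaining item
	   into the front of the unsorted part of key[]. */
	void SelectionSort( short *key, int N )
	{
	    int i, j, min;
	    short tmp;

	    for (i = 0; i < N - 1; i++) {
	        min = i;
	        for (j = i + 1; j < N; j++)
	            if (key[j] < key[min])
	                min = j;
	        tmp = key[i]; key[i] = key[min]; key[min] = tmp;
	    }
	}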

By keeping a consistent interface, you'll simplify your data-collection code, and you'll factor out function-calling overhead from some of your comparisons between algorithms (though, of course, function-calling overhead is a real difference between recursive and non-recursive algorithms). If you want to modify the interface, go ahead, but I recommend keeping it simple.
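
Here is a sketch of one payoff of the consistent interface (all the identifiers below are my own inventions, and TimeSort is the timing sketch from above): put your sorts in a table of function pointers, and a single loop can run every algorithm at every list size.

	#include <stdio.h>

	/* Your actual sorts, all sharing the same interface. */
	void QuickSort( short *key, int N );
	void MergeSort( short *key, int N );
	void InsertSort( short *key, int N );

	double TimeSort( void (*sort)(short *, int), int N );

	struct SortEntry {
	    const char *name;
	    void (*sort)(short *, int);
	};

	static struct SortEntry table[] = {
	    { "Quicksort",      QuickSort  },
	    { "Mergesort",      MergeSort  },
	    { "Insertion Sort", InsertSort },
	};

	/* One loop collects data for every algorithm at every size. */
	void RunExperiments( const int *sizes, int nSizes )
	{
	    int i, j;

	    for (i = 0; i < (int) (sizeof(table) / sizeof(table[0])); i++)
	        for (j = 0; j < nSizes; j++)
	            printf( "%s, N = %d: %f seconds\n", table[i].name,
	                    sizes[j], TimeSort( table[i].sort, sizes[j] ) );
	}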

Turn in your code using HSP. Put your sorting routines in a file named sorts.c (or sorts.cc) and your main() and data-collection code in a file named sortmain.c.

Do your best, but don't consider the open-endedness of this assignment to be a mandate for exhaustive 50-page reports. State your questions, present your data, draw your conclusions, and be done with it. The evaluation of implemented algorithms is hard stuff. This is an opportunity for you to get introduced to it.

If you can, please print your reports 2-up.

Start early, have fun, and keep in touch.



Jeff Ondich, Department of Mathematics and Computer Science, Carleton College, Northfield, MN 55057
(507) 646-4364, jondich@carleton.edu