CS127 Assignment: Timing sorting algorithms

Due 1:10PM Friday, 2/20/04. Submit your analysis, pictures, and code on paper.

The basic idea

For this assignment, you'll collect and discuss timing data for a few sorting algorithms. Simply put, your goal is to argue from empirical evidence that, for example, Insertion Sort's average running time is approximately proportional to N^2, Heapsort's is approximately proportional to N*log_2(N), etc. You should make these arguments by collecting timing data, displaying it graphically along with graphs of functions such as A*N^2 and B*N*log_2(N) with suitably chosen A's and B's, and discussing the results.

To time a program, you can use the timing structure in the main method in Sorter.java. Alternatively, you can use the Unix time command. For example, here's what happened when I commented out the Java timing code and then compiled and ran Sorter.java with insertionSort active:


atanasoff> time java Sorter 10000
5.670u 0.040s 0:05.75 99.3%     0+0k 0+0io 1299pf+0w

The 5.670u tells you that your program used 5.67 seconds of "user time," which is the time during which your own code was running. "System time" is the time during which system code resulting from input/output statements, etc. The third field, the 0:05.75 is the actual time by the clock on the wall between the time you launched the program and the time it stopped running. This is often significantly different from the user time, since other programs are running on the computer, too. If you use the time command for this assignment, you should report user times.

There are many software packages with which you can graph data and functions (e.g. Mathematica, SPSS, Excel, AppleWorks,...). If you need help producing graphs for this assignment, feel free to talk to me.

The specifics

  1. Comment out the insertionSort call in main in Sorter.java. For the N's you use in the later parts of this assignment, find out how long the program infrastructure (notably the shuffle method) takes to run. You may wish to subtract these times from your timings of the sorting routines.

  2. Using the insertionSort method, argue that Insertion Sort's average running time is approximately proportional to N^2.

  3. Use insertionSort again, but now determine its (empirical) time complexity if the initial array is already sorted. Explain your observations.

  4. Implement both Heapsort and Quicksort, and collect run-time data on both of them. From your data, argue that both of these algorithms have average running time approximately proportional to N*log_2(N).

  5. Collect timing data for your binary search tree sorting from the most recent programming assignment. How does it perform?

  6. Give an overall comparison of the sorting algorithms you studied in this assignment. Which ones would you use, under which circumstances?

Have fun, start early, and keep in touch.





Jeff Ondich, Department of Mathematics and Computer Science, Carleton College, Northfield, MN 55057, (507) 646-4364, jondich@carleton.edu