CS127 Final Project

Due 5:00PM Wednesday, 3/14/01.

You may work in pairs for this assignment.

What to hand in

By 5:00 Wednesday, March 14, submitted using HSP.

Grading Criteria

Project Option #1: Changelings

In the crossword puzzle books I buy, there is a kind of word puzzle called a Changeling. A changeling (in this context, anyway) is a pair of words with the same number of letters. To solve the changeling, you show a sequence of words, each differing from the previous word by a single letter, that transforms the first word of the changeling into the second.

For example, one solution to the changeling "CAT to DOG" is "CAT, COT, DOT, DOG." The intermediate words must, of course, be words (and in fact, my crossword books say "words beginning with a capital letter, slang, obsolete words, and contractions are not allowed"). Note that in this example, we have the smallest possible number of words, since CAT and DOG differ in three letter positions, and there are three letter transitions in this solution. In a contest of changeling solutions, short solutions win. (Actually, short and clever solutions would be better than the short and dull solutions that are most common, but I won't ask you to write a program to judge cleverness.)

For this project, you will write a program that uses a dictionary to compute solutions to changelings.

To make your changeling programs uniform (so it's easy for me to test them all in the same way), I want you to implement the following command-line interface. The user should be able to run your program in one of two ways. First,


changeling dictionaryfile firstword secondword
should run compute a solution to the changeling starting at firstword and going to secondword using the indicated dictionary file. Second,

changeling dictionaryfile changelingfile
should compute solutions to the changeling word pairs that appear, one space-delimited pair per line, in changelingfile. If the user runs the program with some number of command-line arguments other than two or three, the program should print an appropriate usage message and exit.

So, how are you going to compute changeling solutions? There are undoubtedly many ways to solve this problem, but here is one way. Build a graph out of the dictionary file where each word is a vertex of the graph and two words/vertices are connected if they are the same length and differ by only one letter. Then use a breadth-first search starting at the first word until you find the second word. Then print out the path, if any, that you discovered in this way.

Building the graph can be done more efficiently than by comparing every pair of words (an Omega(N^2) process that would take a long time on an 80,000-word dictionary). How to do so is part of what you need to figure out.

Think about this problem before running to the computer to start coding. It's not easy, but the program should be lots of fun to play with when you're done.

Project Option #2: C++ Profiler

A profiler is a program that collects data for you about the run-time behavior of your programs. For example, a "line-count" profiler will print out your source code with a count of the number of times each line of code was executed during a run of the program. Other profilers print out the total time consumed by the running of various portions of your program.

Profilers are most frequently used to help programmers improve their programs' performance. By first finding out what portions of your code get executed most frequently and for the longest total time, you can concentrate your fine-tuning efforts where they will have the greatest effect. It also turns out that profilers can help in tracking down tricky bugs. You can find an interesting discussion of profilers in More Programming Pearls, Confessions of a Coder, by Jon Bentley, Addison-Wesley 1988.

For this project, your job is to write a program that will take C source code as input, and produce C source code as output. The output source, when compiled and run, will perform its expected task, but will also dump profiling information to a file.

Your profiler should report the number of times each of your functions (including main()) gets called, and the total time consumed by each function.

If you get your timing and function-call-counting done, you might try to implement line-count profiling.

You could also modify your timing reports to take into account the fact that functions call other functions. For example, suppose main() calls bigfunction(), which in turn calls littlefunction(), and none of them calls any other functions. If the entire program takes 2 seconds to run and bigfunction() takes 1.5 seconds to run, and littlefunction() takes .2 seconds to run, you could report:


main:  2 s
bigfunction: 1.5 s
littlefunction: .2 s
On the other hand, it might be more useful to report

main:  .5 s
bigfunction: 1.3 s
littlefunction: .2 s
This way, you are reporting the time each function spends running, not including the time the functions it calls take to run. The total run-time of the program will thus be the sum of the run-times of the separate functions.

Advice

Plan your program on paper. Do not start your work at the computer. For relatively large programs like these, forethought can save you a lot of time.

Plan the order in which you are going to implement the features of your program. You need a list of code segments to write, along with ways to test them. If you design a sequence of small, testable steps for your development, you'll be much more likely to finish your program successfully than if you try to write large blocks of code before testing anything. Planning ahead may seem the stodgy way to go, but it's also more successful in the long run (and, usually, in the short run, too).

Ask questions. I will be very happy to help you when you're stuck. Pride and a two-week deadline don't mix very well.

Start early, have fun, and keep in touch.



Jeff Ondich, Department of Mathematics and Computer Science, Carleton College, Northfield, MN 55057, (507) 646-4364, jondich@carleton.edu