CS 111: Introduction to Computer Science

Collecting weather statistics

Due by 11:10AM Monday, January 31. Hand in as: weatherstats.py

When you write a function, you are giving a name to a collection of operations. We do this in non-programming life all the time. "Make breakfast" is a name for a complex sequence of operations, most of them conditional ("if the milk hasn't gone bad, get the bowl out for cereal, otherwise if there is bread, get out the toaster"). In a computer program, writing a function enables you to name a complex operation and reuse it anywhere else in your program--even in another function. By naming operations in this way, we can decompose very complex problems into simple parts, which enables us to think more effectively about the problems.

In this assignment, you will write some functions and then use those functions to solve a problem involving weather data.

The program: basic version

Your program, called weatherstats.py, will read data from this file of daily high temperatures recorded at the Minneapolis-St. Paul airport since 1939, with Feb 29 removed in all leap years, and produce a report summarizing some of the interesting features of the temperature data. Your program should accept the name of the data file as a command-line argument, and then produce output looking something like this:

    $ python weatherstats.py msp-temperatures.txt
    High temperature data (Fahrenheit)
    Year    Mean    Std Dev   Max   Min
    1939    52.74   24.47      98    -1
    1940    53.89   23.03     103    -7
    ...
    2008    53.28   25.95      94    -4
    Highest mean temperature: 53.22 (1954)
    Lowest mean temperature: 55.87 (1989)
    Highest standard deviation: 24.34 (1972)
    Lowest standard deviation: 21.03 (2003)
    Highest temperature: 107 (1963)
    Lowest temperature: -43 (1945)

Note: My numbers in the example above are fictional. I made them up to demonstrate the kinds of display I wanted to see. Your results will be different from these.
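If you have not worked with command-line arguments before, here is a minimal sketch of one way to get at the file name. The helper name getFileName and the wording of the usage message are made up for illustration; the assignment does not require either of them.

```python
import sys

def getFileName(argumentList):
    '''Returns the first command-line argument after the program name,
       or None if no argument was given. (getFileName is a hypothetical
       helper, not a required part of the assignment.)'''
    if len(argumentList) < 2:
        return None
    return argumentList[1]

if __name__ == '__main__':
    # sys.argv[0] is the program name; sys.argv[1] is the first argument.
    fileName = getFileName(sys.argv)
    if fileName is None:
        print('Usage: python weatherstats.py datafile')
    else:
        print('Reading temperatures from ' + fileName)
```

Checking for a missing argument up front lets the program print a helpful message instead of crashing with an IndexError.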

Though you are free to implement much of your program however you wish, I want you to start by writing the following two functions:

    def mean(listOfNumbers):
        '''Returns the mean of the specified list of numbers if the list
           contains at least one element. Otherwise, returns 0. This function
           assumes that its parameter refers to a (possibly empty) list
           of numbers.'''
        pass

    def standardDeviation(listOfNumbers):
        '''Returns the standard deviation of the specified list of numbers
           if the list contains 2 or more elements. Otherwise, returns 0.
           This function assumes that its parameter refers to a (possibly
           empty) list of numbers.'''
        pass

Note that the Python instruction "pass" is just a placeholder for the function body; it does nothing at all. You will replace the "pass" in each case with the actual code required by these two functions.

The idea here is that if you write these two simple functions, your main program will be able to exploit these functions to perform the overall task of the program. Of course, you are free to define additional functions if you find it useful to do so. But you should definitely implement the functions described above, and use them in your main program where appropriate.
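To make the two docstrings concrete, here is one possible reading of them, offered as a sketch rather than as the required solution. It assumes the sample standard deviation (dividing by n - 1), which is a guess based on the rule that lists with fewer than 2 elements return 0; if the population version (dividing by n) is intended, only the last line changes.

```python
import math

def mean(listOfNumbers):
    '''Returns the mean of the list, or 0 if the list is empty.'''
    if len(listOfNumbers) == 0:
        return 0
    # float() keeps the division from truncating in older Pythons.
    return float(sum(listOfNumbers)) / len(listOfNumbers)

def standardDeviation(listOfNumbers):
    '''Returns the standard deviation of the list, or 0 if the list has
       fewer than 2 elements. ASSUMPTION: this is the sample standard
       deviation, which divides by n - 1.'''
    n = len(listOfNumbers)
    if n < 2:
        return 0
    average = mean(listOfNumbers)
    sumOfSquares = sum((x - average) ** 2 for x in listOfNumbers)
    return math.sqrt(sumOfSquares / (n - 1))
```

Notice that standardDeviation calls mean instead of recomputing the average itself. That is exactly the kind of reuse that naming an operation makes possible.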

The raw data for this exercise came from the United States Historical Climatology Network.

The program: the A version

The program described above, written correctly and with good organization, will be worth a grade of B. A notable feature of that version of the program is that you can complete it in a line-by-line fashion. That is, you can read one line from the file, do some computations using your mean and standardDeviation functions, produce some output, and repeat. The program described here, however, is best handled by first loading all of the temperature data into appropriate list variables within your program, and then looping over those lists.
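The line-by-line approach for the basic version might be sketched as follows. This assumes each line holds a year followed by comma-separated temperatures; the helper name parseLine and the two sample lines are invented for illustration, and in the real program the lines would come from the data file.

```python
def parseLine(line):
    '''Splits one comma-separated line into a year and a list of integer
       temperatures. (parseLine is a hypothetical helper.)'''
    fields = line.strip().split(',')
    year = int(fields[0])
    temperatures = [int(field) for field in fields[1:]]
    return year, temperatures

# Two made-up lines standing in for lines read from the data file.
sampleLines = ['1939,-10,5,3,14\n', '1940,2,8,-1,4\n']

for line in sampleLines:
    year, temperatures = parseLine(line)
    # Each line is handled on its own: compute, print, move on.
    print('%d %d %d' % (year, max(temperatures), min(temperatures)))
```

Nothing from one line needs to be remembered when the next line is processed, apart from whatever running "highest so far" and "lowest so far" values the summary at the end requires.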

If you choose to work on this version (and thus have a shot at an A for the assignment), your finished program should produce the report shown above, followed by a day-by-day report like so:

Day Mean Std Dev Max Min 1 8.26 6.50 44 -12 2 9.01 6.38 45 -14 ... 365 8.76 7.40 38 -8

That is (in this fictional example), the highest high temperature on New Year's Day was 44, the lowest high on New Year's Day was -12, etc.
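The day-by-day report needs, for each day, that day's temperature from every year. Assuming the data is held as one list per year, with the year first and the temperatures after it, one way to regroup it is sketched below. The function name temperaturesByDay and the tiny three-day sample are made up for illustration.

```python
def temperaturesByDay(weatherData):
    '''Given a list of rows of the form [year, temp1, temp2, ...],
       returns a list with one entry per day, where each entry is the
       list of that day's temperatures across all years.
       (temperaturesByDay is a hypothetical helper.)'''
    # Drop the leading year from each row.
    temperatureRows = [row[1:] for row in weatherData]
    # zip(*rows) groups the first items together, then the second, etc.
    return [list(column) for column in zip(*temperatureRows)]

# A made-up data set: three years, three days each.
sampleData = [[1939, -10, 5, 3],
              [1940, 2, 8, -1],
              [2008, 7, 12, 3]]

byDay = temperaturesByDay(sampleData)
# byDay[0] holds the Day 1 temperatures from every year.
print('%d %d %d' % (1, max(byDay[0]), min(byDay[0])))
```

Be aware that zip stops at the shortest row, so this sketch relies on every year having the same number of days, which is why it matters that Feb 29 has been removed from the data.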

One way to approach this problem is by loading all your data into a list of lists, as described in this sample function.

    def loadWeatherData(fileName):
        '''Loads temperature data from the specified file into a list
           of lists of integers, and returns the list. This function
           assumes that each line of the file consists of a year followed
           by some number of temperatures, all separated by commas.
           For example, if the file looks like this:

               1939,-10,5,3,14,...
               1940,2,8,-1,4,...
               ...
               2008,7,12,3,5,-2,...

           (which means that on Jan 1, 1939, the high temperature
           was -10, etc.), then loadWeatherData will return a list
           of lists like this:

               [[1939,-10,5,3,14,...], [1940,2,...], ..., [2008,7,12,...]]
        '''
        pass
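Here is one way the body of loadWeatherData could be filled in. Treat it as a sketch under the docstring's assumptions (one year per line, fields separated by commas, every field an integer), not as the only acceptable version. Skipping blank lines is an extra precaution of this sketch, not something the docstring promises.

```python
def loadWeatherData(fileName):
    '''Loads temperature data from the specified file into a list
       of lists of integers, and returns the list.'''
    weatherData = []
    with open(fileName) as dataFile:
        for line in dataFile:
            line = line.strip()
            # ASSUMPTION: blank lines (e.g. at the end of the file)
            # carry no data and can be skipped.
            if line == '':
                continue
            row = [int(field) for field in line.split(',')]
            weatherData.append(row)
    return weatherData
```

With the data loaded this way, weatherData[0][0] is the first year in the file and weatherData[0][1:] is the list of that year's temperatures.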

Some suggestions

We will, of course, discuss this program in class on Wednesday. Have fun!