Assignment 4 - Parsing Data

Due: Thursday, October 9, 2025, at 10pm

You may work alone or with a partner, but you must type up the code yourself. You may also discuss the assignment at a high level with other students. You should list any student with whom you discussed each part, and the manner of discussion (high-level, partner, etc.) in a comment at the top of each file. You should only have one partner for an entire assignment.

You should submit your assignment on Gradescope. See below for which files to submit.


Getting started:

You will work in two different files for this assignment. Additionally, you will need some data files.

  • Part 1: skeleton code and data file. You should save the code file as shapeParser.py and the data file as shapes.csv, both in a folder with graphics.py.
  • Part 2: skeleton code and data file. You should save the code file as weatherParser.py and the data file as weatherData.csv.

Goals

The primary goal for this assignment is to give you practice parsing CSV files. A secondary goal is to get you started thinking about how we can work with the data we use.


Parts of this assignment:


Comments and collaboration

# You should be equipped to complete this part right away.

As with all assignments in this course, for each file in this assignment, you are expected to provide top-level comments (lines that start with # at the top of the file) with your name and a collaboration statement. For this assignment, you have multiple programs; each needs a similar prelude.

You need a collaboration statement, even if just to say that you worked alone.


Note on style:

The following style guidelines are expected moving forward, and will typically constitute 5–10 points of each assignment (out of 100 points).

  • Variable names should be clear and easy to understand, should not start with a capital letter, and should only be a single letter when appropriate (usually for f as a file, i, j, and k as indices, potentially for x and y as coordinates, and maybe p as a point, c for a circle, r for a rectangle, etc.).
  • It’s good to use empty lines to break code into logical chunks.
  • Comments should be used for anything complex, and typically for chunks of 3–5 lines of code, but not every line.
  • Don’t leave extra print statements in the code, even if you left them commented out.
  • Make sure your code doesn’t reach the right answer by doing extra work (e.g., leaving a computation inside a for loop when it could have run once, after the loop).
  • Avoid long runs of nearly identical lines of code that could have been written as a loop.

Note: The example triangle-drawing program on pages 108–109 of the textbook demonstrates a great use of empty lines and comments, and has very clear variable names. It is a good model to follow for style.


Part 1: Parsing a CSV file into shapes

# You should already be fully equipped to complete this part (it’s primarily based on Lesson 9—Friday Oct. 3).

In this part, you will complete the implementation of a program that parses a .csv file into a window of shapes. An example is in shapes.csv:

window,800,600

circle,600,200,40,blue
triangle,700,200,700,300,600,100,red
rectangle,50,60,200,400,yellow
square,40,40,30,magenta

circle,60,400,25
triangle,400,200,400,300,300,400
rectangle,770,500,650,250
square,500,400,100

As you can see, most lines of the file contain either the window information or information for a single shape, and a shape may or may not have a color specified. Some lines are empty, and your code should simply ignore them (but not crash when they are encountered).
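
As a rough sketch of the "ignore empty lines" idea (this is not part of the skeleton code, and the name splitNonBlankLines is made up for illustration), here is one way to turn raw lines into lists of strings while skipping blank ones. It works on a list of strings rather than an open file, so that it runs on its own.

```python
def splitNonBlankLines(lines):
    """
    Returns a list of lists of strings, one per non-blank line.
    (Hypothetical helper, for illustration only.)
    """
    rows = []
    for line in lines:
        stripped = line.strip()  # removes the trailing newline and any spaces
        if stripped != "":       # a blank line strips down to the empty string
            rows.append(stripped.split(","))
    return rows

sample = ["window,800,600\n", "\n", "circle,60,400,25\n"]
rows = splitNonBlankLines(sample)
print(rows) # prints: [['window', '800', '600'], ['circle', '60', '400', '25']]
```

Note that every value in the result is still a string, including the numbers.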

We talked in class about top-down design for the earthquake-plotting program. An alternative is bottom-up design. In bottom-up design, you determine the functions you need to complete the smaller-scale tasks, and then determine how to connect them together.

Note: there are useful libraries that can help you parse CSV files, but you should not use them for this assignment. I want to be sure that you can parse the file on your own first.

Part a: complete shape-parsing functions

For this first subpart, you should complete the implementations of the four shape-parsing functions:

  • parseCircle
  • parseSquare
  • parseRectangle
  • parseTriangle

Each function should take in a list of information about the shape, including a possible color. The specific list varies for each shape type. Your functions should create the relevant shape object from graphics.py, fill it in (if the provided list contains a color string), and return it. These functions have a return value (a shape object), but no side effect. They should not draw their shape.

Pay careful attention to the expected input types in the docstring for each function. For example, for a circle, note that it takes in a list of strings, and that list should have either three or four values in it:

def parseCircle(vals):
    """
    Creates a Circle object based on the values given in vals.

    Assumes vals is either a list [x,y,r,color] of strings,
    or [x,y,r].

    x: x-coordinate of center point (expect an int in the str)
    y: y-coordinate of center point (expect an int in the str)
    r: radius (expect a float in the str)
    color: optional color (str)

    Returns: the Circle object
    """
    # TODO: Part 1a
    return Circle(Point(0,0),0) # replace with your code
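
The "optional last value" pattern in that docstring can be sketched without graphics.py. The example below is only an illustration under made-up names (parseLabeledPoint is not one of the assignment's functions): it converts the numeric strings and checks the length of the list to decide whether a color was given.

```python
def parseLabeledPoint(vals):
    """
    Assumes vals is either [x, y, size, color] or [x, y, size],
    all strings. (Hypothetical example, not part of the assignment.)

    Returns: x (int), y (int), size (float), and color (str or None)
    """
    x = int(vals[0])
    y = int(vals[1])
    size = float(vals[2])

    # The color is only present when there is a fourth value
    color = None
    if len(vals) == 4:
        color = vals[3]

    return x, y, size, color

print(parseLabeledPoint(["600", "200", "40", "blue"])) # prints: (600, 200, 40.0, 'blue')
print(parseLabeledPoint(["60", "400", "25"]))          # prints: (60, 400, 25.0, None)
```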

Part b: putting the pieces together

All four of these functions are needed to parse a file into a list of shapes and a GraphWin object. An early step in software design is writing pseudocode: a plan for the code that leaves out the actual syntax (so no colons, function calls, etc.). The function parseShapesFromFile needs to read in the lines of the CSV file and parse each line (at least, each one that isn’t blank) into a shape. The pseudocode for this function is given to you in comments:

def parseShapesFromFile(filename):
    """
    Opens the file specified by filename for reading, and parses
    the window dimensions and the shapes.

    Returns: the GraphWin object and a list of shapes
    """
    # Specify variables for the window and the shape list
    # Open the file for reading
        # Go through each line in the file
            # Skip over any empty lines
            # Split the line into strings on the commas
            # Use the first string in the line to determine which object to create
    # Return the window and shapes list
    
    # TODO: Part 1b
    pass # replace with your code

Complete this function. Keep the pseudocode comments in place as you fill in the code: they serve as the comments for this function, so you don’t need to write your own, and they will hopefully help you as you code!

The parseShapesFromFile function returns both a window and a list of shapes. It should not draw any shapes.

Note that this function requires you to return two values, which we haven’t seen before. Here is a short example. (Catching the two values when you call the function works just like unpacking a tuple!)

def f(x, y):
    """
    Returns the sum and difference of x and y.
    """
    plus = x + y
    minus = x - y
    return plus, minus # secretly it returns the pair as a tuple

def main():
    total, diff = f(10, 7) # total is plus, diff is minus
    print(total) # prints: 17
    print(diff)  # prints: 3

main()

Part c: drawing the shapes

One of the beautiful things about the shapes in graphics.py is that they all have a draw method. For a variable shape that represents any type of shape object, you can call shape.draw(win) to draw that shape in the GraphWin object win. This is called polymorphism.
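
Here is a small sketch of the same idea that runs without graphics.py. The classes FakeCircle and FakeSquare are made up for illustration, and they use class definitions, which may be ahead of where we are in the course. Their draw methods just record a name in a list instead of drawing in a window. The point is that the loop never needs to know which kind of object it has.

```python
class FakeCircle:
    def draw(self, log):
        log.append("circle")

class FakeSquare:
    def draw(self, log):
        log.append("square")

drawn = []
shapes = [FakeCircle(), FakeSquare(), FakeCircle()]

# The same method call works for every object in the list
for shape in shapes:
    shape.draw(drawn)

print(drawn) # prints: ['circle', 'square', 'circle']
```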

To see this for yourself, you need to add a couple of lines of code to main to actually draw the shapes to the window.

def main():
    # Read in the provided file and parse it into a window and shapes
    filename = "shapes.csv"
    win, shapes = parseShapesFromFile(filename)

    # Draw each shape (in order) in the window
    # Part 1c
    pass # TODO: replace with your code

    # Wait to close until the user clicks
    win.getMouse()

Here is the output you should see for the example file:

<image: shapes>


Part 2: Visualizing weather data

# You should be fully equipped to complete most of this part after Lesson 9 (Friday Oct. 3), with just the very end needing Lesson 10 (Monday Oct. 6).

Part a: parsing the data

I downloaded a bunch of weather data from NOAA’s website. It is provided to you in a CSV file called weatherData.csv.

Disclaimer: All of the data is as-provided by NOAA, except that I changed the station name to not include a comma.

Here are the first few lines of the file:

NAME,DATE,PRCP,SNOW,TMAX,TMIN
MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/1/2023,0,0,35,22
MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/2/2023,0.02,0.1,27,22
MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/3/2023,0.65,6,31,24
MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/4/2023,0.61,8.8,33,30
MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/5/2023,0.01,0.2,30,18

For the first task of this part, you should implement the parseData function. For each date in the file, this function should grab the date string, the precipitation and snow amounts (as floats), and the minimum and maximum temperatures (as ints), and then return all of this data as five lists.

Be careful with ordering: each line has the maximum temperature before the minimum temperature.

def parseData(filename):
    """
    Opens the CSV file with name filename for reading, and reads in the data.

    filename: string
    returns: five lists:
      - one of dates (strings)
      - one of precipitation (floats)
      - one of snow (floats)
      - one of minimum temps (ints)
      - one of maximum temps (ints)
    """
    dates = []
    precip = []
    snow = []
    minTemps = []
    maxTemps = []

    # TODO: Part 2a
    # your code here

    return dates, precip, snow, minTemps, maxTemps
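
As a sketch of the conversions involved (not the full function), here is what pulling typed values out of a single data line could look like. The variable names are just for illustration. Note that the first line of the file is a header (NAME,DATE,...), and calling int or float on a string like "TMAX" raises a ValueError, so that line needs to be handled differently from the data lines.

```python
line = "MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/3/2023,0.65,6,31,24\n"

fields = line.strip().split(",")

date = fields[1]           # dates stay as strings
precip = float(fields[2])
snow = float(fields[3])    # float("6") gives 6.0
maxTemp = int(fields[4])   # TMAX comes before TMIN in the file
minTemp = int(fields[5])

print(date, precip, snow, maxTemp, minTemp) # prints: 1/3/2023 0.65 6.0 31 24
```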

Part b: processing the data

Given the data you parsed from the CSV file in Part a, you should be able to compute some statistics. Fill in the following function definitions:

def getLowestTemp(dates, temps):
    """
    Finds the date and temperature corresponding to the lowest
    temperature across a date range.

    dates: list of strings
    temps: list of ints

    Returns: the date (string) of the lowest temperature, and that temperature (int)
    """
    # TODO: Part 2b
    return lowestDate, lowestTemp

def getHighestTemp(dates, temps):
    """
    Finds the date and temperature corresponding to the highest
    temperature across a date range.

    dates: list of strings
    temps: list of ints

    Returns: the date (string) of the highest temperature, and that temperature (int)
    """
    # TODO: Part 2b
    return highestDate, highestTemp

Here is how these functions should work on a small, made-up data set:

dateList = ["1/2/2025", "2/5/2025", "7/7/2025"]
tempList = [10, 4, 92]

d1, t1 = getLowestTemp(dateList, tempList)
d2, t2 = getHighestTemp(dateList, tempList)

print(d1, t1) # prints: 2/5/2025 4
print(d2, t2) # prints: 7/7/2025 92
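
The underlying pattern is walking two parallel lists by index, so that position i in one list matches position i in the other. Here is a sketch of that pattern on unrelated, made-up data (the function getCheapest is hypothetical). It assumes both lists are the same length and not empty.

```python
def getCheapest(items, prices):
    """
    Returns the name (string) and price (int) of the cheapest item.
    Assumes items and prices are non-empty lists of the same length.
    """
    cheapestItem = items[0]
    cheapestPrice = prices[0]

    # Compare every later price against the best one seen so far
    for i in range(1, len(prices)):
        if prices[i] < cheapestPrice:
            cheapestItem = items[i]
            cheapestPrice = prices[i]

    return cheapestItem, cheapestPrice

name, price = getCheapest(["pen", "cup", "hat"], [3, 1, 12])
print(name, price) # prints: cup 1
```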

Part c: a better user experience

# You should be fully equipped to complete this part after Lesson 10 (Monday Oct. 6).

Finally, let’s make this temperature analysis interactive! Complete the following function:

def outputResults(dates, minTemps, maxTemps):
    """
    Asks the user which temperature (mins or maxes) to report
    metrics (highest/lowest) about.  Keeps asking until
    the user enters "done".

    dates: a list of strings
    minTemps: list of ints
    maxTemps: list of ints

    Returns: None
    """
    # TODO: Part 2c
    pass # replace with your code

Here is an example interaction with this program:

Welcome! You can learn about temperatures in 2023.

Would you like more info (type 'done' to quit)? yes
Which temps ('lows'/'highs')? lows
Which metric ('min'/'max')? min
The minimum low temperature in 2023 was -13 Fahrenheit on 2/3/2023.

Would you like more info (type 'done' to quit)? yes
Which temps ('lows'/'highs')? highs
Which metric ('min'/'max')? min
The minimum high temperature in 2023 was 2 Fahrenheit on 1/30/2023.

Would you like more info (type 'done' to quit)? done
Good bye!
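
The "keeps asking until the user enters done" behavior is a loop that stops on a special input value. Below is a minimal sketch of that loop shape on its own. The function countRequests and its prompt text are made up for illustration, and it only counts answers rather than reporting temperatures.

```python
def countRequests():
    """
    Keeps prompting until the user types 'done'.

    Returns: the number of answers given before 'done' (int)
    """
    count = 0
    answer = input("More info (type 'done' to quit)? ")

    # Ask again at the end of each pass, so the condition sees the newest answer
    while answer != "done":
        count = count + 1
        answer = input("More info (type 'done' to quit)? ")

    return count
```

Calling countRequests() and typing yes, yes, done would return 2.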

Reflection

# You should be equipped to complete this part after finishing your assignment.

Were there any particular issues or challenges you dealt with in completing this assignment? How long did you spend on this assignment? Write a brief discussion (a sentence or two is fine) in your readme.txt file.


Grading

This assignment will be graded out of 100 points, as follows:

  • 5 points - submit all files to Gradescope with correct names

  • 6 points - all code files contain top-level comments with file name, purpose, and author names

  • 4 points - each code file’s top-level comments contain a collaboration statement

  • 10 points - code style enables readable programs

  • 16 points - parseCircle, parseSquare, parseRectangle, and parseTriangle functions correctly create and return shape objects (Part 1a)

  • 10 points - parseShapesFromFile function correctly parses a .csv file and returns a window and list of shape objects (Part 1b; should call functions from Part 1a)

  • 4 points - main function in shapeParser.py correctly draws the shapes returned from parseShapesFromFile (Part 1c)

  • 15 points - parseData function correctly parses a valid .csv weather data file and returns the lists of dates, precipitation, snow, and min/max temperatures (Part 2a)

  • 10 points - getLowestTemp and getHighestTemp functions return the correct date and temperature (Part 2b)

  • 15 points - outputResults function uses a loop (10 pts) to let the user request certain metrics, and displays them to the user (5 pts) (Part 2c)

  • 5 points - readme.txt file contains reflection


What you should submit

You should submit the following files on Gradescope:

  • readme.txt (reflection)
  • shapeParser.py (Part 1)
  • weatherParser.py (Part 2)