CS 257 Assignment

Labs

Code samples

Flask sample
psycopg2 sample
books_and_authors_converter.py: the program that converts Jeff's original books/authors data into one CSV file per table
books/authors sample web application

Data

Jeff's original books/authors data, in both CSV and JSON form

Keep your CREATE TABLE statements handy

I keep my CREATE TABLE statements in a text file so I can reuse them if necessary. If you need help with SQL SELECT statements, it's great if you can easily share your CREATE TABLE statements with me.

Jeff's books/authors CREATE TABLE statements

CREATE TABLE authors ( id SERIAL, last_name text, first_name text, birth_year integer, death_year integer ); CREATE TABLE books ( id SERIAL, title text, publication_year integer ); CREATE TABLE books_authors ( book_id integer, author_id integer );

Jeff's books/authors conversion

The original books/authors data
The books_and_authors_converter.py program that converts the original data into one CSV file per database table.

Move your files to perlman

You login to perlman via: ssh YOURUSERNAME@perlman.mathcs.carleton.edu
Your browser-accessible directory on perlman.mathcs.carleton.edu is /Accounts/courses/cs257/jondich/web-f2018/YOURUSERNAME. Another path that gets you to exactly the same place is /var/www/html/cs257/YOURUSERNAME.
There are many ways to copy files from your computer to perlman. See this list of options for details.
You'll want to move your CSV files (the ones that correspond to the tables in your database design) in your perlman account. Later, you'll need your Flask app there, too.

Create your tables

Login to perlman
cd to wherever your one-CSV-file-per-table files are
Run psql (this will connect you to the database named the same thing as your perlman username, using the PostgreSQL username which is also the same thing as your perlman username).
If you have tables you want to get rid of, including obsolete copies of your own tables, you can get rid of them by doing:
DROP TABLE tablename
for each table you want to get rid of.
Copy your CREATE TABLE statements to the psql prompt. Don't forget the semi-colon at the end of each statement.
Do \dt at the psql prompt to list all your tables.
Do \d tablename at the psql prompt to see the structure of the table "tablename".

Load your data into your tables

Make sure you're still at the psql prompt (which you launched while in the directory containing your CSV files on perlman).
For each of your tables, execute at the psql prompt:
\copy tablename from 'tablename.csv' DELIMITER ',' CSV NULL AS 'NULL'
(By the way, you don't end \ commands like \copy with a semi-colon.)
Check to see whether the data loaded properly:
SELECT * FROM tablename;
If your table is really big and you don't want to see all the rows, you can view the first five rows like this:
SELECT * FROM tablename LIMIT 5;

Pick an easy endpoint

In the books/authors example, I would start with my simplest endpoint: /authors with no GET arguments. Ultimately, this endpoint should just return a JSON list of dictionaries, one dictionary per author.

Copy my Flask example to your web directory and test it

Here's the Flask sample code
Copy my Flask sample to /var/www/html/cs257/YOURUSERNAME/api.py on perlman.
Run the API on perlman by executing:
python3 api.py perlman.mathcs.carleton.edu YOURPORT
where YOURPORT is the 51xx port number Mike Tie assigned to you in the email he sent you in August.
Test the sample API on perlman by going here with a browser:
http://perlman.mathcs.carleton.edu:YOURPORT/
This should give you the "Hello, Citizen of CS257" message.
Kill the API with Ctrl-C in your perlman terminal (or with kill PROCESSID as discussed a few days ago in class if you ran the API in the background).
While you're working on api.py, you can either edit the copy on perlman directly using vim or emacs or nano, or you can edit it on your machine and then scp it to perlman whenever you want to test it.
When you add non-trivial endpoints to your API, you'll use the URL:
http://perlman.mathcs.carleton.edu:YOURPORT/...
where the ... is the rest of the API endpoint you're testing.

Create your easy endpoint stub

Delete my /actors/<last_name> endpoint from api.py
Cut the guts out of my /movies endpoint and replace "movies" with your own endpoint's name (in both the @app.route line and the def line)
For now, just put a return json.dumps(['goat', 'moose']) or similar return statement at the bottom of your new endpoint stub.
Run the API on perlman and test the new endpoint to make sure you get your expected results (e.g. the ["goat", "moose"] JSON string).

Write a SELECT statement corresponding to your easy API endpoint

For my /authors endpoint (just show me all the authors), this would probably be:
SELECT last_name, first_name, birth_year, death_year FROM authors;
Test your SELECT query in psql on perlman

Write a tiny Python program to execute the SELECT statement

Grab my psycopg2 sample code
Edit it so it executes your SELECT statement and prints enough of the results so you can check the query's correctness.

Combine steps 4 and 5 above

Take a look at my books_api.py
Go ahead and borrow any of that code. At this stage, you'll definitely want to grab my get_connection function, since you'll need it in each of your endpoints.
Move your psycopg2 code from step 5 above into your API endpoint stub function from step 4 above.
Test it by running the updated API on perlman and pointing your browser at the endpoint.

Keep your steps small

Suppose you're doing a colleges/universities database and you're working on an endpoint /colleges that has ten different GET parameters. First, just get /colleges to run as a stub. Then, get it to return five colleges with no restrictions. Then, implement just one of your GET parameters and test that, etc.
Do frequent tiny git commits after these substeps. You can then see the evolution of your code, and you don't have to be afraid of breaking things with your later work.
Make sure your code runs before you stop working. That ensures, among other things, that you could, in an emergency, just hand in what you've got so far, and it will work.

Really, I mean it.

Do your work in little stages, and test each little bitty stage.

And then commit each little bitty stage.

This is me looking out for your happiness. Do it.

What's this pg_dump thing in the Phase 5 assignment?

Once your database on perlman is populated with your tables and your data, you can login to perlman and run:
pg_dump --no-owner --no-privileges yourdatabasename > dump.sql
(Recall that your database name is exactly the same as your perlman user name.) This creates a file containing all the SQL instructions required to rebuild your whole database from scratch.
Suppose you DROPped all your TABLEs for some reason and just needed to start over. If you had created dump.sql as above, then you could do:
psql < dump.sql
to reload the whole thing.
Why do I want you to save dump.sql in your repository? Because the graders and I can then use it to run your database and API on our own computers, which turns out to be easier for grading than working directly on perlman (mainly because we don't have permissions to access your databases directly).

Some example queries from the PostgreSQL lab

SELECT authors.first_name, authors.last_name, books.title, books.publication_year FROM books, authors, books_authors WHERE books.id = books_authors.book_id AND authors.id = books_authors.author_id;

SELECT books.title, books.publication_year FROM books, authors, books_authors WHERE books.id = books_authors.book_id AND authors.id = books_authors.author_id AND authors.last_name = 'Willis';

SELECT authors.first_name, authors.last_name, books.title, books.publication_year FROM books, authors, books_authors WHERE books.id = books_authors.book_id AND authors.id = books_authors.author_id AND authors.last_name LIKE 'B%' ORDER BY authors.last_name, authors.first_name;

SELECT authors.first_name, authors.last_name, books.title, books.publication_year FROM books, authors, books_authors WHERE books.id = books_authors.book_id AND authors.id = books_authors.author_id ORDER BY authors.last_name;

CS 257: Software Design

A Little Help for Web Application, Phase 5

0. References

1. Decide on your database table structure

2. Convert your raw data to one-CSV-per-table

3. Set up your database

4. Create a stub for your simplest API endpoint

5. Get Python to talk to your database

6. Incorporate your psycopg2 code into your API endpoint

7. Repeat steps 4, 5, and 6 until done

9. Some extra stuff in case you find it helpful