CS 257: Software Design

A Little Help for Web Application, Phase 5

Implementing your API involves the assembly of multiple pieces. Here's a little guide that might help you make your way through the underbrush.

0. References

Labs
Code samples
Data

1. Decide on your database table structure

Keep your CREATE TABLE statements handy

I keep my CREATE TABLE statements in a text file so I can reuse them if necessary. If you need help with SQL SELECT statements, it's great if you can easily share your CREATE TABLE statements with me.

Jeff's books/authors CREATE TABLE statements
CREATE TABLE authors ( id SERIAL, last_name text, first_name text, birth_year integer, death_year integer ); CREATE TABLE books ( id SERIAL, title text, publication_year integer ); CREATE TABLE books_authors ( book_id integer, author_id integer );

2. Convert your raw data to one-CSV-per-table

Jeff's books/authors conversion

3. Set up your database

Move your files to perlman
  • You login to perlman via: ssh YOURUSERNAME@perlman.mathcs.carleton.edu
  • Your browser-accessible directory on perlman.mathcs.carleton.edu is /Accounts/courses/cs257/jondich/web-f2018/YOURUSERNAME. Another path that gets you to exactly the same place is /var/www/html/cs257/YOURUSERNAME.
  • There are many ways to copy files from your computer to perlman. See this list of options for details.
  • You'll want to move your CSV files (the ones that correspond to the tables in your database design) in your perlman account. Later, you'll need your Flask app there, too.
Create your tables
  • Login to perlman
  • cd to wherever your one-CSV-file-per-table files are
  • Run psql (this will connect you to the database named the same thing as your perlman username, using the PostgreSQL username which is also the same thing as your perlman username).
  • If you have tables you want to get rid of, including obsolete copies of your own tables, you can get rid of them by doing:
    DROP TABLE tablename
    for each table you want to get rid of.
  • Copy your CREATE TABLE statements to the psql prompt. Don't forget the semi-colon at the end of each statement.
  • Do \dt at the psql prompt to list all your tables.
  • Do \d tablename at the psql prompt to see the structure of the table "tablename".
Load your data into your tables
  • Make sure you're still at the psql prompt (which you launched while in the directory containing your CSV files on perlman).
  • For each of your tables, execute at the psql prompt:
    \copy tablename from 'tablename.csv' DELIMITER ',' CSV NULL AS 'NULL'
    (By the way, you don't end \ commands like \copy with a semi-colon.)
  • Check to see whether the data loaded properly:
    SELECT * FROM tablename;
    If your table is really big and you don't want to see all the rows, you can view the first five rows like this:
    SELECT * FROM tablename LIMIT 5;

4. Create a stub for your simplest API endpoint

Pick an easy endpoint

In the books/authors example, I would start with my simplest endpoint: /authors with no GET arguments. Ultimately, this endpoint should just return a JSON list of dictionaries, one dictionary per author.

Copy my Flask example to your web directory and test it
  • Here's the Flask sample code
  • Copy my Flask sample to /var/www/html/cs257/YOURUSERNAME/api.py on perlman.
  • Run the API on perlman by executing:
    python3 api.py perlman.mathcs.carleton.edu YOURPORT
    where YOURPORT is the 51xx port number Mike Tie assigned to you in the email he sent you in August.
  • Test the sample API on perlman by going here with a browser:
    http://perlman.mathcs.carleton.edu:YOURPORT/
    This should give you the "Hello, Citizen of CS257" message.
  • Kill the API with Ctrl-C in your perlman terminal (or with kill PROCESSID as discussed a few days ago in class if you ran the API in the background).
  • While you're working on api.py, you can either edit the copy on perlman directly using vim or emacs or nano, or you can edit it on your machine and then scp it to perlman whenever you want to test it.
  • When you add non-trivial endpoints to your API, you'll use the URL:
    http://perlman.mathcs.carleton.edu:YOURPORT/...
    where the ... is the rest of the API endpoint you're testing.
Create your easy endpoint stub
  • Delete my /actors/<last_name> endpoint from api.py
  • Cut the guts out of my /movies endpoint and replace "movies" with your own endpoint's name (in both the @app.route line and the def line)
  • For now, just put a return json.dumps(['goat', 'moose']) or similar return statement at the bottom of your new endpoint stub.
  • Run the API on perlman and test the new endpoint to make sure you get your expected results (e.g. the ["goat", "moose"] JSON string).

5. Get Python to talk to your database

Write a SELECT statement corresponding to your easy API endpoint
  • For my /authors endpoint (just show me all the authors), this would probably be:
    SELECT last_name, first_name, birth_year, death_year FROM authors;
  • Test your SELECT query in psql on perlman
Write a tiny Python program to execute the SELECT statement
  • Grab my psycopg2 sample code
  • Edit it so it executes your SELECT statement and prints enough of the results so you can check the query's correctness.

6. Incorporate your psycopg2 code into your API endpoint

Combine steps 4 and 5 above
  • Take a look at my books_api.py
  • Go ahead and borrow any of that code. At this stage, you'll definitely want to grab my get_connection function, since you'll need it in each of your endpoints.
  • Move your psycopg2 code from step 5 above into your API endpoint stub function from step 4 above.
  • Test it by running the updated API on perlman and pointing your browser at the endpoint.

7. Repeat steps 4, 5, and 6 until done

Keep your steps small
  • Suppose you're doing a colleges/universities database and you're working on an endpoint /colleges that has ten different GET parameters. First, just get /colleges to run as a stub. Then, get it to return five colleges with no restrictions. Then, implement just one of your GET parameters and test that, etc.
  • Do frequent tiny git commits after these substeps. You can then see the evolution of your code, and you don't have to be afraid of breaking things with your later work.
  • Make sure your code runs before you stop working. That ensures, among other things, that you could, in an emergency, just hand in what you've got so far, and it will work.
Really, I mean it.

Do your work in little stages, and test each little bitty stage.

And then commit each little bitty stage.

This is me looking out for your happiness. Do it.

9. Some extra stuff in case you find it helpful

What's this pg_dump thing in the Phase 5 assignment?
  • Once your database on perlman is populated with your tables and your data, you can login to perlman and run:
    pg_dump --no-owner --no-privileges yourdatabasename > dump.sql
    (Recall that your database name is exactly the same as your perlman user name.) This creates a file containing all the SQL instructions required to rebuild your whole database from scratch.
  • Suppose you DROPped all your TABLEs for some reason and just needed to start over. If you had created dump.sql as above, then you could do:
    psql < dump.sql
    to reload the whole thing.
  • Why do I want you to save dump.sql in your repository? Because the graders and I can then use it to run your database and API on our own computers, which turns out to be easier for grading than working directly on perlman (mainly because we don't have permissions to access your databases directly).
Some example queries from the PostgreSQL lab
SELECT authors.first_name, authors.last_name, books.title, books.publication_year FROM books, authors, books_authors WHERE books.id = books_authors.book_id AND authors.id = books_authors.author_id;
SELECT books.title, books.publication_year FROM books, authors, books_authors WHERE books.id = books_authors.book_id AND authors.id = books_authors.author_id AND authors.last_name = 'Willis';
SELECT authors.first_name, authors.last_name, books.title, books.publication_year FROM books, authors, books_authors WHERE books.id = books_authors.book_id AND authors.id = books_authors.author_id AND authors.last_name LIKE 'B%' ORDER BY authors.last_name, authors.first_name;
SELECT authors.first_name, authors.last_name, books.title, books.publication_year FROM books, authors, books_authors WHERE books.id = books_authors.book_id AND authors.id = books_authors.author_id ORDER BY authors.last_name;