Memory management: talloc

Table of Contents

In the last assignment, you built a linked list, and wrote code that hopefully cleaned up the list appropriately. Perhaps you have been missing the convenience of using a language with a garbage collection system that spares you from having to remember to clean individual things up. For this assignment, we're going to build an exceedingly dumb but effective garbage collector. This garbage collector is so inefficient that this may bother some of you; if so, consider improving the garbage collector to be an optional extension that you can think about when the project is complete.

This is an interpreter project team assignment, which means that if you are working on a team with someone else, you and your partner should do your best to even the workload between you. Strict pair programming is not required, but you should make sure that both team members are contributing to the project. I do recommend that you try to do pair programming if it is convenient to do so.

1 Get started

1.1 Use GitHub classroom to create a repository for your assignment

The first thing you'll need to do is to create a repository for your project using GitHub classroom. If you are working on a team, this repository will be shared, so only one of you should do this. To help reduce a round of communication, I'm going to designate that if you are working in a pair, the member of your team whose Carleton official email address comes first alphabetically should do this next step of setting up the GitHub repository. This is opposite from the previous pair assignment; I'm trying to give everyone an equal chance to participate. If you are working alone, then you should do this step.

If you are the designated person above, visit this GitHub classroom link (which I've placed in Moodle). Log into GitHub if your browser prompts you to after you get to GitHub. You should be taken to a screen that invites you to either "Join an existing team" or to "Create a new team". Pick the team that you created for the last team assignment. GitHub should then hopefully respond with a screen that says "You are ready to go!" and supplies you with a link for the repository that you are creating. Click on that link, and you should be taken to the repository. If you've gotten this far, you have successfully created the repository, and you the user who is logged in have access to it. Contact your partner if you have one and tell them that this step worked. (If you are working alone on the project, you can skip the next step.)

The other one of you, after the above is complete, should visit the same GitHub classroom link in the first sentence in the previous paragraph. You'll see the screen asking you to "Join an existing team" or to "Create a new team." You should be able to see the team that your partner created above; click the "Join" button. This should then take you to the "You are ready to go!" screen, and you can hopefully find a link to the repository. Click on it. You now both have access to the same GitHub repository.

1.2 Select the correct repl in Repl.it (only if you're in a pair)

You should use the same repl in Repl.it that you used for the last assignment.

Again for the person designated above as the starting person, you should now clone the repository from GitHub into Repl.it. If you're in a pair, you can in principle both be watching in multiplayer mode while you do this.

In Repl.it, make sure that you are in your home directory in the terminal window. You can always navigate there by typing

cd ~

in the terminal window. To further confirm, type pwd (which is short for "print working directory.") You should see /home/runner. If you see something different, you're not in the home directory, so try again or ask for help.

In a different browser tab (leave the one with Repl.it open), visit our our class GitHub organization page online. Once you've landed there, you should see the GitHub repository you created in the GitHub Classroom step above. It should be named talloc-username (where username is either own username, or a combination of the two of you if you're working in a pair). Contact us for help if the repository isn't there. Click on that repository title. This should take you to the home page for that repository. There is a green box on the page that says "Clone or download." Click that box, and make sure it says "Clone with HTTPS." Highlight the https link shown, and copy it to the clipboard. The link that you're copying should look something like

https://github.com/carleton251-term/talloc-username.git

… though your username and term will be different.

Then navigate back to the Repl.it tab with the repl that you created above, and click on the terminal window within your new repl. Your next step is to clone the repository from GitHub. To do that, type git clone, then a space, then paste in the URL from the github page that you copied from above. (To paste, you might need to right-click in the terminal window and choose paste rather than using a keyboard shortcut.)

For example, I would type the following, though your username and term will be different:

git clone https://github.com/carleton251-term/talloc-username.git

You'll be prompted to enter your GitHub username and password.

If all goes well, you should receive a directory, titled talloc-username. If you type ls at the prompt, you should be able to see it. It will also appear in your file browser window with Repl.it on the left. Then navigate into that directory by typing

cd talloc-username

… and you should be all set to work.

2 The idea

You'll be creating your own replacement for malloc, which we'll call talloc (for "track malloc"). For a user, talloc seems to work just like malloc, in that it allocates memory and returns a pointer to it. Inside your code for talloc, you'll need to call malloc to do exactly that. Additionally, talloc should store the pointer to that memory in a linked list that we'll call the "active list" for purposes of discussion. Every time talloc is called, another pointer to memory gets added to that active list.

You'll then also create a function called tfree, which will free up all memory associated with pointers accumulated do to calls to talloc. Calling tfree at arbitrary points in your program would be a complete disaster, as it would free up memory that you may still be using. The idea is that we will be using talloc as a replacement for malloc, and then calling tfree at the very end of our main function. You'll then be able to program with the illusion of using a garbage collector, except that the garbage collector never actually kicks in until the program is about to end. (This is actually an option for the leJOS system for programming LEGO Mindstorms in Java; ask me in class as to why a real system would implement such a crazy-seeming idea.)

You'll also write the function texit, which is a simple replacement for the built-in function exit. texit calls exit, but calls tfree first.

Finally, you'll then modify your linked list from the previous assignment. The function cleanup that you wrote will be eliminated, as it is no longer necessary. You should also modify reverse so that it no longer duplicates data between the two linked lists. When you reverse a list, that should return a new list with a new set of CONS_TYPE Value nodes, but the actual data in that list should not be copied from the old list to the new. This would be a disaster to try to clean up manually, but tfree will handle it easily. This change will make some later aspects of the project much easier. Your linked list code should now exclusively use talloc, and should not use malloc at all.

3 Storing the active list

One issue you'll need to think through is where the variable for the head of the active list should be. In an object-oriented language, this would likely be a private static variable in a memory management class. Oops. You can't make the active list head a local variable in talloc, because tfree wouldn't be able to see it. We could make it a parameter to talloc and tfree, but then the programmer using talloc has to keep track of this, and could conceivably have multiple active lists, which sounds ugly. This is an occasion where global variable makes sense, and so you should use one. A global variable in C is declared outside of any functions. Typically, it is placed near the top of your file, underneath the include statements.

There's one bit of circular logic you've got to untangle. talloc needs to store a pointer (returned by malloc) onto a linked list. Your linked list code, in turn, uses talloc. Rather than trying to make this work in some complex mutually dependent structure, my recommendation is to break the circularity. In your talloc code, the single linked list that you use to store allocated pointers should be a linked list generated via malloc, instead of talloc. That means you'll need to duplicate some of your linked list code. Duplicated code is generally to be avoided, but avoiding this circular nightmare is worth it.

4 Some specifics

After you clone your repository, you should be able to see that you get the following starting files:

  • value.h: this defines the Value structure again
  • linkedlist.h: this is a modification form the previous assignment that removes the function cleanup, and also changes the documentation on reverse to indicate that data is not to be copied.
  • talloc.h: this defines the functions that you'll need to write from scratch for this assignment.
  • main.c: this is a tester function.
  • Makefile: contains instructions for the command make, which will compile and test your code

The missing files here are linkedlist.c and talloc.c. You should create talloc.c from scratch yourself. For linkedlist.c, you should copy in code from your previous assignment, and then modify it accordingly. To compile your code, issue the command make at the command prompt. This will follow the instructions in the Makefile for building your project in order to produce an executable called linkedlist. At first, it won't build at all because your talloc.c and linkedlist.c files aren't there. To get started, copy in linkedlist.c (remove the cleanup function), and for now, within talloc.c just create every function that you need with no code inside it so that you can get everything to build. Once you have done that, you can begin implementing your functions, and testing appropriately.

The tester code creates an executable that you can run by typing ./talloc. However, if you find it more convenient, you can type make test which will compile your code and run the tests all in one shot.

Your code should have no memory leaks or memory errors when run on any input (correct or incorrect) using valgrind. We will be checking this during grading. To run your code through valgrind, I have added to the Makefile a section to do this. You can compile your code and run the tests through valgrind by typing make valgrind at the terminal prompt.

5 Capstone work

Work in this section is 100% optional, and not worth any points. Nonetheless, if you're looking for an extra challenge, these are fun additional exercises to try.

Write a function that that can report a count, in bytes, of how much memory is being used in total by this talloc technique. It should include both all of the memory that the user asked for when calling talloc, but also all of the additional overhead in Value structs that talloc creates for assembling a linked list. Here is a signature for that function. You can add it to your talloc.h file:

int tallocMemoryCount();

6 What to submit

Add all of your files into your Git repository and push to GitHub. Your submitted files should be exactly what you checked out, but also with files named linkedlist.c and talloc.c. (You can leave out your compiled executables as well as the .o object files if you wish.)

The graders and I need to know which commit is the one that we should use for grading. This is handled well in Git with tags. You did this back in the original clab. If you've forgotten how to tag in Git, look at the end of part 1 of the C lab. Name your tag talloc-done.

Good luck, and have fun!