Memory management: talloc

1. Get started
- 1.1. GitHub and Repl.it
2. The idea
3. Storing the active list
4. Some specifics
5. Capstone work
6. What to submit

In the last assignment, you built a linked list and wrote code that (hopefully) cleaned up the list appropriately. Perhaps you have been missing the convenience of a language with a garbage collector, which spares you from having to remember to clean up individual things. For this assignment, we’re going to build an exceedingly dumb but effective garbage collector. It is inefficient enough that it may bother some of you; if so, consider improving it as an optional extension once the project is complete.

This is an interpreter project team assignment, which means that if you are working on a team with someone else, you and your partner should do your best to balance the workload between you. Strict pair programming is not required, but you should make sure that both team members are contributing to the project. I do recommend pair programming if it is convenient to do so.

1 Get started

1.1 GitHub and Repl.it

Hopefully, you know the routine by now. Let’s have the member of your team whose Carleton official email address comes first alphabetically do the step of setting things up. Here is the GitHub classroom link. Feel free to look back at the previous assignment if you need a reminder on getting everything set up.

2 The idea

You’ll be creating your own replacement for malloc, which we’ll call talloc (for “track malloc”). For a user, talloc seems to work just like malloc, in that it allocates memory and returns a pointer to it. Inside your code for talloc, you’ll need to call malloc to do exactly that. Additionally, talloc should store the pointer to that memory in a linked list that we’ll call the “active list” for purposes of discussion. Every time talloc is called, another pointer to memory gets added to that active list.

You’ll then also create a function called tfree, which will free all memory associated with the pointers accumulated due to calls to talloc. Calling tfree at arbitrary points in your program would be a complete disaster, as it would free memory that you may still be using. The idea is that we will be using talloc as a replacement for malloc, and then calling tfree at the very end of our main function. You’ll then be able to program with the illusion of using a garbage collector, except that the garbage collector never actually kicks in until the program is about to end. (This is actually an option for the leJOS system for programming LEGO Mindstorms in Java; if interested, ask me for further info as to why a real system would implement such a crazy-seeming idea.)

You’ll also write the function texit, which is a simple replacement for the built-in function exit. texit calls exit, but calls tfree first.

Finally, you’ll then modify your linked list from the previous assignment. The function cleanup that you wrote will be eliminated, as it is no longer necessary. You should also modify reverse so that it no longer duplicates data between the two linked lists. When you reverse a list, that should return a new list with a new set of CONS_TYPE Value nodes, but the actual data in that list should not be copied from the old list to the new. This would be a disaster to try to clean up manually, but tfree will handle it easily. This change will make some later aspects of the project much easier. Your linked list code should now exclusively use talloc, and should not use malloc at all.
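As a rough illustration of what "new CONS_TYPE nodes, shared data" could look like, here is a minimal sketch. It makes several assumptions: the Value struct below is a simplified, hypothetical stand-in for the one in value.h; makeNull, makeInt, and cons are assumed helper names; and talloc and tfree are throwaway stand-ins backed by a fixed array so the block compiles on its own. They are not the linked-list version this assignment asks you to write.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the Value struct in value.h. */
typedef enum { INT_TYPE, CONS_TYPE, NULL_TYPE } valueType;

typedef struct Value {
    valueType type;
    int i;              /* payload when type == INT_TYPE */
    struct Value *car;  /* used when type == CONS_TYPE */
    struct Value *cdr;  /* used when type == CONS_TYPE */
} Value;

/* Throwaway stand-ins for talloc/tfree (assumption: fixed-size array). */
#define MAX_TRACKED 64
static void *tracked[MAX_TRACKED];
static int trackedCount = 0;

void *talloc(size_t size) {
    if (trackedCount >= MAX_TRACKED) return NULL;
    void *p = malloc(size);
    if (p != NULL) tracked[trackedCount++] = p;
    return p;
}

void tfree(void) {
    for (int k = 0; k < trackedCount; k++) free(tracked[k]);
    trackedCount = 0;
}

/* Allocate one Value of the given type with empty fields. */
static Value *makeValue(valueType type) {
    Value *v = talloc(sizeof(Value));
    assert(v != NULL);
    v->type = type;
    v->i = 0;
    v->car = NULL;
    v->cdr = NULL;
    return v;
}

Value *makeNull(void) { return makeValue(NULL_TYPE); }

Value *makeInt(int n) {
    Value *v = makeValue(INT_TYPE);
    v->i = n;
    return v;
}

Value *cons(Value *newCar, Value *newCdr) {
    Value *v = makeValue(CONS_TYPE);
    v->car = newCar;
    v->cdr = newCdr;
    return v;
}

/* Build a new chain of cons cells whose cars point at the SAME data
   Values as the original list; nothing in the cars is copied. */
Value *reverse(Value *list) {
    Value *result = makeNull();
    for (Value *cur = list; cur->type == CONS_TYPE; cur = cur->cdr) {
        result = cons(cur->car, result);
    }
    return result;
}
```

Because both lists point at the same data, freeing each list separately would free some Values twice. With every allocation tracked in one place, a single tfree call at the end frees each pointer exactly once.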

3 Storing the active list

One issue you’ll need to think through is where the variable for the head of the active list should be. In an object-oriented language, this would likely be a private static variable in a memory management class. Oops: C has no classes. You can’t make the active list head a local variable in talloc, because tfree wouldn’t be able to see it. We could make it a parameter to talloc and tfree, but then the programmer using talloc has to keep track of this, and could conceivably have multiple active lists, which sounds ugly. This is an occasion where a global variable makes sense, and so you should use one. A global variable in C is declared outside of any functions. Typically, it is placed near the top of your file, underneath the include statements.

There’s one bit of circular logic you’ve got to untangle. talloc needs to store a pointer (returned by malloc) onto a linked list. Your linked list code, in turn, uses talloc. Rather than trying to make this work in some complex mutually dependent structure, my recommendation is to break the circularity. In your talloc code, the single linked list that you use to store allocated pointers should be a linked list generated via malloc, instead of talloc. That means you’ll need to duplicate some of your linked list code. Duplicated code is generally to be avoided, but avoiding this circular nightmare is worth it.
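One possible shape for this, offered as a sketch rather than the required design: the block below keeps the head of the active list in a file-level variable and builds the bookkeeping nodes with malloc directly, which is what breaks the circularity. The ActiveNode struct and the name activeHead are hypothetical; your version may well reuse Value from value.h for the nodes. The function signatures are also assumed, so check them against talloc.h.

```c
#include <stdlib.h>

/* Hypothetical bookkeeping node for the active list. */
typedef struct ActiveNode {
    void *ptr;                /* pointer handed back to the caller */
    struct ActiveNode *next;  /* next tracked allocation */
} ActiveNode;

/* Global head of the active list. Marking it static keeps it visible
   to every function in this file but hidden from other files. */
static ActiveNode *activeHead = NULL;

/* Allocate size bytes and remember the pointer on the active list. */
void *talloc(size_t size) {
    void *ptr = malloc(size);
    if (ptr == NULL) return NULL;

    /* The node itself comes from malloc, NOT talloc, so that talloc
       never depends on itself. */
    ActiveNode *node = malloc(sizeof(ActiveNode));
    if (node == NULL) {
        free(ptr);
        return NULL;
    }
    node->ptr = ptr;
    node->next = activeHead;
    activeHead = node;
    return ptr;
}

/* Free every tracked allocation and the bookkeeping nodes themselves. */
void tfree(void) {
    while (activeHead != NULL) {
        ActiveNode *next = activeHead->next;
        free(activeHead->ptr);
        free(activeHead);
        activeHead = next;
    }
}

/* Replacement for exit: clean up first, then exit. */
void texit(int status) {
    tfree();
    exit(status);
}
```

Resetting the head to NULL as the list is consumed means that calling tfree a second time does nothing, and that talloc can safely start a fresh list afterward.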

4 Some specifics

After you clone your repository, you should be able to see that you get the following starting files:

value.h: this defines the Value structure again
linkedlist.h: this is a modification from the previous assignment that removes the function cleanup, and also changes the documentation on reverse to indicate that data is not to be copied.
talloc.h: this defines the functions that you’ll need to write from scratch for this assignment.
main.c: this is a tester function.
Makefile: contains instructions for the command make, which will compile and test your code
test-e and test-m: the usual testing scripts
test_utilities.py: helper utilities used by test-e and test-m

The missing files here are linkedlist.c and talloc.c. You should create talloc.c from scratch yourself. For linkedlist.c, you should copy in code from your previous assignment, and then modify it accordingly. To compile your code, issue the command make at the command prompt. This will follow the instructions in the Makefile for building your project in order to produce an executable called linkedlist. At first, it won’t build at all because your talloc.c and linkedlist.c files aren’t there. To get started, copy in linkedlist.c (remove the cleanup function), and for now, within talloc.c just create every function that you need with no code inside it so that you can get everything to build. Once you have done that, you can begin implementing your functions, and testing appropriately.

The tester code creates an executable that you can run by typing ./linkedlist. The easiest way to run the tests is to use ./test-m and ./test-e, as usual, which will automatically compile all of your code and run the ./linkedlist executable for you.

Your code should have no memory errors when running on any input (correct or incorrect) using valgrind. The testing scripts will automatically run valgrind on your code, and show you if there are memory errors.

5 Capstone work

Work in this section is 100% optional, and doesn’t contribute towards your grade. Nonetheless, if you’re looking for an extra challenge, these are fun additional exercises to try.

Write a function that reports a count, in bytes, of how much memory is being used in total by this talloc technique. It should include both the memory that the user asked for when calling talloc and the additional overhead in Value structs that talloc creates for assembling a linked list. Here is a signature for that function. You can add it to your talloc.h file:

int tallocMemoryCount();
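One way to approach this, sketched under assumptions: record the requested size in each bookkeeping node, then walk the list and add up the requested bytes plus the size of each node. The CountedNode struct and countedHead are hypothetical names; if your active list is built from Value structs, the per-node overhead would be sizeof(Value) plus whatever else you allocate per entry.

```c
#include <stdlib.h>

/* Hypothetical bookkeeping node that also remembers the request size. */
typedef struct CountedNode {
    void *ptr;
    size_t size;               /* bytes the caller asked for */
    struct CountedNode *next;
} CountedNode;

static CountedNode *countedHead = NULL;

void *talloc(size_t size) {
    void *ptr = malloc(size);
    if (ptr == NULL) return NULL;
    CountedNode *node = malloc(sizeof(CountedNode));
    if (node == NULL) {
        free(ptr);
        return NULL;
    }
    node->ptr = ptr;
    node->size = size;
    node->next = countedHead;
    countedHead = node;
    return ptr;
}

/* Sum of requested bytes plus the bookkeeping overhead per allocation. */
int tallocMemoryCount(void) {
    size_t total = 0;
    for (CountedNode *n = countedHead; n != NULL; n = n->next) {
        total += n->size + sizeof(CountedNode);
    }
    return (int) total;
}

void tfree(void) {
    while (countedHead != NULL) {
        CountedNode *next = countedHead->next;
        free(countedHead->ptr);
        free(countedHead);
        countedHead = next;
    }
}
```

This counts only the bytes that were requested, not any extra padding or headers that malloc itself may add internally.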

6 What to submit

Make sure that you have added your new .c files correctly, then push to GitHub. (You can leave out your compiled executables as well as the .o object files if you wish.)

Make sure to label your submission as usual via an appropriate commit message, and make sure your tests have run on GitHub.

Good luck, and have fun!