CarlSTM

Introduction
- Terminology
- Implementation challenges
Two types of thread-safe hash tables
- Submission 1a: Coarse-grained locking
- Submission 1b: Fine-grained locking
Implementing the STM
STM Submissions
STM hash table
- Submission 3: STM Hash Table and writeup
Write-Up Questions
Bonus Work
Submission Details

This is a pair programming assignment. If you are on a team, this means that you and your partner should be doing the entirety of this assignment side-by-side, on a single computer, where one person is "driving" and the other is "navigating." Take turns every so often who is driving; you should each spend approximately 50% of the time driving.

Introduction

Software Transactional Memory (STM) is a mechanism for concurrency control in shared-memory programs, the idea for which originated in the database community. The basic concept is that critical sections of code in your program can be labeled as atomic blocks:

atomic {
    int balance = account.getBalance();
    balance += 100;
    account.setBalance(balance);
}

Within atomic blocks, you can reason about your code as if the program were sequential. The language implementation ensures that atomic blocks execute atomically (all-or-nothing) and in isolation (without interference from other threads). Or, to be more precise, the execution behaves as if atomic blocks executed atomically and in isolation; in reality, atomic blocks in different threads may execute in parallel, but the program's behavior should be indistinguishable from an execution with no parallelism.

Terminology

A few pieces of terminology are standard:

A transaction is an attempted execution of atomic block at runtime.
When a transaction finishes executing its code, it attempts to commit. Committing a transaction makes its effects permanent.
The attempt to commit may fail if, for example, a memory location that the committing transaction read has been changed. In this case the transaction aborts instead of committing.
If a transaction aborts, the implementation should undo the side effects of the transaction (if any) and retry the transaction from the beginning.

In this project, we'll take a so-called "lazy" (as opposed to "eager") approach to STM. This means that updates to memory are kept in a thread-local buffer, and only made visible to other threads after the transaction has committed. (An eager approach updates memory locations directly during transaction execution and undoes its updates in the event the transaction aborts.) Implementing a lazy STM means that the most important and delicate part of your code will be the commit operation: this method must determine if it is safe to commit and, if so, update any modified memory locations.

Implementation challenges

Implementing software transactional memory properly is quite difficult:

As mentioned earlier, aborting means that all side effects of the transaction should be undone. In practice, however, some side effects cannot be aborted. For example, if your program fires a missile, it's unlikely the STM implementation will be able to recall it.
The implementation must do extra work at every shared memory access inside an atomic block. In particular, it needs to keep enough bookkeeping information so that it is possible to (1) detect conflicts with other transactions and (2) abort the transaction if necessary. It's difficult to balance accurate bookkeeping with adequate performance.
If memory accesses not in atomic blocks are not supervised by the STM, they could potentially observe inconsistent state from incomplete transactions. This means implementations may choose to sacrifice true atomicity and isolation to avoid incurring high overheads in non-atomic code.

For these reasons and more, STM has not been adopted as standard in most languages. However, it is still of great interest to language experts and efforts are being made to bring it to mainstream programming languages (e.g., this proposal for adding transactions to C++). Furthermore, the language Clojure supports STM, and has been gaining significant traction.

Two types of thread-safe hash tables

Part of writing parallel programs involves making sure your data structures are thread-safe. Set.java gives a minimal interface for an abstract set data type. The file HashSet.java gives a simple implementation of a separate chaining hash table. This implementation is not thread-safe; you will implement several thread-safe versions of this class.

The first two implementations (parts 1a and 1b) should use standard Java locks, and exemplify the two extremes of how you might use locks to properly synchronize memory accesses in your program.

Submission 1a: Coarse-grained locking

Create a version called CoarseHashSet that uses a single lock for the entire set. Threads should acquire this lock for each method.

Submission 1b: Fine-grained locking

Create a version called FineHashSet that uses a lock per array index.

Implementing the STM

It's your job to implement a library called CarlSTM. You'll be submitting this in multiple revisions, described at the end of this section. What follows first is a description of the entire STM implementation.

The beginnings of the library are in the folder carlstm. The class CarlSTM provides a static method execute, which takes a Transaction to run. This is our version of an atomic block; instead of using atomic { CODE }, we use a more cumbersome syntax:

T result = CarlSTM.execute(new Transaction<T>() {
    public T run() throws NoActiveTransactionException,
            TransactionAbortedException {
        CODE
    }
}

It ends up being very convenient to have transactions return a value, so the Transaction class is parameterized by a return type T (which can be Void if appropriate), and the Transaction.run() and CarlSTM.execute() methods return a value of type T.

Transactional objects

We'll avoid some of the challenges described in the introduction by not providing true atomicity and isolation, as we would expect from a "real" STM. We will instead support atomicity and isolation only for limited types of operations. Specifically, transactions can read and write special transactional objects, which are objects from the class TxObject. If you implement your STM correctly, all reads and writes of TxObjects within a transaction should happen atomically and in isolation.

As an example, look at the file SimpleTransaction.java in examples. This program creates a shared TxObject x and two transactions that increment x five times each. As it is, the updates from the two transactions may be interleaved; the goal is that one transaction should perform five updates in row, then the second should perform five more in a row. (Note that we're cheating a bit here by putting printlns in the transactions. These are an example of an operation that has an un-undoable side effect (printing to the screen). If a transaction aborts and retries, you may see printlns from the aborted transaction. This is OK.)

TxObjects are defined in TxObject.java, much of which has been left for you to implement. TxObjects are parameterized by a type T, which represents the type of the object stored in a particular TxObject. They support two operations: read() and write. Both of these operations must be called from within a transaction; any other calls should result in a NoActiveTransactionException.

The value field of each TxObject should be protected by a lock. This will be important during transaction commit.

Lazy buffering

So, how should you actually implement the STM? The technique you should use is called lazy buffering. In lazy buffering, each transaction maintains a private data structure recording which TxObjects it has read and/or written during its current execution attempt. Specifically, for each TxObject touched by the transaction, you should track the object's "old" value (the value seen the first time the TxObject was accessed) and the object's "new" value (the value most recently written to the object by the transaction, or the old value if no writes have occurred).

You should store each transaction's updates in an object of type TxInfo. You'll need to add the appropriate data structure(s) to this file. When a transaction calls read() or write() on a TxObject, those methods should delegate to the current transaction's TxInfo instead of directly reading or writing the value field of the TxObject class. The value field should only be read or written in three cases: (1) when the TxObject is created; (2) the first time a transaction calls read() or write() on the TxObject; and (3) during transaction commit.

You'll have to deal with a few tricky details in writing the TxInfo class:

One difficulty is how to find a transaction's TxInfo. The read() and write() methods for TxObject do not take any parameters that would allow us to access the current transaction's info. However, Java provides a convenient way of storing thread-local data. Look at the documentation for java.lang.ThreadLocal for more information. You should be able to use this functionality to retrieve the current thread's TxInfo from within the TxObject methods or CarlSTM.execute.
Java's generics can't really express the idea of "I want a list of pairs of objects in which both objects have the same type, but different pairs may have different types." (This isn't exactly what you need, but it's close.) You'll have to use the "unknown" type ? (e.g., TxObject<?>) and/or "raw" types (e.g., TxObject). This means you'll have warnings about unsafe casts in your code; just add @SuppressWarning annotations to get rid of them. (In case you're curious, in order to avoid using raw types, Java would need to support so-called existential types.)

Executing a transaction

When a program executes a transaction using CarlSTM.execute(), the following steps should take place:

Find the current thread's TxInfo and call its start() method. This should toggle a flag in the TxInfo indicating a transaction is active. If a transaction is already active, TxInfo.start() should throw a TransactionAlreadyActiveException. (This is somewhat unsatisfying, as part of the appeal of transactions is the ease with which we can combine multiple transactions by nesting them inside one larger transaction. See the extra credit for more on nested transactions.)
Call the transaction's run() method to execute the atomic block.
If run() throws a TransactionAbortedException (which it shouldn't until you've implemented part of the performance optimizations listed below), call TxInfo.abort() to clean up any transactional state and go back to step 1 to retry the transaction.</li>
Call TxInfo.commit() in order to attempt to commit the transaction. (More later on what exactly happens during commit().)
If commit() returns true, the transaction is complete! Return the transaction's result.
If commit() returns false, the attempt to commit failed and the transaction has aborted. In this case, commit() should have taken care of calling abort(). Go back to step 1 to retry the transaction.

Committing a transaction

The commit() operation is the most important part of our STM. As stated earlier, every transaction maintains a private buffer containing the old and new values for every TxObject read or written by the transaction. commit() tries to make the transaction's updates permanent. Since transactions must behave as if they executed atomically and in isolation, commit must ensure that all of the TxObjects have their expected ("old") value. (We could probably ignore the "old" value for TxObjects that were written before being read, but that is unusual and not worth adding a special case for.)

Of course, it is crucial that no other thread change the value of a TxObject after commit() compares the TxObject's current value to the old value but before commit() updates the TxObject with the new value. Therefore, commit() should acquire locks for all of the TxObjects read or written by the transaction, and it should not release those locks until it has updated the value of every TxObject written by the transaction.. This ensures that the updates appear to happen "all at once" to other threads.

One tricky detail here is that acquiring multiple locks at the same time introduces the possibility of deadlock, which is not acceptable. You can prevent deadlock in one of two ways:

You can require that every transaction acquire locks in the same global order. This will require creating some way of ordering the TxObjects (and, presumably, storing them in sorted order in the buffer). A downside of this approach is that transactions can block waiting for a lock, only to discover that the TxObject's value has changed and the transaction needs to abort.
A second approach is to try to acquire the lock for each TxObject, and if the lock is already held by another transaction, pessimistically choose to abort the transaction. This is the approach I recommend. You can use the trylock method of the java.util.concurrent.locks.Lock class to try to acquire a lock without blocking.

Performance optimizations

You are required to implement the following performance optimizations.

Read/write locks: Using regular locks for protecting TxObjects means that we cannot effectively exploit read parallelism. Therefore, you should modify your implementation to use read/write locks instead of normal locks. During run(), acquire read locks only, since the values are not updated until commit(). During commit(), for TxObjects where the old and new values are the same, you should acquire only the read lock. It turns out that it is safe to release the read locks before updating the written TxObjects, so the commit operation should proceed as follows:

Acquire all necessary TxObject locks and check all values match the old values.
Release the read locks.
Update the written TxObject values.
Release the write locks.

Exponential backoff: Consider the case in which you try to acquire a read lock during run(). If the call to lock() blocks, that means some other thread is holding a write lock on that TxObject and will likely change its value. Therefore in this case, it makes sense to preemptively abort and retry the transaction. Use tryLock() to attempt to acquire the read lock; if tryLock() fails, throw a TransactionAbortedException and catch it in CarlSTM.execute().

It is usually not the best choice to retry the transaction immediately, since the other thread will still be holding the write lock for a while afterwards and the thread will fail again. Therefore you should implement <i>exponential backoff</i>: delay for a short period before retrying the transaction, and double the length of the delay every time the transaction fails again. Even delaying by a nanosecond should save you time wasted running and re-running doomed transactions. You should only delay if the transaction aborted due to a failed call to tryLock() in run() or commit().

STM Submissions

Breaking this project up into multiple pieces is challenging because so much of it is interdependent. You'll be submitting drafts that may not be entirely functional, but will demonstrate that you are making adequate progress.

Submission 2a: Lazy buffering

For this submission, lazy buffering should be working as described above. You should have appropriate code written for TxObject, TxInfo, and/or other classes as necessary to get the lazy buffering aspect of this assignment working. You should add debugging output (print statements and the like) to demonstrate that the methods you have written here are working. Submit a very short writeup describing what we should be looking for in order to see that you have lazy buffering implemented and working.

Submission 2b: Executing and committing a transaction

You should have code that will execute and commit a transaction, though at this stage it may still be buggy. Nonetheless, it should be clear that you have all the essential pieces in place. Submit a very short writeup describing where we should look in your code to observe that you have the framework essentially implemented.

Submission 2c: Executing and committing a transaction working and optimized

This is a repeat of the previous step, where your code should now be working. You should submit test code that indicates that this is working, and you should also implement the performance optimizations described above. Submit a very short writeup describing what you have implemented and how we can best observe that you have done so.

STM hash table

Submission 3: STM Hash Table and writeup

Implement a version of the hash set using the CarlSTM transactional constructs. Each method that was synchronized before (or that used Locks) should now use a single atomic block. Note that every reference that is neither read-only nor thread-local should be represented with a TxObject.

Write-Up Questions

You implemented three versions of the hash set: coarse-grained locked, fine-grained locked, and transactional. Compare the performance of the three implementations as your vary the number of threads in the system. Plot your results both for low-capacity hash sets (less than 20 slots) as well as for high-capacity hash sets (more than 1000 slots). Discuss.
Plot the performance of your transactional hash table while varying the length of the exponential backoff. Include a version in which the exponential backoff is disabled. For this experiment, you should use more threads than you have processors; this makes the time wasted retrying doomed transactions more obvious (since there might be other threads waiting to use the processor). Discuss.
Modify your code so that it prints the number of transactions committed and aborted for each thread. How does varying the initial length of the exponential backoff change these statistics? Include a version in which the exponential backoff is disabled. Again, use more threads than processors. Discuss.
What bonus projects, if any, did you implement?

Bonus Work

If you've finished the project, you can try these extensions out for as a bonus:a point of extra credit each:

Nested transactions: Modify your implementation to allow nesting of transactions. This will require modifying the start(), commit() and abort() methods of TxInfo. The updates of inner transactions should not become visible until the outermost transaction commits. Aborting an inner transaction should abort the entire transaction. Include in your submission a test case that demonstrates nested transactions.
Print buffering: Add a print(String) method to CarlSTM. Calling this method allows "safe" printing inside a transaction; the printed Strings are buffered until the transaction commits, then printed to the screen. Include in your submission a test case that demonstrates print buffering.

Submission Details

Submit all your new code files, including any additional Java files you created for testing, and any provided files you modified, to Moodle as a zip file. Make sure your code is properly documented.

As we have discussed, the grader for this course is a student taking the course. As such, please make sure your submission, other than the single file I will describe in a moment, is anonymous. In particular, your name(s) should not appear in your write-up or in any of your Java code.

You should include a single file credits.txt that contains the following information:

Your name and your partner's name (if applicable).
A list of people, books, websites or other sources you used in the course of completing this project. Include anyone or anything except your partner, me, and the course materials. In particular, if you had more than a passing interaction with another student, or if you used code from a website as a starting point, list that information.

To recap, you should submit three files, the first two of which should be anonymized:

A zip file containing any Java code you created or modified and any other relevant files (e.g. scripts for testing).
Your write-up - PDF, doc, txt, etc.
Citations in a text file credits.txt.