Sunday, 28 October 2012

Investigating Deadlocks – Part 4: Fixing the Code

In the last in this short series of blogs in which I’ve been talking about analysing deadlocks, I’m going to fix my BadTransferOperation code. If you've seen the other blogs in this series, you'll know that in order to get to this point I’ve created the demo code that deadlocks, shown how to get hold of a thread dump and then analysed the thread dump, figuring out where and how a deadlock was occurring.

In order to save space, the discussion below refers to both the Account and DeadlockDemo classes from part 1 of this series, which contains full code listings.

Textbook descriptions of deadlocks usually go something like this: “Thread A will acquire a lock on object 1 and wait for a lock on object 2, while thread B acquires a lock on object 2 whilst waiting for a lock on object 1”. The pile-up shown in my previous blog, and highlighted below, is a real-world deadlock where other threads, locks and objects get in the way of the straightforward, simple, theoretical deadlock situation.

Found one Java-level deadlock:
=============================
"Thread-21":
  waiting to lock monitor 7f97118bd560 (object 7f3366f58, a threads.deadlock.Account),
  which is held by "Thread-20"
"Thread-20":
  waiting to lock monitor 7f97118bc108 (object 7f3366e98, a threads.deadlock.Account),
  which is held by "Thread-4"
"Thread-4":
  waiting to lock monitor 7f9711834360 (object 7f3366e80, a threads.deadlock.Account),
  which is held by "Thread-7"
"Thread-7":
  waiting to lock monitor 7f97118b9708 (object 7f3366eb0, a threads.deadlock.Account),
  which is held by "Thread-11"
"Thread-11":
  waiting to lock monitor 7f97118bd560 (object 7f3366f58, a threads.deadlock.Account),
  which is held by "Thread-20"


If you relate the text and image above back to the following code, you can see that Thread-20 has locked its fromAccount object (f58) and is waiting to lock its toAccount object (e98)



    private void transfer(Account fromAccount, Account toAccount, int transferAmount) throws OverdrawnException {

     
synchronized (fromAccount) {
       
synchronized (toAccount) {
         
fromAccount.withdraw(transferAmount);
          toAccount.deposit
(transferAmount);
       
}
      }
    }

Unfortunately, because of timing issues, Thread-20 cannot get a lock on object e98 because it’s waiting for Thread-4 to release its lock on that object. Thread-4 cannot release the lock because it’s waiting for Thread-7, Thread-7 is waiting for Thread-11 and Thread-11 is waiting for Thread-20 to release its lock on object f58. This real-world deadlock is just a more complicated version of the textbook description.

The problem with this code is that, from the snippet below, you can see that I’m randomly choosing two Account objects from the Accounts array as the fromAccount and the toAccount and locking them. As the fromAccount and toAccount can reference any object from the accounts array, it means that they're being locked in a random order.

        Account toAccount = accounts.get(rnd.nextInt(NUM_ACCOUNTS));
        Account fromAccount = accounts.get
(rnd.nextInt(NUM_ACCOUNTS));

Therefore, the fix is to impose order on how the Account object are locked and any order will do, so long as it's consistent.

    private void transfer(Account fromAccount, Account toAccount, int transferAmount) throws OverdrawnException {

     
if (fromAccount.getNumber() > toAccount.getNumber()) {

       
synchronized (fromAccount) {
         
synchronized (toAccount) {
           
fromAccount.withdraw(transferAmount);
            toAccount.deposit
(transferAmount);
         
}
        }
      }
else {

       
synchronized (toAccount) {
         
synchronized (fromAccount) {
           
fromAccount.withdraw(transferAmount);
            toAccount.deposit
(transferAmount);
         
}
        }
      }
    }

The code above shows the fix. In this code I’m using the account number to ensure that I’m locking the Account object with the highest account number first, so that the deadlock situation above never arises.

The code below is the complete listing for the fix:

public class AvoidsDeadlockDemo {

 
private static final int NUM_ACCOUNTS = 10;
 
private static final int NUM_THREADS = 20;
 
private static final int NUM_ITERATIONS = 100000;
 
private static final int MAX_COLUMNS = 60;

 
static final Random rnd = new Random();

  List<Account> accounts =
new ArrayList<Account>();

 
public static void main(String args[]) {

   
AvoidsDeadlockDemo demo = new AvoidsDeadlockDemo();
    demo.setUp
();
    demo.run
();
 
}

 
void setUp() {

   
for (int i = 0; i < NUM_ACCOUNTS; i++) {
     
Account account = new Account(i, rnd.nextInt(1000));
      accounts.add
(account);
   
}
  }

 
void run() {

   
for (int i = 0; i < NUM_THREADS; i++) {
     
new BadTransferOperation(i).start();
   
}
  }

 
class BadTransferOperation extends Thread {

   
int threadNum;

    BadTransferOperation
(int threadNum) {
     
this.threadNum = threadNum;
   
}

   
@Override
   
public void run() {

     
for (int i = 0; i < NUM_ITERATIONS; i++) {

       
Account toAccount = accounts.get(rnd.nextInt(NUM_ACCOUNTS));
        Account fromAccount = accounts.get
(rnd.nextInt(NUM_ACCOUNTS));
       
int amount = rnd.nextInt(1000);

       
if (!toAccount.equals(fromAccount)) {
         
try {
           
transfer(fromAccount, toAccount, amount);
            System.out.print
(".");
         
} catch (OverdrawnException e) {
           
System.out.print("-");
         
}

         
printNewLine(i);
       
}
      }
     
System.out.println("Thread Complete: " + threadNum);
   
}

   
private void printNewLine(int columnNumber) {

     
if (columnNumber % MAX_COLUMNS == 0) {
       
System.out.print("\n");
     
}
    }

   
/**
     * This is the crucial point here. The idea is that to avoid deadlock you need to ensure that threads can't try
     * to lock the same two accounts in the same order
     */
   
private void transfer(Account fromAccount, Account toAccount, int transferAmount) throws OverdrawnException {

     
if (fromAccount.getNumber() > toAccount.getNumber()) {

       
synchronized (fromAccount) {
         
synchronized (toAccount) {
           
fromAccount.withdraw(transferAmount);
            toAccount.deposit
(transferAmount);
         
}
        }
      }
else {

       
synchronized (toAccount) {
         
synchronized (fromAccount) {
           
fromAccount.withdraw(transferAmount);
            toAccount.deposit
(transferAmount);
         
}
        }
      }
    }
  }
}

In my sample code, a deadlock occurs because of a timing issue and the nested synchronized keywords in my BadTransferOperation class. In this code, the synchronized keywords are on adjacent lines; however, as a final point, it’s worth noting that it doesn't matter where in your code the synchronized keywords are (they don't have to be adjacent). So long as you're locking two (or more) different monitor objects with the same thread, then ordering matters and deadlocks happen.


For more information see the other blogs in this series.

All source code for this an other blogs in the series are available on Github at git://github.com/roghughe/captaindebug.git

2 comments:

Mario Guerrero said...

Thank,

Very good post.

Danilo said...

thanks a lot, very helpful