Tuesday, 16 August 2011

How Big Should a Function Be?

The other day I was talking to a colleague; we were comparing horror stories about the largest functions that we’d ever seen and I quoted a 14,500 line absolute monster that I’d seem a long time ago that was part of a stock settlement system written in C++ and, oh how we laughed. This got me thinking that you never really see many opinions, recommendations or rules of thumb for the optimum length of a function, so when it comes down to it I like to quote Bob Martin’s two rules covering the length of a function from his book ‘Clean Code - A Handbook of Agile Software Craftsmanship’: "The first rule is that they should be small. The second rule is that they should be smaller still". So how small is small? Bob recommends that they should “hardly ever be 20 lines long”. You may think that writing your functions so that they hardly ever reach 20 lines long is a tall order, but actually, it isn’t. What you should do is apply Bob’s next rule on page 35 of his book:

FUNCTIONS SHOULD DO ONE THING. THEY SHOULD DO IT WELL. THEY SHOULD DO IT ONLY.

Where as I could criticise the English of the final sentence, I cannot disagree with its sentiment, and the book states this rule in capital letters underlining how important this is.

You may now argue that you’ll end up with a whole bunch of new functions in your class making your class longer and more difficult to read. It’s true, your class does get longer, so you need to apply Bob’s next rule known as 'The Stepdown Rule'. The idea here is that you write you code so that is reads from top to bottom. Any functions that your function calls are below your function in the file and are in the order in which they are called. This way your code becomes a narrative, with every function followed by all those at the next level of abstraction. To me, code written this way becomes almost like reading a newspaper column, with a headline paragraph at the top followed by more and more detail.

To help illustrate what I’m talking about, I’m going to go back to a blog I wrote in February 2011, which demonstrated how to write a generic toString() method helper class that uses reflection to create a string representation of a bean. Below is the toString() method from that blog:

public static String toString(Object obj) {

        StringBuilder sb = new StringBuilder("[Bean : ");
        sb.append(obj.getClass().getSimpleName());

        sb.append("  {");

        Method[] ms = obj.getClass().getMethods();

        for (Method method : ms) {
            String name = method.getName();

            if ((name.startsWith("is")) ||
                ((name.startsWith("get") &&
                (!name.equals("getClass"))))) {
                try {
                    sb.append(name);

                    Object data = method.invoke(obj, (Object[]) null);
                    if (data != null) {
                        sb.append("() = \"");
                        sb.append(data);
                        sb.append("\"");
                    } else {
                        sb.append(" = null");
                    }

                    sb.append(" | ");

                } catch (Exception e) {
                    String str = "Problem calling getter: " + name + " on object: "
                            + obj.getClass().getName() + "\nMessage: " + e.getMessage();

                    LogFactory.getLog(obj.getClass()).debug(str);
                    throw new SystemException(str, e);
                }
            }
        }

        sb.append("}]\n");
        return sb.toString();
    }

I’ve never really liked this function as it always seemed a bit clunky and that’s because it breaks the DO ONE THING rule given above. If you look at the code you’ll see that it:
  1. creates objects
  2. validates method names
  3. loop and appends all method information
  4. adds in state data
  5. checks for errors

...so, I spent a bit of time doing some refactoring. Note that the original method had a bunch of JUnit tests that it had to pass, and the brilliant thing about this was that I could use these tests to give me confidence that I’d messed nothing up. The result of this refactor was:

  public static String toString(Object obj) {

   
StringBuilder sb = new StringBuilder("[Bean : ");
    appendStringHeader
(obj, sb);
    appendAllMethodInfo
(obj, sb);
    sb.append
("}]\n");
   
return sb.toString();
 
}

 
private static void appendStringHeader(Object obj, StringBuilder sb) {

   
sb.append(obj.getClass().getSimpleName());
    sb.append
("  {");
 
}

 
private static void appendAllMethodInfo(Object obj, StringBuilder sb) {

   
Method[] ms = obj.getClass().getMethods();

   
for (Method method : ms) {
     
if (isValidMethodName(method.getName())) {
       
appendMethodInfo(obj, method, sb);
     
}
    }
  }

 
private static boolean isValidMethodName(String name) {

   
return name.startsWith("is") || (name.startsWith("get") && !name.equals("getClass"));
 
}

 
private static void appendMethodInfo(Object obj, Method method, StringBuilder sb) {

   
try {
     
sb.append(method.getName());
      Object attributeData = method.invoke
(obj, (Object[]) null);
      handleAttributeData
(attributeData, sb);
      sb.append
(" | ");
   
} catch (Exception e) {
     
handleException(obj, method.getName(), e);
   
}
  }

 
private static void handleAttributeData(Object attributeData, StringBuilder sb) {

   
if (isNotNull(attributeData)) {
     
appendMethodData(sb, attributeData);
   
} else {
     
sb.append(" = null");
   
}
  }

 
private static boolean isNotNull(Object obj) {

   
return obj != null;
 
}

 
private static void appendMethodData(StringBuilder sb, Object data) {

   
sb.append("() = \"");
    sb.append
(data);
    sb.append
("\"");
 
}

 
private static void handleException(Object obj, String name, Exception e) {

   
String str = "Problem calling getter: " + name + " on object: "
       
+ obj.getClass().getName() + "\nMessage: " + e.getMessage();

    LogFactory.getLog
(obj.getClass()).debug(str);
   
throw new SystemException(str, e);
 
}

You can see that the overall number of lines has increased, but to me, so to has the clarity. Reading from the top, the toString(...) method is now a headline method; it creates the StringBuilder, calls a function to add some header information, calls a function to add all the bean attribute info and finishes off by adding a small footer. Reading down to the next level of abstraction, the appendStringHeader(..) appends the class name and a format string, whist the appendAllMethodInfo(...) loops through each of the beans methods calling isValidMethodName(...)to validate the method name and, if this validation succeed, calls appendMethodInfo(...) to append the method information.

Hopefully, I’ve got this right and each method does one thing, and one thing only, but please let me know if you come up with any improvements. The new code above comes out at an average of 5.5. lines per function, whilst the original function was, for me, a whopping 27 lines long (okay, 27 lines isn't that long, but I am only demonstrating a technique).

Finally, you may think that this blog is just a plug for Bob’s book, and you’re right, it is, but I have to say that I don’t know the guy and have never met him - I just like the book.

No comments: