Sunday, February 19, 2012

Factories and Builders, Idioms and Patterns

I have to admit to being confused at times between the various meanings of the word "Factory" in object oriented design. For example, recently I was reviewing an Item in Josh Bloch's Effective Java where he said that static factory methods are not an implementation of the Gang of Four Factory pattern. In the next Item he then says that the Builder "pattern" (idiom?) he displays can be used as part of the Gang of Four Abstract Factory pattern. But then the Gang of Four have a Builder pattern as well - how does that relate?

So that's five somewhat related concepts:

  • Static Factory Methods
  • The Builder idiom recommended by Josh Bloch in Effective Java
  • The GoF Builder Pattern
  • The GoF Factory Pattern
  • The GoF Abstract Factory Pattern

In this blog entry I review these five ideas and how they relate.


/* ---[ Static Factory Methods ]--- */

The Static Factory is a simple idiom for encapsulating creational details into a single place accessible to other classes. For example:

// imports from the JDK and Google Guava shown
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.Lists;

// the old-fashioned, hardcoded way
List<Integer> lint1 = new ArrayList<Integer>();
lint1.add(1);
lint1.add(2);
lint1.add(3);

// the first two are from Guava's static 
// factory methods
List<Integer> lint2 = Lists.newArrayList(1, 2, 3);
List<Integer> lint3 = ImmutableList.of(1, 2, 3);
List<Integer> lint4 = Collections.unmodifiableList(lint1);

In the last three statements, we use static factory methods to create Lists. There are at least two benefits to this:

  1. We don't have to repeat the generic parameter on the right hand side
  2. We allow the static factory to determine the best type of concrete List class to create

In fact, in this case the three concrete classes created by the static factory methods all differ:

final PrintStream stdout = System.out;
stdout.println(lint1.getClass()); // java.util.ArrayList
stdout.println(lint2.getClass()); // java.util.ArrayList
stdout.println(lint3.getClass()); // com.google.common.collect.RegularImmutableList
stdout.println(lint4.getClass()); // java.util.Collections$UnmodifiableRandomAccessList

I won't spend much time on why using static factories is Good Design. Josh Bloch's very first Item in Effective Java (which every Java developer has read, right?) spends 6 pages on it. But I will mention one of my favorite benefits (quoting): unlike constructors, they are not required to create a new object each time they are invoked. Caching is good.
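To make the caching point concrete, here is a minimal sketch of a static factory that hands back a shared instance instead of creating a new one each time. The Symbol class and its of method are hypothetical names made up for this illustration. The JDK's own Integer.valueOf behaves the same way for values from -128 to 127.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical value class, invented for this illustration.
final class Symbol {
  // one shared instance per distinct name
  private static final Map<String, Symbol> CACHE = new ConcurrentHashMap<>();

  private final String name;

  // private constructor: callers must go through the static factory
  private Symbol(String name) {
    this.name = name;
  }

  // static factory: creates a Symbol only the first time a name is seen
  static Symbol of(String name) {
    return CACHE.computeIfAbsent(name, Symbol::new);
  }

  String name() {
    return name;
  }

  public static void main(String[] args) {
    // the same instance comes back, so reference equality holds
    System.out.println(Symbol.of("alpha") == Symbol.of("alpha"));   // true
    // Integer.valueOf caches small values (-128 to 127) the same way
    System.out.println(Integer.valueOf(42) == Integer.valueOf(42)); // true
  }
}
```

A public constructor could not do this, since new always allocates a fresh object.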


/* ---[ The Builder Idiom ]--- */

In the Getting Started documentation for DBUnit, there is a curious bit of openness about not knowing how to complete their API design. In talking about doing row ordering with their SortedTable object, they have this code snippet:

SortedTable sortedTable1 = new SortedTable(table1, 
                                           new String[]{"COLUMN1"});
// must be invoked immediately 
// after the constructor
sortedTable1.setUseComparable(true); 

It is followed by this statement in italics:

The reason why the parameter is currently not in the constructor is that the number of constructors needed for SortedTable would increase from 4 to 8 which is a lot. Discussion should go on about this feature on how to implement it the best way in the future.

This is the problem of non-atomic object creation where you have to build up the state of the object piece by piece. When there are a lot of pieces, creating all possible constructors to cover the permutations makes it a nightmare for both the library writer and API user.

So one solution is to have an empty constructor and then set the necessary state via a series of setters, which is the JavaBeans model. But that is very problematic. The object is in an incomplete state until all relevant adders and setters have been called, and if the object escapes before the user constructs everything correctly, then you have a source of bugs. With the JavaBeans model, there is no clearly defined point at which to check whether the object's invariants have been violated before it is sent off to the world.

So you get comments in the API like "you must invoke this setter immediately after the constructor in order for the object to work correctly", as with the DBUnit case.

The Builder idiom is a nice solution to this problem, including cases where you want to create immutable objects. It allows objects to be constructed with any number of variable attributes and settings. Some may be required, others are optional. And you can construct the object in a multi-step, self-documenting fashion, which is one of the nice features of the JavaBeans model, but without the possibility of leaving an object in an inconsistent state.

A typical way to use the builder idiom in Java these days is to combine it with a fluent interface. One starts by getting a reference to a builder, which is often an inner class of the class that the builder creates. Then one calls a series of methods on the builder to tell it how to set up the object and finally invokes a build() method to return the desired object.

For the DBUnit example, the builder idiom could be used this way:

SortedTable sortedTable1 = SortedTable.
  newBuilder(table1, new String[]{"COLUMN1"}).
  setUseComparable(true).
  build();

// alternative way without having required fields in 
// the builder constructor
SortedTable sortedTable1 = SortedTable.newBuilder().
  setTable(table1).
  setColumns(new String[]{"COLUMN1"}).
  setUseComparable(true).
  build();

I prefer to put required fields in the builder constructor, since it emphasizes that they are required, but it is not absolutely necessary, especially if there are a lot of required fields. One of the beautiful things about the builder idiom is that the build method is the perfect place for the Builder to enforce any invariants that must be met in order to create a valid object.

A Builder could be used rather than a variety of different static factories. For example, here is a fictitious example of how Guava could have used a Builder pattern to create a List with various attributes:

// fake API - not runnable code !
List<Integer> lint1 =
  Lists.newBuilder().unmodifiable().
    add(1).add(2).add(3).
    build();

List<Integer> lint2 =
  Lists.newBuilder().immutable().
    add(1).add(2).add(3).
    build();

List<Integer> lint3 =
  Lists.newBuilder().synchronizedColl().
    maxSize(5).add(1).add(2).add(3).
    build();

But for collections, it is more typical to use the static factory idiom, as we saw in the first section.

Where the builder idiom particularly shines is when you are constructing objects with a variable and potentially complex set of attributes. A good example: Guava uses the builder idiom to create a Cache object:

Cache<Key, Graph> graphs = CacheBuilder.newBuilder()
     .concurrencyLevel(4)
     .weakKeys()
     .maximumSize(10000)
     .expireAfterWrite(10, TimeUnit.MINUTES)
     .build(
         new CacheLoader<Key, Graph>() {
           public Graph load(Key key) throws AnyException {
             return createExpensiveGraph(key);
           }
         });

Another aspect of the builder idiom that is particularly satisfying is that it is a great way to create immutable objects that have more than a couple of optional attributes. For example, suppose you are modeling an inventory item that only directly requires custodian and location, but can take many different optional attributes, such as status, quantity, container-id, RFID-tag, etc. If you want to create immutable inventory objects, the builder idiom is a pleasant way to manage this:

// custodian (thornydev) and location are required, 
// so go in the builder constructor
InventoryItem item = InventoryItem.
  newBuilder("thornydev", "location 123").
  status("Available").
  quantity(35).quantityUnit(InventoryItem.KILOGRAMS).
  build();

In the build method, the Builder can check that key business rules have been met. For example, quantity and quantityUnit must either both be set or neither set, otherwise build will throw an IllegalStateException.
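Here is a rough sketch of what the InventoryItem class behind that snippet might look like. Everything about its internals is an assumption on my part: the field types, the accessor names and the value of the KILOGRAMS constant are made up for illustration. The point is only to show the immutable product, the nested Builder and the invariant check in build.

```java
// Hypothetical sketch; field types, accessors and KILOGRAMS are assumptions.
final class InventoryItem {
  static final String KILOGRAMS = "kg";

  private final String custodian;     // required
  private final String location;      // required
  private final String status;        // optional, may be null
  private final Integer quantity;     // optional, may be null
  private final String quantityUnit;  // optional, may be null

  // only the Builder can create an InventoryItem
  private InventoryItem(Builder b) {
    this.custodian = b.custodian;
    this.location = b.location;
    this.status = b.status;
    this.quantity = b.quantity;
    this.quantityUnit = b.quantityUnit;
  }

  static Builder newBuilder(String custodian, String location) {
    return new Builder(custodian, location);
  }

  String custodian() { return custodian; }
  String location() { return location; }
  String status() { return status; }
  Integer quantity() { return quantity; }
  String quantityUnit() { return quantityUnit; }

  static final class Builder {
    private final String custodian;
    private final String location;
    private String status;
    private Integer quantity;
    private String quantityUnit;

    private Builder(String custodian, String location) {
      this.custodian = custodian;
      this.location = location;
    }

    // each setter returns the builder to allow fluent chaining
    Builder status(String status) { this.status = status; return this; }
    Builder quantity(int quantity) { this.quantity = quantity; return this; }
    Builder quantityUnit(String unit) { this.quantityUnit = unit; return this; }

    // the single place where invariants are checked
    InventoryItem build() {
      if (custodian == null || location == null) {
        throw new IllegalStateException("custodian and location are required");
      }
      if ((quantity == null) != (quantityUnit == null)) {
        throw new IllegalStateException(
            "quantity and quantityUnit must both be set or both be unset");
      }
      return new InventoryItem(this);
    }
  }

  public static void main(String[] args) {
    InventoryItem item = InventoryItem.newBuilder("thornydev", "location 123")
        .status("Available")
        .quantity(35).quantityUnit(InventoryItem.KILOGRAMS)
        .build();
    System.out.println(item.custodian() + ": " + item.quantity() + " " + item.quantityUnit());

    try {
      // quantity without a unit violates the business rule
      InventoryItem.newBuilder("thornydev", "location 123").quantity(35).build();
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Because all fields are final and there are no setters on InventoryItem itself, an instance cannot be observed in a half-built state.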


/* ---[ The Builder Pattern ]--- */

OK, now that we've reviewed the idioms, let's analyze the formal GoF patterns and see how they compare.

The Builder Pattern is actually quite similar to the builder idiom. In the GoF version, the Builder is an interface. A concrete version is created to create a product. In the idiom, the Builder doesn't have an interface, since it is tightly coupled to creating a particular product (and in Java is often implemented as an inner class of that object).

The advantage of using an abstraction, in this case an interface, is, as usual, to be able to transparently swap out a different Builder implementation, or to allow multiple different types of entities to use a common interface. An example of the latter is found in the JDK's java.lang.Appendable.

Appendable is an interface that has three flavors of the same method, append. BufferedWriter, PrintStream and StringBuilder, to name a few, implement it. StringBuilder is a nice, though simple, example of the Builder pattern - you build up the string bit by bit, can do some morphs, reversals, substrings, or other sorts of changes and then produce an immutable String, threadsafe and ready for production use. In this case toString is the "build" method:

String s = "devil";
StringBuilder sb = new StringBuilder();
String result = sb.append(s).append("ish").  // now is "devilish"
  replace(5, 7, " e").                       // now is "devil eh"
  reverse().toString();                      // => "he lived"

The GoF book illustrates a more sophisticated use of the Builder pattern. They create an RTFParser that will convert RTF text to other formats. I've updated their example a bit and redrawn it:

This kind of looks like the Abstract Factory pattern (see below). So why is this a Builder? Because it will be used to construct one document (say in HTML format) by calling its builder methods multiple times in some arbitrary order in a multi-step process to create that document. For example:

HTMLConverter c = new HTMLConverter();
Document d = c.
  convertBold(string1).
  convertParagraph(line1).
  convertHeader1(string2).
  convertParagraph(line2).
  // ... etc.
  build();

Note: what I'm calling the builder idiom in this article has also been referred to as the "revised builder pattern". As noted here, the two variants use the same approach with different emphasis. The builder idiom (revised builder pattern) usually tightly couples a builder to a specific concrete class, with the intent of simplifying the construction of an object with a complex set of attributes. The GoF Builder pattern is more about providing an abstraction to build various types of entities, akin to Abstract Factory in that sense.
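To show the difference in emphasis, here is a small sketch of the GoF flavor, where the code that drives the construction only knows about a builder interface. The names TextConverter, HtmlConverter, PlainTextConverter and render are my own simplified stand-ins, not the GoF book's code, and the "document" produced is just a String to keep things short.

```java
// Hypothetical, simplified names; the product here is just a String.
class GofBuilderDemo {
  // the "director": drives the build steps against the abstraction only
  static String render(TextConverter converter) {
    return converter
        .convertHeader1("Title")
        .convertParagraph("First line.")
        .build();
  }

  public static void main(String[] args) {
    // same construction steps, different concrete builders
    System.out.println(render(new HtmlConverter()));
    System.out.println(render(new PlainTextConverter()));
  }
}

// the Builder interface
interface TextConverter {
  TextConverter convertHeader1(String text);
  TextConverter convertParagraph(String text);
  String build();
}

// concrete builder producing HTML markup
class HtmlConverter implements TextConverter {
  private final StringBuilder out = new StringBuilder();

  public TextConverter convertHeader1(String text) {
    out.append("<h1>").append(text).append("</h1>");
    return this;
  }

  public TextConverter convertParagraph(String text) {
    out.append("<p>").append(text).append("</p>");
    return this;
  }

  public String build() {
    return out.toString();
  }
}

// concrete builder producing plain text
class PlainTextConverter implements TextConverter {
  private final StringBuilder out = new StringBuilder();

  public TextConverter convertHeader1(String text) {
    out.append(text.toUpperCase()).append('\n');
    return this;
  }

  public TextConverter convertParagraph(String text) {
    out.append(text).append('\n');
    return this;
  }

  public String build() {
    return out.toString();
  }
}
```

In the builder idiom there would be no TextConverter interface at all, just one builder welded to one product.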


/* ---[ The Factory Pattern ]--- */

The Factory Pattern (formally named Factory Method in the GoF book) uses classic Object Oriented (OO) reuse through inheritance, along with all the shortcomings and foibles of inheritance-based design.

The essence of the Factory Pattern is to delegate to a subclass the creation of an entity that can vary based on application needs. It typically uses either an abstract or concrete class, not a pure interface, to provide a default implementation and handle logic that will be common to all subclasses.

A trivial example from the JDK is Object#toString. Object, a concrete class, provides a basic toString method that subclasses can accept as-is or tailor to their needs.

A slightly less trivial example is java.lang.Number#intValue (and floatValue etc.). These methods are abstract and have to be implemented in a way appropriate to the subclasses. Number provides an implementation of a few common methods and the rest are delegated to subclasses, which include Integer, Double, BigInteger and AtomicLong.

The above examples are not really "factories" as I normally think of them, though one could argue they are factories of Strings and primitives (but it's a weak argument, which is why I think of Factory as just OO inheritance-based design).

Here is the GoF UML class diagram of the Factory pattern.

In many cases, the Factory pattern starts to blur into the Template pattern for me. Both use OO inheritance-based reuse of methods defined in the parent class. As a side note, the Template Pattern may be a more valid and robust version of inheritance than is often practiced.

The Template Pattern can be implemented via the Factory Pattern, as the GoF book states: "Factory methods are usually called within Template Methods" (p. 116).

The excellent Head First Design Patterns book does exactly this in their implementation of a Factory method (see my comments in the code below):

package headfirst.factory.pizzaaf;

public abstract class PizzaStore {

  // the abstract method to be implemented in subclasses
  // such as NYPizzaStore or ChicagoPizzaStore
  protected abstract Pizza createPizza(String item);

  // this is the (unacknowledged) Template pattern here
  // delegating the factory method createPizza to a
  // concrete subclass
  public Pizza orderPizza(String type) {
    Pizza pizza = createPizza(type);
    System.out.println("Making a " + pizza.getName());
    pizza.prepare();
    pizza.bake();
    pizza.cut();
    pizza.box();
    return pizza;
  }
}
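To round out the picture, here is a sketch of how concrete subclasses might plug into that abstract class. NYPizzaStore and ChicagoPizzaStore are the subclass names mentioned in the comments above, but their bodies here are my own guesses, and the Pizza class is a bare-bones stand-in (the preparation steps are left out), not the book's code.

```java
class FactoryMethodDemo {
  public static void main(String[] args) {
    // the caller picks the subclass; orderPizza itself never changes
    PizzaStore store = new NYPizzaStore();
    System.out.println(store.orderPizza("cheese").getName());

    store = new ChicagoPizzaStore();
    System.out.println(store.orderPizza("cheese").getName());
  }
}

// Minimal stand-in for the book's Pizza class; contents are assumptions.
class Pizza {
  private final String name;

  Pizza(String name) {
    this.name = name;
  }

  String getName() {
    return name;
  }
}

abstract class PizzaStore {
  // the factory method: subclasses decide which Pizza to create
  protected abstract Pizza createPizza(String type);

  // the template method: fixed workflow around the factory method
  // (prepare/bake/cut/box steps omitted in this sketch)
  public Pizza orderPizza(String type) {
    return createPizza(type);
  }
}

class NYPizzaStore extends PizzaStore {
  @Override
  protected Pizza createPizza(String type) {
    return new Pizza("NY style " + type + " pizza");
  }
}

class ChicagoPizzaStore extends PizzaStore {
  @Override
  protected Pizza createPizza(String type) {
    return new Pizza("Chicago style " + type + " pizza");
  }
}
```

Note that the variation happens purely through subclassing: to get a new kind of pizza you write a new PizzaStore subclass.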


/* ---[ The Abstract Factory Pattern ]--- */

The Abstract Factory Pattern is intended for situations where you need to create families of related "products". For example, a widget library needs to create multiple related widgets - buttons, labels, frames, pick lists, text fields, scrollbars, etc. If you want to offer different "skins" or look-and-feels (is that an outdated term now?), you could use the Abstract Factory Pattern.

The canonical example provided by the Gang of Four is a WidgetFactory interface that can be implemented by any number of concrete Factories to produce all the various widgets with a defined Look-and-Feel.
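A stripped-down sketch of that idea follows. It is not the GoF code: I've cut the widget family down to two products, rendering is faked with strings, and MetalWidgetFactory and the render and buildToolbar methods are names I've invented for the example.

```java
class WidgetDemo {
  // client code: works with whatever family the factory produces
  static String buildToolbar(WidgetFactory factory) {
    Button button = factory.createButton();
    ScrollBar scrollBar = factory.createScrollBar();
    return button.render() + " " + scrollBar.render();
  }

  public static void main(String[] args) {
    System.out.println(buildToolbar(new MotifWidgetFactory()));
    System.out.println(buildToolbar(new MetalWidgetFactory()));
  }
}

// product interfaces: each look-and-feel supplies its own implementations
interface Button {
  String render();
}

interface ScrollBar {
  String render();
}

// the Abstract Factory: one creation method per related product
interface WidgetFactory {
  Button createButton();
  ScrollBar createScrollBar();
}

// concrete factory for one look-and-feel
class MotifWidgetFactory implements WidgetFactory {
  public Button createButton() {
    return () -> "[Motif button]";
  }

  public ScrollBar createScrollBar() {
    return () -> "[Motif scrollbar]";
  }
}

// concrete factory for another (hypothetical) look-and-feel
class MetalWidgetFactory implements WidgetFactory {
  public Button createButton() {
    return () -> "[Metal button]";
  }

  public ScrollBar createScrollBar() {
    return () -> "[Metal scrollbar]";
  }
}
```

Swapping the whole look-and-feel is a one-line change at the point where the factory is chosen, and the widgets that come out are guaranteed to belong to the same family.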

From this quick look we can immediately see two differences between Factory and Abstract Factory:

  1. Abstract Factory uses a pure interface, while Factory has concrete methods and may or may not have any abstract methods.
  2. Abstract Factory is useful when you need to create multiple different categories of things (like widgets) that are related (say by look-and-feel). Factory produces one thing (a Pizza).

So you could think of an Abstract Factory as creating a factory of little factories. And here's where we can start to tie together some of the threads of this investigation. What patterns or idioms can the "little factories" of the Abstract Factory use?

Well, they can use the static factory idiom, if the thing they are creating is easily bound to one method call with no or few parameters. Or they could use a Builder (either form) to construct products that need to be constructed in a multi-step fashion or with lots of variable attributes. Or they could use the Factory pattern to be able to swap different concrete implementations as needed.

Abstract Factory Example from the JDK

In JDBC, one obtains a connection to a datastore by calling DriverManager.getConnection. This is a static factory method that returns an implementation of java.sql.Connection. Connection is a pure interface that gets implemented by JDBC library implementers, so there is one for each flavor of database. So Connection, in this case, is an Abstract Factory interface and the specific implementations by database vendors are the concrete classes. For example, with PostgreSQL's JDBC driver, the concrete Connection class is org.postgresql.jdbc4.Jdbc4Connection (if you are using the JDBC4 version).

So what "products" are created by the "little factories" in the Connection class? Many: Statement, PreparedStatement, CallableStatement, Savepoint, Blob, Clob and a few others, each one, of course, differing from those in other JDBC implementations.


/* ---[ Summary ]--- */

So, to finish, a few summarizing points:

  • The static factory idiom is very common in Java. Using it precludes the possibility of delegating responsibility to a subclass for creating specific types of objects. Therefore, use Static Factory when you are sure that you only need this one implementation of the factory.
  • The GoF Builder Pattern and "revised builder" (aka builder idiom) are basically the same, except in whether they are tightly coupled to the object they are creating. In either case, use a builder when you want to be able to flexibly create objects that need multi-step set up, particularly when you need to handle multiple optional attributes.
  • The Factory Pattern uses inheritance to allow different implementations of a specific method intended to be overridden by subclasses.
  • The Abstract Factory Pattern creates an interface whose implementations use composition to create a series of little factories to produce related items or products. Those little factories can use any of the previous patterns to do their object creation.
