Factory functions in Go

Static factory methods are extremely powerful in Java, as noted in Item 1 of Effective Java. Here's how factory functions are effective in Go.

There are a few ways to create a value of a struct.

// Allocating memory with new.
buf := new(bytes.Buffer) // type *bytes.Buffer

// Simply declaring the variable.
var buf bytes.Buffer

// Using a struct initializer.
buf := bytes.Buffer{}

Structs that cannot be used in their zero value, require an initialization constructor.

f := os.File{fd, name, nil, 0}

There is another way for packages to let clients allocate and initialize structs. A package can provide an factory function, which is an exported function that returns an initialized struct using one of the options seen above. Here's an example from the strings package. This function returns a newly initialized strings.Reader struct.

func NewReader(s string) *Reader {
  return &Reader{s, 0, -1}
}

One advantage of factory functions is that, unlike initializers, they provide initialization logic. Packages can use this for a variety purposes. For example, http.NewRequest checks that the given URL string is a valid URL. Packages can also use this to initialize internal dependencies. The http.NewServeMux function transparently initializes internal fields.

package http

func NewServeMux() *ServeMux {
  return &ServeMux{m: make(map[string]muxEntry)}
}

package foo

mux := http.NewServeMux()

Without a factory function, both the field m and type muxEntry would have to be exported in it's public API, additionally forcing clients to initialize them.

package foo

mux := &http.ServeMux{M: make(map[string]http.MuxEntry)}

A second advantage of factory functions is that, unlike initializers, they are not required to create a new value each time they're invoked. This allows packages to return preinitialized values, or cache values as they're initialized and dispense them repeatedly to avoid unnecessary allocations. These implementations can all be provided with a single API using a factory function, and gives packages better control over the behavior and performance characteristics as needed.

A third advantage of factory functions is that, unlike initializers, they can return a value of any subtype of their return type. This gives packages greater flexibility in choosing the type of the returned value. One application of this flexibility is that an API can return interface with making their actual type public. For example, the errors.New function returns an error interface and not *errorString. This requires callers to reference the return value by its interface rather than it's concrete type. In a future version, Go could completely remove the errorString type, and replace it with another implementation of error without breaking API compatibility. Consumers of the API would be none the wiser.

The main disadvantage of factory functions is that, unlike initializers, they are not readily distinguishable from other functions. This makes it difficult to figure out how to use a struct that must be initialized via a factory function. The best workaround for now is to rally around established conventions. Factory functions are typically named New (as in errors.New) or New{type} (as in bufio.NewReader). GoDoc will also surface factory functions under their struct types.

A second disadvantage of factory functions is that, unlike initializers, they don't support named arguments. This makes consumer code harder to read and figure out what the values represent.

package foo

// Using a factory function.
req, _ := http.NewRequest("GET", "http://f2prateek.com", nil)

// Using a struct initializer.
req := &http.Request{
  Method: "GET",
  URL: MustParse("http://f2prateek.com"),
  Body: nil,
}

Arguably, the former can be easier to comprehend since the http.Request type contains a total of 19 fields. This makes it confusing for new consumers, who have no idea which fields are required to correctly initialize a request. Using a factory function allows the http package to highlight the fields required for proper initialization. For structs that do need lots of configuration options, using a dedicated config struct is also a viable option.