Swift Associated Types, cont.

Sure, let's go down this rabbit hole again

Russ Bishop

April 28, 2016

Update: I originally hit publish too soon; this is the updated article.

I don't feel like I fully covered one aspect of protocols with associated types: why can they be such a pain to work with?

Why Associated Types

This rabbit hole just keeps on going; see my third article in the Associated Types series for a better explanation for why Swift uses associated types.

I removed the old explanation because it was more of a statement on compiler optimization than an explanation for why associated types are part of Swift, and this topic is difficult enough without misleading explanations!

The Problem With Associated Types

The main reason people even bother to ask this question is that the compiler won't materialize an existential for a protocol with associated types, leading to the dreaded "Protocol X can only be used as a constraint because it has Self or associated type requirements".

Existential Digression

Your next question might be "what the heck is an existential?". Good question.

A value defined as var x: protocol<SomeProtocol, OtherProtocol> is existential because we have no idea at compile time what the real true type of x is. All we (and the compiler) know is it implements the protocol. Under the covers Swift can represent the instance itself inline if the size of the type is small enough, otherwise it boxes it on the heap and stores a pointer. More importantly it also stores a pointer through which it can find all the specific protocol conformances required to satisfy SomeProtocol and OtherProtocol. This indirection is critical because the type underlying the protocol can change (multiple types can adopt the protocol after all). Your local var x: SomeProtocol can be re-assigned so the compiler can't even necessarily cache the protocol conformance pointer inside a local function.

If you ever wondered what the deal is with typealias Any = protocol<>, that's because Swift's ultimate root type is an existential with an empty list of required protocols, so every type satisfies it.

So what is an existential? It's a value that has a protocol type. It means we know nothing except the underlying type adopts the protocol.
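To make that concrete, here is a minimal sketch in current Swift syntax. The Describable protocol and the Point and Label types are made up for illustration; the point is only that the static type of thing never changes while the type stored inside it does.

```swift
// Hypothetical protocol and conforming types, for illustration only.
protocol Describable {
    func describe() -> String
}

struct Point: Describable {
    var x = 0, y = 0
    func describe() -> String { return "Point(\(x), \(y))" }
}

final class Label: Describable {
    let text: String
    init(text: String) { self.text = text }
    func describe() -> String { return "Label(\(text))" }
}

// `thing` has protocol type: statically we only know "something Describable".
var thing: Describable = Point(x: 1, y: 2)
let first = thing.describe()   // dispatched through Point's conformance

// The underlying type can change on reassignment, so the call below has to
// find the conformance again instead of reusing the previous one.
thing = Label(text: "hello")
let second = thing.describe()  // dispatched through Label's conformance

print(first, second, type(of: thing))
```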

Generic Specialization

Generic types with their fancy generic type parameters are different. Generic functions can be specialized because the compiler knows what some type T is for any given instantiation. There is no need to store and pass conformance pointers around, nor take care to follow any indirection. The compiler can emit direct field offsets or directly jump to specific functions.

For infrequently used types the compiler may choose to use a runtime generic version rather than a specialization to limit the size of the generated binary; this is the equivalent of the C++ template problem where every single combination of type parameters causes the template to instantiate a new compiler-named type and all those types end up bloating the size of the binary.

To put it another way, a ContiguousArray<Int> has a statically known layout. In theory the compiler can figure out that myArray[15] is really just *(myArrayBasePtr + (15 * sizeof(Int))). Dereferencing a pointer is faster than figuring out that IndexType = Int, locating the RandomAccessIndexType.advancedBy() conformance for Int, setting up the stack frame, calling the implementation, then returning.
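Here is a small sketch of the two flavors side by side, in current Swift syntax. Scorable and Card are hypothetical names, and whether the optimizer actually specializes the generic version depends on build settings; the code only shows where the type is statically known and where it isn't.

```swift
// Hypothetical protocol and type, for illustration only.
protocol Scorable {
    var score: Int { get }
}

struct Card: Scorable {
    let score: Int
}

// Generic: T is known at each call site, so the compiler is free to emit a
// specialized copy for T == Card that reads `score` directly.
func totalGeneric<T: Scorable>(_ items: [T]) -> Int {
    return items.reduce(0) { $0 + $1.score }
}

// Existential: each element may be a different underlying type, so every
// access to `score` goes through that element's conformance.
func totalExistential(_ items: [Scorable]) -> Int {
    return items.reduce(0) { $0 + $1.score }
}

let cards = [Card(score: 3), Card(score: 4)]
let genericTotal = totalGeneric(cards)          // T is inferred as Card
let existentialTotal = totalExistential(cards)  // [Card] is converted to [Scorable]
print(genericTotal, existentialTotal)
```

Both calls compute the same sum; the difference is in how much the compiler knows when it generates the code for each one.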

What happens when we don't know the type statically?

func fancyFunction<T: protocol<SomeProtocol, OtherProtocol>>(thing: T) { }

if let value = value as? protocol<SomeProtocol, OtherProtocol> {
    fancyFunction(value)
}

// Compile Error: cannot invoke 'fancyFunction'
// with an argument list of type '(protocol< SomeProtocol, OtherProtocol >)'

If you've ever run into this situation you're at the business end of generic specialization and existentials. The compiler has no idea what the underlying type of value actually is so it can't emit a call to the correct generic specialization.

Even though we clearly satisfy the constraints there is no guarantee that we can call the correct version of the generic function. Remember that the compiler can make all kinds of assumptions about generic specializations; it knows the types so it can insert all kinds of shortcuts to reach inside the object and grab whatever it wants. If we call the wrong specialization things are not going to turn out well for our program.
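One workaround, which is my own illustration and not something the compiler error suggests, is to route the call through a protocol extension. Inside the extension Self is the concrete conforming type, so the generic function has a real type to bind T to. The sketch below uses current Swift syntax and a single protocol to keep it short; Widget and callFancyFunction are hypothetical names.

```swift
// Hypothetical protocol, type, and helper names, for illustration only.
protocol SomeProtocol {
    var name: String { get }
}

struct Widget: SomeProtocol {
    let name = "widget"
}

func fancyFunction<T: SomeProtocol>(_ thing: T) -> String {
    return "fancy \(thing.name) as \(T.self)"
}

extension SomeProtocol {
    // Here `self` has the concrete type `Self`, not the protocol type,
    // so the generic call below is well-formed.
    func callFancyFunction() -> String {
        return fancyFunction(self)
    }
}

let value: Any = Widget()
var result = "not a SomeProtocol"
if let value = value as? SomeProtocol {
    // Calling the extension method on the existential hands the stored
    // value to the extension with its real type.
    result = value.callFancyFunction()
}
print(result)
```

This only works when the protocol itself can be used as a type, so it does not help with associated types or Self requirements.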

(This all connects together, I promise!)

Generalized Existentials

So to summarize:

The current restrictions around protocols with associated types come down to the fact that the compiler can't dynamically unwrap the existential value if associated types are involved.
Even if we can prove an existential satisfies constraints we can't call the correct generic specialization dynamically.

The Completing Generics Manifesto proposes a two-part solution to these problems:

The first Doug calls "generalized existentials":

protocol SqlColumn {
    associatedtype ValueType
    func read() -> ValueType
}

let column: SqlColumn = ...
column.read() // returns Any

If implemented, this feature would immediately put an end to the "Protocol X can only be used as a constraint" error messages; the associated types just become Any and you dynamically cast or switch. It won't have fancy statically compiled performance and it introduces the possibility of runtime crashes, but sometimes dynamic behavior is the best tool for the job and it's nice to have it in the toolbox.
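To get a feel for what "returns Any, then cast or switch" looks like, here is a hand-rolled approximation in current Swift syntax. This is not the proposed feature; UntypedSqlColumn, readAny, StringColumn, and IntColumn are hypothetical names I made up to emulate it with a second, non-generic protocol.

```swift
protocol SqlColumn {
    associatedtype ValueType
    func read() -> ValueType
}

// Hypothetical untyped view of a column: no associated type, so it can be
// used as an ordinary existential.
protocol UntypedSqlColumn {
    func readAny() -> Any
}

// Any type that is also a SqlColumn gets readAny() for free.
extension UntypedSqlColumn where Self: SqlColumn {
    func readAny() -> Any { return read() }
}

struct StringColumn: SqlColumn, UntypedSqlColumn {
    let value: String
    func read() -> String { return value }
}

struct IntColumn: SqlColumn, UntypedSqlColumn {
    let value: Int
    func read() -> Int { return value }
}

let columns: [UntypedSqlColumn] = [StringColumn(value: "Ada"), IntColumn(value: 36)]

var descriptions: [String] = []
for column in columns {
    // The static type is gone, so we recover it with a dynamic cast.
    switch column.readAny() {
    case let text as String: descriptions.append("text: \(text)")
    case let number as Int: descriptions.append("number: \(number)")
    default: descriptions.append("unknown")
    }
}
print(descriptions)
```

The default branch is where the runtime surprises live: nothing stops a new column type from showing up with a value you never handled.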

Doug also goes one step further noting that it would be natural to allow defining constraints to save a lot of casting boilerplate:

let column: Any<SqlColumn where .ValueType == String> = ...
column.read() // returns String

Opening Existentials

One tricky bit is when you have Self requirements like Equatable. Just because two different types implement Equatable doesn't mean there is actually an == implementation to check them for equality. With is/as we could check that the dynamicType of two values matched but we still couldn't call the actual == overload dynamically for the reasons discussed above regarding generic specialization.

To solve that problem, Doug proposes the ability to "open" an existential which would dynamically extract which type is actually stored in the existential and give that type a local name like T. Then you could check if some other value is also the same exact T, but even more importantly you could dynamically call the appropriate generic specialization with T:

if let storedInE1 = e1 openas T {     // T is the type stored in e1
  if let storedInE2 = e2 as? T {      // is e2 also a T?
    if storedInE1 == storedInE2 { … } // okay: storedInE1 and storedInE2 are both of type T, which we know is Equatable
  }
}

In this example, which specialization of == to call depends on the dynamic runtime type of e1 and e2. That makes working with protocols that have Self requirements possible and simultaneously solves our generic constraint problems.
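Until something like openas exists, the same check can be approximated by hand. The sketch below, in current Swift syntax, is an assumption-laden emulation and not the proposed feature: DynamicallyEquatable and isEqual(to:) are hypothetical names, and each type has to opt in explicitly.

```swift
// Hypothetical protocol with no Self requirement, so it can be used as a type.
protocol DynamicallyEquatable {
    func isEqual(to other: Any) -> Bool
}

extension DynamicallyEquatable where Self: Equatable {
    // Inside the extension `Self` is the concrete stored type, which plays
    // the role of the locally named T in the straw-man syntax.
    func isEqual(to other: Any) -> Bool {
        guard let other = other as? Self else { return false } // is it also a Self?
        return self == other // both sides are Self, which we know is Equatable
    }
}

// Each participating type opts in.
extension Int: DynamicallyEquatable {}
extension String: DynamicallyEquatable {}

let e1: DynamicallyEquatable = 42
let e2: DynamicallyEquatable = 42
let e3: DynamicallyEquatable = "42"

let sameTypeSameValue = e1.isEqual(to: e2)
let differentTypes = e1.isEqual(to: e3)
print(sameTypeSameValue, differentTypes)
```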

Note: All syntax is just straw-man syntax and none of this is confirmed for any future version of Swift.

Conclusion

Personally I think solving the existential problem is a high priority because it involves a pain point anyone using protocols and generics will quickly run into, but the feature list is long and time is short so I don't know if it will get done in the Swift 3 timeframe. If opening existentials does get implemented, using associated types won't force you into static dispatch and generic constraints; that will be a choice you make based on performance and other requirements.

In the meantime you can use type erasure to handle protocols with associated types.
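As a rough sketch of what that looks like for the SqlColumn protocol from earlier, here is one common shape of a type-erasing wrapper in current Swift syntax. AnySqlColumn, NameColumn, and TitleColumn are hypothetical names, and this is just one way to build the wrapper.

```swift
protocol SqlColumn {
    associatedtype ValueType
    func read() -> ValueType
}

// Hypothetical type-erasing wrapper: it forgets which concrete column it
// holds and keeps only the value type.
struct AnySqlColumn<Value>: SqlColumn {
    private let _read: () -> Value

    init<C: SqlColumn>(_ column: C) where C.ValueType == Value {
        // Capture the concrete column's read() in a closure.
        _read = column.read
    }

    func read() -> Value { return _read() }
}

// Two unrelated column types that both produce String.
struct NameColumn: SqlColumn {
    func read() -> String { return "Ada" }
}

struct TitleColumn: SqlColumn {
    func read() -> String { return "Countess" }
}

// Different concrete types can now share one array.
let columns: [AnySqlColumn<String>] = [
    AnySqlColumn(NameColumn()),
    AnySqlColumn(TitleColumn())
]
let values = columns.map { $0.read() }
print(values)
```

The wrapper keeps ValueType in its own signature, so read() stays statically typed; what gets erased is only the concrete column type.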

What about more general existential cases? ~~Stay tuned.~~ Benji's post on the topic is now up.

Links

For more on associated types, @alexisgallagher has a talk that also covers these topics (in fact we've discussed this in person several times). There's also a very interesting and awesomely programming-language-nerd paper comparing generic programming across various languages. As Alexis points out, Swift ticks all the boxes, partially as a result of associated types.