Mutual Tail Recursion For Fun and Profit by Owein Reese

One of the hallmarks of functional programming is the concept and adoption of recursion. Be it basic recursive functions, recursive data types or generalized recursion schemes, functional programmers like to use recursion everywhere. So important is recursion that the Scala compiler has a phase specifically to handle tail call transformations.

In this post, I'm going to introduce you to mutual tail recursion by way of the TwoTails compiler plugin. This plugin, now at version 0.3.0, adds support to Scala for tail call transformations on mutually recursive functions. It opens up the technique for general use without having to resort to Trampolines or risking StackOverflowErrors.

A Basic Example: Ping-Pong!

Before we get into the why or how, let's give a basic example of what mutual tail recursion looks like. Below is the classic "ping-pong" which was used to showcase Akka performance. Here, we're less concerned with performance and "messages" per second than we are with overflowing the call stack:
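The original code sample is no longer embedded in this post. A minimal sketch of what it plausibly contained, modeled on the TwoTails documentation (the class and method names here are assumptions, and it compiles only with the plugin enabled), is:

```scala
import twotails.mutualrec

class PingPong {
  // Each method's recursive call is in tail position; the TwoTails plugin
  // fuses the pair into a single stack-safe loop during compilation.
  @mutualrec final def ping(x: Int): Int =
    if (x > 0) pong(x - 1) else x

  @mutualrec final def pong(x: Int): Int =
    if (x > 0) ping(x - 1) else x
}
```

Without the plugin, a deep call chain between plain versions of these two methods would eventually overflow the stack; with it, the calls run in constant stack space.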

Like @tailrec but Different

Tail recursion works on a single function whose control path has a circular dependence with itself. Mutual tail recursion is more generalized; the circular dependency of each function call forms a directed cyclic graph with a closed walk and each recursive call is located at the tail position. You can think of @tailrec as a mutually recursive function call graph consisting of a single vertex.

All of that is a fancy way of saying that if you start at "ping" and go to "pong" there exists a logical branch wherein you could wind back up in "ping," e.g. ping(3) in the example above satisfies this condition. It can also be described in graphical form. In the picture below, the functions in the places "B," "C," "D," and "E" are mutually recursive; "A" and "F" are not. If all the calls of "B" through "E" are in tail position, TwoTails can transform them:

Example of a hypothetical dependency graph of function calls


Mutual Tail Recursion in Practice

Theoretically, any set of mutually recursive methods can be rewritten as a single recursive method. In practice, doing so adds at least one level of indirection and results in one very bloated method. Furthermore, code written to avoid mutual recursion where it would naturally arise has control flow that is more difficult to predict and is harder to test due to increased cyclomatic complexity.

Said another way, recursion results in less code compared to stateful, iterative approaches. Mutual recursion builds upon this but organizes and groups code according to specific function, resulting in smaller methods. More direct access to each algorithmic component yields more direct testing of logical branches. This in turn reduces bug counts and improves the ability of developers to maintain a living code base.

Such compelling virtues are a strong positive for mutual recursion, but prior to TwoTails Scala did not natively support mutual tail call transformations at the compiler stage. Instead, code attempting to avoid stack overflows while retaining such a structure had to be cast in terms of Continuation Passing Style or Trampolines. The additional artifacts required added cognitive overhead, increased boilerplate and leaked implementation details ("Zounds! These methods are recursive. Look at their return types.")

To see this, and by way of contrast with the previous example, here is the same "ping-pong" code written in vanilla Scala using trampoline-styled recursion:
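Again the original gist is missing; a sketch using the standard library's TailCalls (the object and method names are assumptions) could read:

```scala
import scala.util.control.TailCalls._

object PingPong {
  private def pingT(x: Int): TailRec[Int] =
    if (x > 0) tailcall(pongT(x - 1)) else done(x)

  private def pongT(x: Int): TailRec[Int] =
    if (x > 0) tailcall(pingT(x - 1)) else done(x)

  // Helpers preserving the public interface and hiding the TailRec return types.
  def ping(x: Int): Int = pingT(x).result
  def pong(x: Int): Int = pongT(x).result
}
```

Calling `PingPong.ping(100000)` runs in constant stack space because each `tailcall` returns control to the trampoline rather than growing the call stack.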

Note that every "loop" (an invocation of TailCalls.tailcall) creates a TailRec object, a thunk, and an anonymous function calling that thunk. As a finishing touch, though not strictly required, two helper methods were added to preserve the same public interface and hide the fact that trampolines were used.

The Case for TwoTails

There is nothing wrong with going the trampolined route for handling mutually recursive algorithms. Indeed, for stack-safe handling of mutually recursive algorithms whose calls are not in tail position, it is one of the only ways (assuming the algorithm can be structured with monadic bindings). Given a choice, however, code devoid of trampolines is both simpler to read and more direct to write.

TwoTails allows mutually tail recursive functions to be written directly in vanilla Scala with a familiar annotation-based approach; methods are annotated with @mutualrec (styled after @tailrec). It does so without introducing or requiring additional data structures and adds no additional project library dependencies. In fact, being a compiler plugin (it adds a new phase prior to tailcalls) means that once code is compiled, no trace of TwoTails remains in the generated JAR files.

Version 0.3.0 of TwoTails has been officially released for Scala 2.11.8, with plans to support 2.12.0 in the near future. It has not yet been tested with other compiler plugins or macro-based methods. Happily, it has been accepted as a Typelevel incubator project, and with enough adoption it is hoped it can move to full member status. What this means is that the project will continue to evolve, will actively look for people to use it and wants to hear about any reported issues.

Over Constraining the Aux Type by Owein Reese

This blog post will highlight what happens during implicit resolution using an Aux type on a type level proof when the proof has not been properly constructed. To show this, I'm going to revisit the approach taken in a previous blog post proving that a type was a member of a collection of types at compile time. By adding a few small tweaks, I will try to achieve both inclusion and exclusion, whereas before I could only get inclusion. Finally, I'll show how to build the proof just right and enjoy the reuse of code for multiple purposes, but on the type level.

Like in other posts, I'm going to assume some knowledge about Scala and its type system. First, I'll assume you understand implicit parameters and you're familiar with the type class pattern found in Scala. Secondly, that you're familiar with the Aux type pattern or type refinements (which the Aux pattern is). Finally, I'll assume you've had some exposure to implicit resolution order.


For a refresher, let's take a look at the definition that proved some type exists within a collection of types:
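That gist is no longer embedded here; a plausible reconstruction (the names and the nested-Tuple2 encoding of the collection of types are assumptions) is:

```scala
trait OneOf[Items, Obj]

object OneOf extends LowPriorityOneOf {
  // Obj matches (or is a subtype of) the head of the nested pairs.
  implicit def head[L, R, O <: L]: OneOf[(L, R), O] = new OneOf[(L, R), O] {}
}

trait LowPriorityOneOf {
  // Obj matches the final, right-most type of the nested pairs.
  implicit def last[L, R, O <: R]: OneOf[(L, R), O] = new OneOf[(L, R), O] {}

  // Otherwise, step past the head and keep searching the tail.
  implicit def recur[L, R, O](implicit ev: OneOf[R, O]): OneOf[(L, R), O] =
    new OneOf[(L, R), O] {}
}
```

With this in scope, `implicitly[OneOf[(Int, (String, Double)), Double]]` compiles while the same search for, say, `List[Int]` does not.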

I won't go into specifics here; you can read them in that blog post. Just know that if the type does not exist and we attempt to find the implicit through regular implicit resolution it'll fail compilation. Hence, this only ever allows us to prove something true and not the negation of the hypothesis.

To begin to prove both, I'll modify the definition of OneOf by adding an extra type member. I'll also add some types representing true/false and define an Aux type to complement our new OneOf. In fact, so as not to confuse the situation, I'll rename it:
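A sketch of those definitions (ContainsType is a stand-in for whatever name the original gist used):

```scala
sealed trait True
sealed trait False

trait ContainsType[Items, Obj] {
  type Out
}

object ContainsType {
  type Aux[Items, Obj, O] = ContainsType[Items, Obj] { type Out = O }

  // Convenient aliases for the two conditions we want to prove.
  type Contains[Items, Obj] = Aux[Items, Obj, True]
  type Excludes[Items, Obj] = Aux[Items, Obj, False]
}
```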

Next we'll define the proof to allow both inclusion and exclusion:
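A reconstruction of the flawed proof might look like the following; the unguarded notFound fallback at the bottom of the priority chain is the piece that will come back to bite us:

```scala
sealed trait True
sealed trait False

trait ContainsType[Items, Obj] { type Out }

object ContainsType extends ContainsTypeRecursion {
  type Aux[Items, Obj, O] = ContainsType[Items, Obj] { type Out = O }
  type Contains[Items, Obj] = Aux[Items, Obj, True]
  type Excludes[Items, Obj] = Aux[Items, Obj, False]

  // The searched-for type heads the nested pairs: True.
  implicit def found[L, R, O <: L]: Aux[(L, R), O, True] =
    new ContainsType[(L, R), O] { type Out = True }
}

trait ContainsTypeRecursion extends ContainsTypeFailure {
  // The searched-for type is the last element of the nested pairs: True.
  implicit def last[L, R, O <: R]: ContainsType.Aux[(L, R), O, True] =
    new ContainsType[(L, R), O] { type Out = True }

  // Recurse into the tail, carrying whatever Out the tail produces.
  implicit def recur[L, R, O, O1](implicit ev: ContainsType.Aux[R, O, O1]): ContainsType.Aux[(L, R), O, O1] =
    new ContainsType[(L, R), O] { type Out = O1 }
}

trait ContainsTypeFailure {
  // Unguarded: nothing matched, so yield False. This is the flaw.
  implicit def notFound[Items, Obj]: ContainsType.Aux[Items, Obj, False] =
    new ContainsType[Items, Obj] { type Out = False }
}
```

As written, `Contains` behaves; the trouble is that `Excludes` also resolves for types that are members of the set, because a request for `Out = False` can always be satisfied somewhere along the recursion.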

Here, we're using the implicit resolution order to force the compiler to search first through the recursive case, exploring the group of types completely before we allow it to reach the false condition. Once the false condition is reached, however, it will be carried all the way back up through the recursive stack to yield a final value of False.

The problem here is that we've failed to properly construct the proof. It's subtle but shows itself through use of the type definitions we made using the Aux type. Try running this through the REPL:

If we were to see if a String type compiles in the "contains" check, it would fail. However, the last case does compile when we would expect it to not compile.

The reason is simple. Normally when we construct a proof, we construct it in such a way as to only compile when the condition is matched. In this case, by over constraining the Aux type, we've told the compiler not to search the proof as we've written it but instead to search to find if there is any implicit which satisfies the condition that the type member is False. There is nothing to stop or signal to the compiler that it can't use the very last, completely unguarded implicit to complete the search.

What we have to do is use a dirty trick: cause an implicit ambiguity by allowing two mutually inclusive definitions to exist at the same search level:
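The original gist folds this trick into the Aux-typed proof itself; the idea in miniature, as a standalone sketch, pairs the positive proof with an exclusion whose base case is guarded by two deliberately overlapping implicits:

```scala
trait Contains[Items, Obj]

object Contains extends LowPriorityContains {
  implicit def head[L, R, O <: L]: Contains[(L, R), O] = new Contains[(L, R), O] {}
}

trait LowPriorityContains {
  implicit def last[L, R, O <: R]: Contains[(L, R), O] = new Contains[(L, R), O] {}
  implicit def recur[L, R, O](implicit ev: Contains[R, O]): Contains[(L, R), O] =
    new Contains[(L, R), O] {}
}

trait Excludes[Items, Obj]

object Excludes {
  implicit def excludes[Items, Obj]: Excludes[Items, Obj] = new Excludes[Items, Obj] {}

  // Two mutually inclusive definitions at the same search level: when Obj is
  // a member of Items, both resolve and the ambiguity vetoes the search, so
  // compilation fails, which is exactly what Excludes should mean.
  implicit def ambig1[Items, Obj](implicit ev: Contains[Items, Obj]): Excludes[Items, Obj] =
    new Excludes[Items, Obj] {}
  implicit def ambig2[Items, Obj](implicit ev: Contains[Items, Obj]): Excludes[Items, Obj] =
    new Excludes[Items, Obj] {}
}
```

Now `implicitly[Excludes[(Int, (Double, Char)), Int]]` fails to compile with an ambiguous implicit error, while exclusion of a genuine non-member still resolves.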

Now, instead of compiling, the very last example used in the repl session would fail.


In this post I went over why an Aux type is both useful and dangerous at the same time. It is useful for getting more mileage out of the scaffolding of a type level proof, allowing us to reuse logic for multiple conditions. At the same time, it is dangerous because it imposes an extra burden on the author. If care is not taken in the proof's construction, unexpected results will highlight discrepancies or holes in the logic.

Do note that this blog post was written for educational purposes and it lacks the level of completeness and power a library might contain. If you enjoy these types of things, have a gander at several libraries which use type level tricks to implement deeply powerful patterns: Miles' ever wonderful Shapeless, my own AutoLifts or Stew's type level Turing machine Scatur.

A Type Level Proof of Uniqueness by Owein Reese

In the last blog post I described how to create a proof that a type belongs to a group of types. In this post, I'll describe how to extend a representative type set in a manner that enforces a uniqueness property on each subsequent type set, i.e. how to constrain the group so that any single type appears only once.

As in the last post I'm going to make several assumptions about your level of Scala knowledge. I'm going to assume that you understand implicit parameters and that you're familiar with the type class pattern as found in Scala. I'm going to assume you're aware of the Aux type pattern or at least heard of type refinements (this one's new.) I'll assume you can read and understand a recursively defined function. Finally, I expect you to be aware of implicit resolution order semantics (but not have memorized nearly half of it.)

The Proof

First, let us define what we mean by a unique set and the single operation defined on it, which extends the collection of types such that the uniqueness property of each type within the type set is preserved:
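The original definitions are not embedded; a sketch (names and the nested-Tuple2 encoding are assumptions) might be:

```scala
trait Merge[Types, A] {
  type Out
}

object Merge {
  type Aux[Types, A, O] = Merge[Types, A] { type Out = O }
}

trait TSet[Types] {
  // Extend the set; the Merge evidence both proves the operation is legal
  // and computes the resulting collection of types (encoded as nested Tuple2s).
  def add[A](implicit m: Merge[Types, A]): TSet[m.Out] = new TSet[m.Out] {}
}
```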

This says that given our TSet, we can add a type to the collection as long as an implicit Merge can be found. It also provides the description of the Merge type class; one that contains an abstract type member. The result of the merge member function depends on the abstract type Out, more specifically known as a path dependent type, since the answer will change for each supplied type parameter.

(As before I've included a type alias for Tuple2 which I hope will make reading the code easier to digest.)

Merging is an action with an invariant which must be proven to be true (or at least defined to be true, semantics.) As a first step towards building the full proof we can consider an attempt to add a single type to a type set containing only two types:
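A sketch of that first step (the implicit names are assumptions):

```scala
trait Merge[Types, A] { type Out }

object Merge extends LowPriorityMerge {
  type Aux[Types, A, O] = Merge[Types, A] { type Out = O }

  // The added type matches the first member: the set is unchanged.
  implicit def first[F, S]: Aux[(F, S), F, (F, S)] =
    new Merge[(F, S), F] { type Out = (F, S) }

  // The added type matches the second member: the set is unchanged.
  implicit def second[F, S]: Aux[(F, S), S, (F, S)] =
    new Merge[(F, S), S] { type Out = (F, S) }
}

trait LowPriorityMerge {
  // The added type is absent: append it.
  implicit def absent[F, S, A]: Merge.Aux[(F, S), A, (F, (S, A))] =
    new Merge[(F, S), A] { type Out = (F, (S, A)) }
}
```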

There are three cases which must be handled: the added type is equivalent to the first type in the set, the added type is equivalent to the second type in the set and, finally, the added type does not exist in the set. Unlike the OneOf cases, we don't need to worry that the two types in the set are equivalent, so both implicits can be safely nested directly within the companion object definition. The Out parameter is defined as the logical result; if the type already exists, we don't include it twice.

(A point of clarification: we're using the Aux type pattern here because it's good practice for reasons outside the scope of this post. If you don't understand why we would want it, have a gander at the first video in the video section which discusses it in some depth.)

The second stage of building out this proof is generalizing it to an arbitrarily sized type set:
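A sketch of the generalized version, where the set is an arbitrarily deep nesting of pairs terminating in a plain type (again, the names are assumptions):

```scala
trait Merge[Types, A] { type Out }

object Merge extends MergeRecursion {
  type Aux[Types, A, O] = Merge[Types, A] { type Out = O }

  // The added type matches the head of the set: unchanged.
  implicit def found[F, S]: Aux[(F, S), F, (F, S)] =
    new Merge[(F, S), F] { type Out = (F, S) }

  // A single-element set equal to the added type: unchanged.
  implicit def single[E]: Aux[E, E, E] =
    new Merge[E, E] { type Out = E }
}

trait MergeRecursion extends MergeAppend {
  // Recurse into the tail, rebuilding the set on the way back out; O acts
  // as the type level accumulator.
  implicit def recurse[F, S, A, O](implicit m: Merge.Aux[S, A, O]): Merge.Aux[(F, S), A, (F, O)] =
    new Merge[(F, S), A] { type Out = (F, O) }
}

trait MergeAppend {
  // The end of the set was reached without a match: append the type.
  implicit def append[E, A]: Merge.Aux[E, A, (E, A)] =
    new Merge[E, A] { type Out = (E, A) }
}
```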

Added to the logical paths are two more branches: the condition where the set contains only a single element which is the same as the added type, and the recursive search into the type set, which inserts the type if and only if it does not already exist.

Here the Out parameter is being used as a means of passing type level information between different iterations of the type computation, much in the same way as an accumulator in a recursive function. If the type being added ultimately does not exist, then at some stage an implicit search will yield a result with that type appended. If it does exist, then the original type set will be recursively rebuilt.

Finally, to complete the proof and come full circle, we must handle when the merged parameter is itself a type set:
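The exact shape of the original is unknown; a sketch that follows the prose adds the two highest-priority branches on top of the machinery built so far:

```scala
trait TSet[Types]

trait Merge[Types, A] { type Out }

object Merge extends MergeUnpack {
  type Aux[Types, A, O] = Merge[Types, A] { type Out = O }

  // Merging a multi-element TSet: repack everything but its first type back
  // into a TSet and merge that, then merge the first type into the result.
  implicit def tsetMulti[T, F, S, O1, O2](
    implicit rest: Merge.Aux[T, TSet[S], O1],
             head: Merge.Aux[O1, F, O2]): Aux[T, TSet[(F, S)], O2] =
    new Merge[T, TSet[(F, S)]] { type Out = O2 }
}

trait MergeUnpack extends MergeBase {
  // Merging a single-element TSet: unpack it and reuse the element logic.
  implicit def tsetSingle[T, E, O](implicit m: Merge.Aux[T, E, O]): Merge.Aux[T, TSet[E], O] =
    new Merge[T, TSet[E]] { type Out = O }
}

trait MergeBase extends MergeRecursion {
  implicit def found[F, S]: Merge.Aux[(F, S), F, (F, S)] =
    new Merge[(F, S), F] { type Out = (F, S) }

  implicit def single[E]: Merge.Aux[E, E, E] =
    new Merge[E, E] { type Out = E }
}

trait MergeRecursion extends MergeAppend {
  implicit def recurse[F, S, A, O](implicit m: Merge.Aux[S, A, O]): Merge.Aux[(F, S), A, (F, O)] =
    new Merge[(F, S), A] { type Out = (F, O) }
}

trait MergeAppend {
  implicit def append[E, A]: Merge.Aux[E, A, (E, A)] =
    new Merge[E, A] { type Out = (E, A) }
}
```

Merging TSet[(Int, String)] into the set (String, Char) first merges String (already present, so nothing changes) and then Int (absent, so it is appended), yielding (String, (Char, Int)) while preserving the uniqueness invariant.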

Two more branches have been added, this time taking priority over all other added branches. The first branch handles the case where the type set being merged contains only a single item. In this case we reuse the logic which we had previously constructed up until this point. It is more or less unpacking the type within the type constructor.

The second logical branch added exists in a seemingly bizarro world of recursion and Aux pattern type madness. To break this down, the first implicit parameter is repacking everything but the first type parameter back into a TSet so that a new search can be made. The second implicit parameter uses the first type in the TSet with the results generated from the first implicit parameter (this is why it's expressed in an Aux type) to produce the final result.

(If you're still confused as to what's happening with the Aux type watch the video or skim to the middle and read here.)


This builds upon the previous blog post to give a more challenging example of type level proofs/inductive logic. It covered not only all the cases of the previous proof but delved into using the Aux type pattern popularized in the Shapeless library to deconstruct a nested type within an already recursively structured type level computation. At each stage the invariant that a single type should not appear more than once in the final result was obeyed.

Do note, the point of this post was educational. The code here lacks several niceties, namely how to preclude non-unique types in the definition of TSet. If you're interested in seeing more examples of type level computations take a gander at the aforementioned Shapeless or my own Autolifts project.

A Type Level Proof by Owein Reese

Using the compiler to prove the correctness of a program is what static types are all about. In this blog post we'll take that concept one step further and actually use the type system to construct a type level constraint. This constraint, namely that a type is one of several types, will cause an invalid program to fail not at runtime but at compile time, thus forming a proof that a system is undeniably working as intended.

To read this post I will assume that you understand implicit parameters and that you're familiar with the type class pattern as found in Scala. Furthermore, I will also assume you can read and understand a recursively defined function. Finally, I expect you to be aware of implicit resolution order semantics (but not have memorized nearly half of it.) After that, it's all about taking these concepts and applying them at the type level. In fact, I'd say it's a bit like learning a language within a language.

The Proof

The first thing we want to do is set up a type class which represents that a type belongs to a set of types:
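The original snippet is missing; a sketch of the skeleton it describes (the member names are assumptions) is:

```scala
trait OneOf[Items, Obj] {
  type Out

  // The implicit evidence is discarded; its existence alone is the proof.
  def apply[T](obj: T)(implicit ev: OneOf[Items, T]): T = obj
}

object OneOf {
  // A pair of types reads as L | R rather than Tuple2[L, R].
  type |[L, R] = (L, R)
}
```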

This says given a collection of types embodied by the Items type and a test type denoted by Obj, find an implicit of the trait OneOf which proves that Obj is contained within Items. Or, at least, it will eventually say that. Right now it just looks to you like a useless object containing a type member and a function that discards the implicit parameter passed to it.

(Included is a type alias for Tuple2 which I hope will make things easier to understand as I build out the logic. I generally don't suggest using symbols for types (or classes, ahem, Scalaz) unless you actually want to confuse your colleagues. In that case, may I introduce you to the IOCCC.)

The first stage of defining the proof is to consider a pair of types as the basis set (hence, Tuple2.) We can place the implicit search within the companion object so that if either the left or the right type, denoted below as L or R, matches Obj, the implicit is always available in scope:
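A sketch of that first stage, simplified to exact type matches (the subtype relaxation comes next, and the as-yet-unused type member is dropped for brevity):

```scala
trait OneOf[Items, Obj]

object OneOf extends LowPriorityOneOf {
  // Obj matches the left type of the pair; searched first.
  implicit def left[L, R]: OneOf[(L, R), L] = new OneOf[(L, R), L] {}
}

trait LowPriorityOneOf {
  // Obj matches the right type; lower priority so that a pair such as
  // (Int, Int) does not trigger an ambiguity.
  implicit def right[L, R]: OneOf[(L, R), R] = new OneOf[(L, R), R] {}
}
```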

The companion object is further separated into two cases such that the implicit definition in the object body is searched first, followed by the trait. This is to avoid an implicit ambiguity brought on by having both L and R be equivalent types, i.e. Int and Int.

Next we can relax the condition on the types such that they play well with inheritance. That is, we want to be able to say that a type Dog is within a pair Animal | Plant.
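Relaxing the condition amounts to bounding a third type parameter by each side of the pair; a sketch:

```scala
trait Animal
class Dog extends Animal
trait Plant

trait OneOf[Items, Obj]

object OneOf extends LowPriorityOneOf {
  // Obj need only be a subtype of the left member.
  implicit def left[L, R, O <: L]: OneOf[(L, R), O] = new OneOf[(L, R), O] {}
}

trait LowPriorityOneOf {
  // Likewise for the right member, at lower priority.
  implicit def right[L, R, O <: R]: OneOf[(L, R), O] = new OneOf[(L, R), O] {}
}
```

Now a Dog is provably within the pair (Animal, Plant).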

To generalize to more than two types, we can use the previously defined conditionals (namely left or right equivalence in a set of two) as the basis for a recursive definition. In doing so we'll form an inductive proof:
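A sketch of the inductive step, where the group of types is an arbitrarily deep nesting of pairs:

```scala
trait OneOf[Items, Obj]

object OneOf extends LowPriorityOneOf {
  // Obj is a subtype of the head of the nested pairs.
  implicit def head[L, R, O <: L]: OneOf[(L, R), O] = new OneOf[(L, R), O] {}
}

trait LowPriorityOneOf {
  // Obj is a subtype of the right-most type of the innermost pair.
  implicit def last[L, R, O <: R]: OneOf[(L, R), O] = new OneOf[(L, R), O] {}

  // The recurrence: step past the head and search the remaining types.
  implicit def recur[L, R, O](implicit ev: OneOf[R, O]): OneOf[(L, R), O] =
    new OneOf[(L, R), O] {}
}
```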

The final step is the recurrence relationship. It splits the search space down the structure of the group of types so that if the left hand type, L, is ever the same as Obj, then a solution is found. Only at the last segment, when there are but two types left to be searched, do we need to check the right-most type. The additional implicit parameter is discarded; we only need to know that it existed to be able to say that it exists for the entire group.

To see this in action, take the following bit of code:
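The snippet itself is missing; the test cases plausibly looked like the object below (the reconstructed OneOf is repeated so the sketch stands alone):

```scala
trait OneOf[Items, Obj]

object OneOf extends LowPriorityOneOf {
  implicit def head[L, R, O <: L]: OneOf[(L, R), O] = new OneOf[(L, R), O] {}
}

trait LowPriorityOneOf {
  implicit def last[L, R, O <: R]: OneOf[(L, R), O] = new OneOf[(L, R), O] {}
  implicit def recur[L, R, O](implicit ev: OneOf[R, O]): OneOf[(L, R), O] =
    new OneOf[(L, R), O] {}
}

object OneOfTests {
  type Items = (Int, (String, (Double, Char)))

  implicitly[OneOf[Items, Int]]    // compiles
  implicitly[OneOf[Items, Double]] // compiles
  implicitly[OneOf[Items, Char]]   // compiles
  // implicitly[OneOf[Items, List[Int]]]  // fails to compile
}
```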

together with the previously given definition for OneOf and play around with the test cases. The first three should compile without issue. The final one will not.


This basic construction is the template strategy for many type level inductive proofs. Form a basis set for a solution (in our case the solution space was binary) and build a search using a recursive structure as a stepping mechanism. It is no different than a recursive function on a list. To wit, the list is a list of types, not values.

Do note, the intent of this post was to educate and not provide a production ready solution. This implementation will not support the search for higher kinded types, either as the search parameter type Obj or as a member of the group of types Items. If you're interested in seeing more examples of these types of things, check out the implementation details of Miles Sabin's Shapeless or my own AutoLifts projects.

Type Recursion and Functors by Owein Reese

Type based recursion can be used to solve a variety of problems at compile time. One of those problems is how to apply a function to an object based upon the type signatures of each. In this post I'll delve into how this is done in the AutoLifts library, mixing in the type class pattern to derive the final results.

The Idea

The first thing we need is a structure to recurse over. Given the following:
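The embedded example has not survived; a guess at its shape, three functorial layers deep, would be:

```scala
// Three nested layers, each of which has a Functor: List, Option, List.
val data: List[Option[List[Int]]] = List(Some(List(1, 2)), None, Some(List(3)))
val f: Int => Int = _ + 1

// Applying f by hand requires one call to map per layer.
val result = data.map(_.map(_.map(f)))
```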

it is clear that we need to nest the function in three levels of a call to map. Granted, since each of the levels has a Functor, we could lift the function up so that it could be applied to the base type directly. This forms the basis for a recursion; each type nesting can be examined and, using a Functor, mapped over.

Let's start with a function definition coupled with a type class. Furthermore, for ease of use place the definition of the function on an implicit class:
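A sketch of those pieces (LiftMap and the syntax object are assumed names, not the library's actual API):

```scala
trait LiftMap[Obj, Fn] {
  type Out
  def apply(obj: Obj, f: Fn): Out
}

object LiftMapSyntax {
  implicit class LiftMapOps[Obj](private val obj: Obj) extends AnyVal {
    def liftMap[A, B](f: A => B)(implicit lift: LiftMap[Obj, A => B]): lift.Out =
      lift(obj, f)
  }
}
```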

Then, using the magic of implicit resolution order, we can place the definitions of the type class instances in its own companion object like so:
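A self-contained sketch of the arrangement described below, with a minimal Functor in place of the Algebird/Cats/Scalaz ones the library actually uses (note that the recursive case relies on partial unification of higher kinded types, on by default in Scala 2.13):

```scala
trait Functor[F[_]] {
  def map[A, B](fa: F[A])(f: A => B): F[B]
}

object Functor {
  implicit val listF: Functor[List] = new Functor[List] {
    def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
  }
  implicit val optionF: Functor[Option] = new Functor[Option] {
    def map[A, B](fa: Option[A])(f: A => B): Option[B] = fa.map(f)
  }
}

trait LiftMap[Obj, Fn] {
  type Out
  def apply(obj: Obj, f: Fn): Out
}

object LiftMap extends LowPriorityLiftMap {
  type Aux[Obj, Fn, O] = LiftMap[Obj, Fn] { type Out = O }

  // Base case: the function's argument lines up with the element type.
  implicit def base[F[_], A, B](implicit F: Functor[F]): Aux[F[A], A => B, F[B]] =
    new LiftMap[F[A], A => B] {
      type Out = F[B]
      def apply(obj: F[A], f: A => B): F[B] = F.map(obj)(f)
    }
}

trait LowPriorityLiftMap {
  // Recursive case: peel off one Functor layer and search again.
  implicit def recur[F[_], G, Fn, O](implicit F: Functor[F], lift: LiftMap.Aux[G, Fn, O]): LiftMap.Aux[F[G], Fn, F[O]] =
    new LiftMap[F[G], Fn] {
      type Out = F[O]
      def apply(obj: F[G], f: Fn): F[O] = F.map(obj)(g => lift(g, f))
    }
}
```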

The companion object extends a "LowPriority" trait which contains an implicit def called "recur." This is in addition to the implicit which produces the same type class in the main body. No divergent implicit error will be thrown, however, as this is again using the priority rules on implicit resolution order to force the compiler to prefer the definition found in the body of the companion and only then, if the types do not satisfy, check the extended trait.

The "recur" method requires that another search for the same type class with different type parameters be conducted. This acts as a loop, with the compiler pattern matching at the type level to choose which branch to proceed down. Eventually the looping terminates at the "base" method or fails because no implicit parameter could be generated.

Since this is a recursive routine, there must be an accumulator. The Out type parameter is that accumulator, gathering up the type information of the final result. It is not known ahead of time what that type is until the compiler finishes the recursive search, each level of recursion refining it slightly. Note, the type class itself could not be defined without specifying something as the return type of the "apply" method (and Out is much better than the generic and nearly useless Any.)

The above is fine for when the function arguments are very strict and always known to be exact. Scala, however, allows for inheritance, and the definition of Function1 is accordingly permissive: contravariant in its argument type and covariant in its result type. While the above definition doesn't care about the covariance of the result type, mimicking the contravariant nature of the function at the type level requires a slight change:
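A sketch of the revised base case in isolation, with stand-in types to exercise the subtyping:

```scala
trait Functor[F[_]] {
  def map[A, B](fa: F[A])(f: A => B): F[B]
}

object Functor {
  implicit val listF: Functor[List] = new Functor[List] {
    def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
  }
}

trait LiftMap[Obj, Fn] {
  type Out
  def apply(obj: Obj, f: Fn): Out
}

object LiftMap {
  type Aux[Obj, Fn, O] = LiftMap[Obj, Fn] { type Out = O }

  // The fourth type parameter A1, bounded by the function's argument type A,
  // mirrors Function1's contravariance at the type level.
  implicit def base[F[_], A, A1 <: A, B](implicit F: Functor[F]): Aux[F[A1], A => B, F[B]] =
    new LiftMap[F[A1], A => B] {
      type Out = F[B]
      def apply(obj: F[A1], f: A => B): F[B] = F.map(obj)(f)
    }
}

trait Animal { def name: String }
final case class Dog(name: String) extends Animal
```

With this in place, a function over Animal can be lifted onto a List[Dog].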

Adding a fourth type parameter to the "base" method which is bounded by the argument of the enclosing type solves the issue.

Finally, the last thing to do is to add an Aux type so that the compiler carries the type information in an explicit manner to the final result type.

If this Aux type were not added, the compiler's type inferencer would "lose" the type information contained in the Out type parameter.


The actual implementation in the AutoLifts library differs slightly and can be seen in the three different libs each using Algebird, Cats and Scalaz respectively. If you'd like to get a more in depth understanding of how this works, I would refer you to a talk I gave over a year ago in the video section of this website.


Advantages of Being Lazy by Owein Reese


Lazy data structures have many advantages over classically eager data structures. Strictly lazy structures are not widely used in imperative code due to difficulties involved with composition of exception handling and mutable state. Despite these difficulties, they offer a compelling tool for solving a large class of problem spaces. 

This post will attempt to highlight two of those benefits: memory footprint and overall programmatic computational cost. It will begin by first defining a common, basic data structure, a Tree. After implementing two methods on a Tree, map and exists, it will use those methods to solve a problem. Finally, using a lazy version of a tree, the differences in terms of cost and benefit will be shown.

A Tree

Instead of using a tree as a container for other objects, like in many algorithms and structures classes, this tree will express a hierarchy of values. Let us suppose that all hierarchical trees have the following interface:

which exposes just two methods: map and exists.

We can implement an arbitrarily sized hierarchical tree structure by letting each level vary in the number of children it has, using a List:
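The original listing is missing; a sketch of the interface and the eager implementation (names are assumptions) follows:

```scala
trait Tree[A] {
  def map[B](f: A => B): Tree[B]
  def exists(p: A => Boolean): Boolean
}

final case class EagerTree[A](value: A, nodes: List[EagerTree[A]]) extends Tree[A] {
  // Transforms the whole hierarchy immediately.
  def map[B](f: A => B): EagerTree[B] =
    EagerTree(f(value), nodes.map(_.map(f)))

  // Short circuits on the first value satisfying the predicate.
  def exists(p: A => Boolean): Boolean =
    p(value) || nodes.exists(_.exists(p))
}
```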

where, for each action performed, the evaluation of that action happens directly upon invocation.

A Problem Statement

Suppose that we want to represent an extended family whose fortunes have swelled but net worth is tied up in a complex set of independent and interwoven investment vehicles. Each member of the family is entitled to a certain number of votes based upon some ownership of shares. The board of one of the companies has called for a vote where a large outside investor is attempting to influence the outcome.

Being an efficient and careful activist, this investor would like to first determine if there is a single family member who has enough voting power such that he could win the majority. If such a member does exist, the investor will spend all of his energy lobbying that person. If not, the investor will divide his time based on the number of votes:
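The function itself is no longer embedded; a sketch with a hypothetical model of the family (Member, votingPower and the function name are all assumptions) might read:

```scala
trait Tree[A] {
  def map[B](f: A => B): Tree[B]
  def exists(p: A => Boolean): Boolean
}

final case class EagerTree[A](value: A, nodes: List[EagerTree[A]]) extends Tree[A] {
  def map[B](f: A => B): EagerTree[B] = EagerTree(f(value), nodes.map(_.map(f)))
  def exists(p: A => Boolean): Boolean = p(value) || nodes.exists(_.exists(p))
}

final case class Member(name: String, shares: Long)

// Stand-in for the expensive voting-rights calculation.
def votingPower(m: Member): Long = m.shares

// Written against the Tree interface, the function is agnostic to whether
// the underlying structure evaluates eagerly or lazily.
def hasMajorityHolder(family: Tree[Member], threshold: Long): Boolean =
  family.map(votingPower).exists(_ > threshold)
```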

Modeled like this, using the eager tree defined above, in the worst case the algorithm would require an evaluation of every single family member's voting rights twice: once for the expensive calculation and once for the predicate condition. The best case with eager evaluation would also require every single family member to be evaluated at least once, N expensive calculations but just one pass of the predicate.

To improve upon the bounds and more accurately model the outside investor as an efficient actor, we would need another implementation. We could rewrite the function to use some form of short circuiting accumulation. This route would lead to increased implementation complexity and decreased maintainability. Or we could examine using lazy data structures and keep the function as written above completely unchanged and intact.

A Lazy Tree

The relationship between an eagerly evaluated tree and a lazy tree is much the same as that between map and exists (on an eager data structure). Exists is a short circuiting function, executing the predicate the least number of times necessary until a positive result is found, while map will call its function N times for N objects. On a lazy data structure, even map will be delayed until the first time a result is requested.

We can define a lazy tree as follows:
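A sketch of the lazy variant (Stream is the pre-2.13 name for the standard library's lazy sequence):

```scala
trait Tree[A] {
  def map[B](f: A => B): Tree[B]
  def exists(p: A => Boolean): Boolean
}

final class LazyTree[A](value0: => A, nodes0: => Stream[LazyTree[A]]) extends Tree[A] {
  lazy val value: A = value0                   // a Thunk, evaluated at most once
  lazy val nodes: Stream[LazyTree[A]] = nodes0

  // Nothing is forced here; both arguments are passed by name.
  def map[B](f: A => B): LazyTree[B] =
    new LazyTree(f(value), nodes.map(_.map(f)))

  // Forces only as much of the tree as the predicate requires.
  def exists(p: A => Boolean): Boolean =
    p(value) || nodes.exists(_.exists(p))
}
```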

wherein the call to head is preferable to repeated calculations of value if it is to be requested multiple times. Conveniently, the definition of LazyTree is nearly identical to EagerTree save for the implementation detail of nodes as a Stream vs a List and the use of Thunks for member values.

While cosmetically a small difference, programmatically it is a powerful distinction. The actual calculation of value is delayed until it is explicitly called, and so too is any calculation using nodes. To understand how this is so, we need but study the implementation of the cons cell of Stream and the map member function, copied with some liberty below:
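Since the standard library's own classes cannot be reproduced verbatim, here is a simplified model of the same structure (MiniStream and Cons stand in for Stream and its cons cell):

```scala
sealed trait MiniStream[+A] {
  def isEmpty: Boolean
  def head: A
  def tail: MiniStream[A]

  // Mirrors Stream.map: the first cell is built eagerly (Stream is
  // head-strict) while the rest hides behind the tail thunk.
  def map[B](f: A => B): MiniStream[B] =
    if (isEmpty) Empty else new Cons(f(head), tail.map(f))
}

case object Empty extends MiniStream[Nothing] {
  def isEmpty = true
  def head = throw new NoSuchElementException("Empty.head")
  def tail = throw new NoSuchElementException("Empty.tail")
}

// The cons cell: a strict head and a by-name tail, memoized on first access.
final class Cons[+A](val head: A, tl: => MiniStream[A]) extends MiniStream[A] {
  def isEmpty = false
  lazy val tail: MiniStream[A] = tl
}
```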

So defined, calling map on a Stream of N items creates a Stream of 1 element with a tail wrapped in a Thunk. Iterating over each element creates yet another single cons cell with a delayed realization of the tail, hence the concept of lazy evaluation.

Like the eager tree, the worst case scenario for the problem statement would be the evaluation of each family member twice. The best case scenario, on the other hand, would be the evaluation of only a single family member once. Additionally, in the best case scenario the memory footprint would be greatly reduced to 2 (the value and the nodes Thunks) instead of the entire tree of weighted votes, saving not only time but also space.


Efficiency is defined as doing no more than the minimal effort required. Clearly the lazy tree accomplishes this by putting off as much as possible until explicitly asked. As shown in this post, a direct, easy to understand algorithm was made both cost and space efficient by switching from an eager data structure to a lazy data structure without destroying readability.

The key to the improvement of this algorithm was the potential need to retain intermediate values for some but not all branches of logic. In situations like that, lazy structures shine. However, it would be remiss not to caution you, the reader: lazy structures are not always ideal. As with anything, there are trade-offs. Lazy structures can induce hard to understand performance profiles and, in multi-threaded applications, if coded with persistence in mind, lock-based overheads (look into how Scala's lazy val is implemented.)

That said, knowing about lazy structures and understanding situations where they can help arms you with another tool in the tool box. In the next post, another use case of lazy structures will be discussed, one which leads to patterns that would be impossible to write with eager structures.

Type Soup by Owein Reese

Type soup is an expression I once heard a coworker say that has stuck with me over the course of several years. Said coworker had originally worked in dynamically typed languages like R, Julia and Python but had switched to Scala for large scale numerical analysis. The phrase was uttered in reaction to seeing simple code sprout so many type parameters that it was no longer simple to reason about.

The difficulty of watching agreeable code get cluttered with type parameters, even when used to good effect, never goes away. To counter this, a good rule of thumb is to allow no more than one of the following in any method signature or object constructor: a phantom type, a free type parameter within an implicit, or a compound type. That is to say, you should choose to have exclusively one of them rather than one of each in any signature.

This is a bit more restrictive than Twitter's style guide but better than the normal coding standards which don't attempt to set any boundaries. Types should be there to guide, not confuse. For instance, which of the following is easier to understand?
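The two snippets being compared are missing; a made-up pair in the same spirit (both hypothetical, neither from the original post) might be:

```scala
// Easy to deduce: the type parameters name their roles.
final case class KVStore[Key, Value](underlying: Map[Key, Value]) {
  def put(key: Key, value: Value): KVStore[Key, Value] =
    copy(underlying + (key -> value))
  def get(key: Key): Option[Value] = underlying.get(key)
}

// Type soup: a phantom type S, a compound type bound and a free type
// parameter hiding inside an implicit, all in a single signature.
trait Marker[S, A] { def value: A }

def process[S, A <: Product with Serializable, Out](a: A)(implicit m: Marker[S, A => Out]): Out =
  m.value(a)
```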

Without me telling you I bet you could deduce that the Key and the Value type parameters were related to some sort of key-value store. I have no doubt in my mind that you can't tell me what the other bit of code is doing without either knowing the context or seeing the code.

This is what type soup has come to mean to me. It's an utterly abstract bit of code searching to tie together several different underlying representations or even concepts by generalizing over the type signatures of the methods and/or objects. Please don't do this. You are the only one who will ever benefit from your "clever" code.