DEV Community

Yufan Lou
Yufan Lou

Posted on

Generalizing Tree Traversal in Haskell with Typeclass

Tree is the most pervasive data structure in our lives. First, we see trees, green literal trees, everywhere. Second, almost any navigation menu is a tree. Finally, our code always becomes an abstract syntax tree before it goes on to become executable.

Haskell is a very interesting language. The creators of Haskell intentionally chooses to limit it to what mathematics can do, and surprisingly reveals how much mathematics can do. This post introduces what mathematics is best at: generalization. You think interface is generalization? Nah, this is generalization.

To start, hopefully you are familiar with the tree data structure:

data Tree =
    Leaf Int
  | Node Int Tree Tree
Enter fullscreen mode Exit fullscreen mode

Let's construct some simple trees as a warm up:

ex1 = Leaf 1
ex2 = Node 2 (Leaf 1) (Node 3 (Leaf 0) (Leaf 5))
Enter fullscreen mode Exit fullscreen mode

When we traverse a list, we can go from front to back or back to front. For trees, there are three orders to traverse it: pre-order, in-order, and post-order. Each would let us visit the nodes in a different order, which we can represent with a list of the nodes.

But first let's construct the tree we'd like to play with! I'll borrow the example from this StackOverflow answer.

A binary search tree of numbers 0 to 10

ex3 = Node 7 (Node 1 (Leaf 0) (Node 3 (Leaf 2) (Node 5 (Leaf 4) (Leaf 6)))) (Node 9 (Leaf 8) (Leaf 10))
Enter fullscreen mode Exit fullscreen mode

Pre-order is so named because the root node is visited before both of the subtrees, thus the "pre-" prefix. Have a look at its code:

preorder :: Tree -> [Int]
preorder (Leaf a) = [a]
preorder (Node a l r) = [a] ++ preorder l ++ preorder r
Enter fullscreen mode Exit fullscreen mode

The Leaf a case is the natural base case. In the Node a l r case, a is the subroot at the relative top, l is the left subtree, while r is the right subtree. Top to bottom, so a is the first. Left to right, so l is before r.

>>> preorder ex3
[7,1,0,3,2,5,4,6,9,8,10]
Enter fullscreen mode Exit fullscreen mode

Now that's all good, and we have a list of nodes in the traversal order. We can take this list and process it further, such as summing it up:

sumOfTree :: Tree -> Int
sumOfTree = sum . preorder
Enter fullscreen mode Exit fullscreen mode
>>> sumOfTree ex3
55
Enter fullscreen mode Exit fullscreen mode

or calculating a cumulative sum:

cumulativeSum :: [Int] -> [Int]
cumulativeSum = scanl1 (+)

cumulativeSumOfTree :: Tree -> Int
cumulativeSumOfTree = cumulativeSum . preorder
Enter fullscreen mode Exit fullscreen mode

scanl1 builds a list by applying a binary function to an existing list from left to right, taking the first element of the list as the initial value, and recording results from each step. With (+) as the binary function, the resulting list is the cumulative sum of the input list.

>>> cumulativeSumOfTree ex3
[7,8,8,11,13,18,22,28,37,45,55]
Enter fullscreen mode Exit fullscreen mode

Composition does the job, and that's good. But that intermediate list is annoying. It might even become a problem if the tree is big enough. Can we do away with it? Let's look at the code again:

preorder :: Tree -> [Int]
preorder (Leaf a) = [a]
preorder (Node a l r) = [a] ++ preorder l ++ preorder r
Enter fullscreen mode Exit fullscreen mode

The operations related to list are the singleton constructor [a] and the append operator (++). It turns out we can generalize this! For having a singleton constructor, list is an instance of the typeclass Applicative. For having an associative append operator, list is an instance of the typeclass Semigroup. The singleton constructor of Applicative is pure, and the associative operator of Semigroup is <>. We can therefore rewrite the function to be:

preorder :: (Applicative f, Semigroup (f Int)) => Tree -> f Int
preorder (Leaf a) = pure a
preorder (Node a l r) = pure a <> preorder l <> preorder r
Enter fullscreen mode Exit fullscreen mode

Does it still work?

>>> preorder ex3
error:
    * Ambiguous type variable `f0' arising from a use of `print'
      prevents the constraint `(Show (f0 Int))' from being solved.
      Probable fix: use a type annotation to specify what `f0' should be.
    ...
Enter fullscreen mode Exit fullscreen mode

Now that we've generalized preorder, when we use it directly we need to specify what concrete type we want it to give back. We can do so by inline annotation.

>>> (preorder ex3 :: [Int])
[7,1,0,3,2,5,4,6,9,8,10]
Enter fullscreen mode Exit fullscreen mode

What does this generalization gain us though? Let's check the list of instances of Semigroup. The following may catch your attention:

  • Num a => Semigroup (Product a)
  • Num a => Semigroup (Sum a)
  • Ord a => Semigroup (Max a)
  • Ord a => Semigroup (Min a)

Since Int fulfills the requirement Num and Ord, we can use these instances. How? By changing the type annotation is one way.

>>> (preorder ex3 :: Product Int)
Product {getProduct = 0}
>>> (preorder ex3 :: Sum Int)
Sum {getSum = 55}
>>> (preorder ex3 :: Max Int)
Max {getMax = 10}
>>> (preorder ex3 :: Min Int)
Min {getMin = 0}
Enter fullscreen mode Exit fullscreen mode

Another way is to use the getter of the corresponding instance.

>>> getProduct (preorder ex3)
0
>>> getSum (preorder ex3)
55
>>> getMax (preorder ex3)
10
>>> getMin (preorder ex3)
0
Enter fullscreen mode Exit fullscreen mode

Cumulative sum is not in the existing list of Semigroup. Can we add cumulative sum to this as well? Pause and think. Does it have a binary associative operator?

Think a bit more.

A bit more before I tell you the answer.

......

Yes! For cumulative sum, the binary operator appends the two lists of sums, and adds the last sum of the left hand side to every sum in the right hand side. For example:

[7,1,0,3,2] <> [5,4,6,9,8,10]
[7,8,8,11,13] <> [5,9,15,24,32,42]
[7,8,8,11,13,18=5+13,22=9+13,28=15+13,37=24+13,45=32+13,55=42+13]
Enter fullscreen mode Exit fullscreen mode

And it is associative. I am lazy, so this example is left to you.

Anyway, we need to write a wrapper.

newtype CumulativeSum a = CumulativeSum { getCumulativeSum :: [a] }
  deriving Show
Enter fullscreen mode Exit fullscreen mode

and an instance of Semigroup for it.

instance Num a => Semigroup (CumulativeSum a) where
  c1 <> c2 = let 
    l1 = getCumulativeSum c1
    l2 = getCumulativeSum c2
    nn = last l1
    in CumulativeSum (l1 ++ (fmap (nn +) l2))
Enter fullscreen mode Exit fullscreen mode

and delegate the Functor and Applicative to list.

instance Functor CumulativeSum where
  fmap f c = CumulativeSum (fmap f (getCumulativeSum c))

instance Applicative CumulativeSum where
  pure a = CumulativeSum [a]
  liftA2 f c1 c2 = CumulativeSum (liftA2 f (getCumulativeSum c1) (getCumulativeSum c2))
Enter fullscreen mode Exit fullscreen mode

and Voila!

>>> (preorder ex3 :: CumulativeSum Int)
CumulativeSum {getCumulativeSum = [7,8,8,11,13,18,22,28,37,45,55]}
>>> getCumulativeSum (preorder ex3)
[7,8,8,11,13,18,22,28,37,45,55]
Enter fullscreen mode Exit fullscreen mode

Admittedly, this is not the most elegant. We haven't made use of liftA2 of Applicative at all! We picked Applicative arbitrarily, and this is a sign that Applicative might not be the right pick here. But I am at my wit's end, so, until next time.


  1. I know this is not the canonical Tree in Haskell, but rather a nonempty tree, but the result of show looks better this way.

  2. To become Applicative list also has to be able to apply a list of functions to a list of parameters. Please also check hoogle for laws of Applicative and Semigroup.

  3. To use the (<>) operator and the various Semigroup instances we need to import Data.Semigroup.

  4. Thank you to #haskell for advices.

Top comments (1)

Collapse
 
shimmer profile image
Brian Berns

Good article. Thanks!

For trees, there are three orders to traverse it: pre-order, in-order, and post-order

These are all depth-first traversals, but of course there are other possible traversals as well. Breadth-first is often useful, but a bit more difficult to accomplish.