DEV Community

Cover image for Binary Decision Tree in Ruby ? Say hello to Composite Pattern 🌳
Pimp My Ruby for Wecasa

Posted on

Binary Decision Tree in Ruby ? Say hello to Composite Pattern 🌳

Decision trees are very common algorithms in the world of development. On paper, they are quite simple. You chain if-else statements nested within each other. The problem arises when the tree starts to grow. If you're not careful, it becomes complex to read and, therefore, difficult to maintain.

In this article, we'll see how we implemented decision trees at Wecasa to make them as readable and maintainable as possible.

Before we go any further, let's take some time to define what a Composite is.

What Links Composite and Binary Decision Tree?

Composite is a structural design pattern. It is a way of arranging objects in a tree structure, which allows us to have a logical hierarchy.

When we talk about a decision tree, we can represent it as boxes that can contain boxes, which can contain boxes, which can contain… a final result.

It's precisely this recursive aspect that makes the decision tree so powerful.
Let's take an example. We want to create a decision tree to calculate the amount of the promotion a customer is eligible for on our e-commerce site.

  • In input, our tree receives a User object with all its attributes.
  • In output, our tree returns a number.

Image description

Each node in our tree has a condition. You can notice by the color code that we have three types of Composites here: Creation Date, Zip Code, and whether the user has already made a purchase. Now that our example is set, let's see how this translates into code!


Firstly, we need tools !

In our implementation of the decision tree, we will need three classes:

  1. The Node Class: The base class for all other classes. Its purpose is to encapsulate the common logic for all different Nodes in my tree.
class Node
  attr_accessor :left_node, :right_node

  def call(user)
    raise NotImplementedError
  end

  private

  # this method is used to ease composites definition
  def call_with_condition(user)
    if yield
      left_node.call(user)
    else
      right_node.call(user)
    end
  end
end
Enter fullscreen mode Exit fullscreen mode
  1. The Leaf Class: Its purpose is to end the tree. You can find them as bottom nodes without any child nodes. The only job they have to accomplish is to return the value they encapsulate.
class Leaf < Node
  attr_accessor :value

  def initialize(value)
    self.value = value
  end

  def call(_)
    value
  end
end
Enter fullscreen mode Exit fullscreen mode
  1. The Composite Classes: Their role is to contain the condition to guide between the success branch and the failure branch, to proceed to the next step in the algorithm. They also contain a method to set arguments. We will see later why this method is useful.
class CreationDateComposite < Node
  attr_accessor :days

  def initialize(left_node, right_node)
    self.left_node = left_node
    self.right_node = right_node
  end

  def call(user)
    call_with_condition(user) do
      self.days.ago > user.created_at
    end
  end

  def set_days(days)
    self.days = days
    self
  end
end

class ZipCodeComposite < Node
  [...]
end

class HaveBoughtComposite < Node
  [...]
end
Enter fullscreen mode Exit fullscreen mode

I only show CreationDateComposite so you understand the logic, no need to show the rest.


How We Finally Manage to Create a Binary Decision Tree

Thanks to all the classes we have built, we will now create our first complete tree. Here is the proposed implementation:

class PromotionTree
  # We use Singleton because we don't need to rebuild the tree
  # everytime we call the class
  include Singleton

  TREE = CreationDateComposite.new(
    ZipCodeComposite.new(
      BoughtComposite.new(
        Leaf.new(60),
        Leaf.new(150)
      ),
      BoughtComposite.new(
        Leaf.new(30),
        Leaf.new(100)
      )
    ).set_zip_code('75'),
    BoughtComposite.new(
      Leaf.new(30),
      ZipCodeComposite.new(
        Leaf.new(75),
        ZipCodeComposite.new(
          Leaf.new(78),
          Leaf.new(0)
        ).set_zip_code('78')
      ).set_zip_code('75')
    )
  ).set_days(7).freeze

  def call(user)
    TREE.call(user)
  end
end
Enter fullscreen mode Exit fullscreen mode

And now we can simply use the tree :

user = User.last
PromotionTree.instance.call(user) # => 60 🎉
Enter fullscreen mode Exit fullscreen mode

That being explained, this implementation offers several benefits:

  • No if-else mixed in my business logic: By encapsulating conditions in separate objects, we have a clear separation of responsibilities. Each condition is handled by a specific composite, making the code more readable, elegant and maintainable.
  • Centralization of conditions in separate objects: This allows easy reuse and modification of conditions without touching the main application logic. Conditions can be changed or extended by creating new composites or modifying existing ones thanks to OOP.
  • Ease of testing: Each component of the decision tree can be tested in isolation, simplifying the unit testing process. Tests can focus on specific conditions without having to simulate the entire decision tree.
  • Extensibility: Adding new conditions or modifying existing conditions is simple and straightforward. Just create new composite nodes and insert them into the existing decision tree.
  • It's easy to build a new tree: Let's say tomorrow I want to build a new kind of tree. I always have all my Nodes ready to be used.

But apart from this, I can see some downsides :

  • The setup cost is High: I think it's not that pertinent to use this if you setup this kind of architecture if you don't plan to create other trees.
  • It uses recursion: The limit of your tree will be the limit of your Stack Size. So if your tree is bigger than your allowed stack size, you won't be able to use it. Fortunately, this value is quite big : 10867 for my setup. But it can depend on your setup !
  • You need to get into the habit to read it quickly: At first it's kinda disturbing. The fact that we need to read on the top the Composite, and the bottom the value it used is quite weird. But don't be afraid of this syntax, I will publish another article to show you how I implemented a DSL for Tree building based on this architecture.

Conclusion

Using Composite to implement decision trees allows us to structure our conditions in a clear and maintainable way. Rather than getting lost in a maze of nested if-else statements, we have a logical hierarchy of decisions, each node playing a well-defined role.
Thank you for reading. I invite you to subscribe so you don't miss the next articles that will be released.

Top comments (0)