DEV Community

ZhiHong Chua
ZhiHong Chua

Posted on

Under the hood of Python

A deep dive into sortedcontainers Source Code

About 1 year ago, I solved this question in Amazon Berlin's phone screen. But without the optimal time complexity. Many times since then, I have wondered how to do it better.

Today I encountered it again in Leetcode's Daily Challenge.

Turns out the missing part is a data structure unbeknownst to me then -- Sorted Dictionary. So here are the findings for the day, including a deep dive into the Python source code for this. Beyond being a journal, I hope this post helps people:

  • realize it's not that scary to dig into Python source code,
  • learn a new data structure.

Checking the Leetcode Solution

from sortedcontainers import SortedDict

class MyCalendarThree(object):

    def __init__(self):
        self.diff = SortedDict()

    def book(self, start, end):
        """
        :type start: int
        :type end: int
        :rtype: int
        """
        self.diff[start] = self.diff.get(start, 0) + 1
        self.diff[end] = self.diff.get(end, 0) - 1
        maxK = k = 0

        for val in self.diff.values():
            k += val
            maxK = max(maxK, k)

        return maxK
Enter fullscreen mode Exit fullscreen mode

My goodness, so the question that has been haunting me was so simple?!

No, it just looks deceptively simple because of the abstracted logic in SortedDict(), which we'll look at in a bit, but first, a thing on lazy engineers...

Sorting the Standard Python Dictionary, dict()

I recall hearing that best engineers are the lazy ones. This is because they find the easiest way to do things, which is the essence of our efficiency-obsessed world (wait a minute, is that why we're having layoffs?).

Image description

In spirit of that, I tried to hack it with my existing knowledge of the standard Python dict() function. The whole article is below, but TLDR; it didn't work. Too much tuple/list/dict conversions and lambda expressions that seemed too crazy for a Leetcode Hard question, which is expected to be completed within 1 hour (mentioned by a Google engineer).

https://realpython.com/sort-python-dictionary/#sorting-dictionaries-in-python

Diving into sortedcontainers

Since we wrote from sortedcontainers import SortedDict, let's look for SortedDict() first!

Image description

Mmm.. it has pretty good documentation, quite quickly we can see the methods we need __getItem__ and __setItem__

Since __getItem__ is inherited from dict, then there is no point to explore that, we already know the function.

__setItem__ is the interesting one.

Image description

Wait.. what is that self._list_add(key)?

Image description

Image description

Looks like it involves some binary search and insert.

Gathering the ingredients

Now we know SortedDict() implementation involves:

  1. A sorted list to track order of keys in the hashmap (since keys in hashmap are not ordered).
  2. Binary Search and Insert to the sorted list.

Final Solution

class MyCalendarThree(object):

    def __init__(self):
        self.diff = {}
        self.sortedIdx = []

    def book(self, start, end):
        """
        :type start: int
        :type end: int
        :rtype: int
        """
        def binarySearchAndInsert(arr, target):
            l, r = 0, len(arr)-1

            while l <= r:
                mid = l + (r-l)//2
                if arr[mid] > target:
                    r = mid - 1
                else:
                    l = mid + 1

            arr.insert(l, target)
            return arr

        if start not in self.diff:
            self.sortedIdx = binarySearchAndInsert(self.sortedIdx, start)
        self.diff[start] = self.diff.get(start,0) + 1

        if end not in self.diff:
            self.sortedIdx = binarySearchAndInsert(self.sortedIdx, end)
        self.diff[end] = self.diff.get(end,0) - 1

        maxK = k = 0

        for idx in self.sortedIdx:
            val = self.diff[idx]
            k += val
            maxK = max(maxK, k)

        return maxK
Enter fullscreen mode Exit fullscreen mode

Conclusion

The full Leetcode discussion post I wrote is linked here.

Besides that, I think I'll continue to be a lazy engineer and just use a SortedDict() next time a similar question comes up ๐Ÿ˜Ž

Top comments (0)