Christopher Nilsson

Posted on Dec 7, 2020

Advent of Code: 2020 Day 07

#adventofcode #python #networkx

⚠️ SPOILER ALERT ⚠️
This is a post with my solutions and learning from the puzzle. Don't continue reading if you haven't tried the puzzle on your own yet.

If you want to do the puzzle, visit adventofcode.com/2020/day/7.

My programming language of choice is python and all examples below are in python.

Key learning

Depth first search
Possibly on part 1: Breadth first search

This puzzle can be viewed as a "graph" problem with nodes and edges. A simple way for beginners to start is utilizing a queue or recursive function.

Puzzle

The puzzle describes relationship between bags in different colors. A bag in one color can contain a certain amount of bags in other colors.

Example input:

light red bags contain 1 bright white bag, 2 muted yellow bags.
dark orange bags contain 3 bright white bags, 4 muted yellow bags.
bright white bags contain 1 shiny gold bag.
muted yellow bags contain 2 shiny gold bags, 9 faded blue bags.
shiny gold bags contain 1 dark olive bag, 2 vibrant plum bags.
dark olive bags contain 3 faded blue bags, 4 dotted black bags.
vibrant plum bags contain 5 faded blue bags, 6 dotted black bags.
faded blue bags contain no other bags.
dotted black bags contain no other bags.

Part 1

The puzzle is to find all bags that can directly or indirectly contain a shiny gold bag. This can be viewed as a graph where bags are nodes and their relationships are edges.

An visualisation with example data:

           -> green bag -> grey bag
          /                       \
blue bag -                         -> shiny gold bag
          \                       /
            -> purple bag       /
                               /
white bag -------------------

Our goal is to traverse from shiny gold to the left and counting all nodes it finds. In this visualisation the answer is: white bag, blue, green, grey

Parsing input

A big part of the problem is deciding on data structure and how to parse the input into that structure.

For part 1 I aimed for this data-structure:

bags = {
  "light red": "1 bright white bag, 2 muted yellow bags.",
  "dark orange": "3 bright white bags, 4 muted yellow bags.",
  "bright white": "1 shiny gold bag.",
  "muted yellow":  "2 shiny gold bags, 9 faded blue bags.",
  ...
}

To keep track of which bags that contain a shiny gold bag I used a set to avoid duplicates. Below is the code to parse in the data. This approach does the simple parsing for bare minimum for part 1:

# Extract a list of lines from file
lines = [x.strip() for x in open('07.input', 'r').readlines()]

bags = {}
for l in lines:
    bag, contains = l.split('contain')
    bag = bag.replace(' bags','')
    bags[bag] = contains

Solution: Breadth first

Some bags will directly contain shiny gold. Those bags can in turn be contained in other bags, which will indirectly contain shiny gold.

This solution uses a work queue to traverse the "graph". Once we find a bag that can contain shiny gold we add it to the queue to check if it can be contained by another bag. We keep on doing so until the work queue is empty. For every bag we find we store it in the answer-set

This is a so called breadth first search. In this case I use a queue.

answer = set()                      # Bags containing 'shiny gold'
q = ['shiny gold']                  # Work queue for traversing bags
while len(q) != 0:
    current = q.pop(0)
    for bag in bags:
        if bag in answer:             # Skip if already counted.
            continue
        if current in bags[bag]:      # If bag contains current-bag,
            q.append(bag)             # add to queue and answer
            answer.add(bag)
print "Answer part 1: %d" % len(answer)

Solution: Depth first

Another solution is the depth first search. Both algorithms do fine in this puzzle and input-size. But in coming advent-of-code-puzzles, it is possible that only one will be sufficient enough. Therefore we practice both now.

This example is a depth first where I use a recursive function:

bags = {}
for l in lines:
    bag, contains = l.split('contain')
    bag = bag.replace(' bags','')
    bags[bag] = contains

def deep_search(bag, bags):
    contains = []
    for b in bags:
        if bag in bags[b]:
            # Add b to our list
            contains.append(b)
            # Add all children to b in our list
            contains.extend(deep_search(b, bags))

    # Remove duplicates
    return set(contains)

print "Answer: ", len(deep_search('shiny gold', bags))

Part 2

Parsing input

For part 2 I aimed for this data-structure:

bags = {
  "light red": { "bright white": 1, "muted yellow": 2 },
  "dark orange": { "bright white": 3, "muted yellow": 4 },
  "bright white": { "shiny gold": 1 },
  "muted yellow": { "shiny gold": 2, "faded blue": 9 },
  ...
}

In this way I can call bags['muted yellow']['shiny gold'] to find out how many shiny gold bags muted yellow contain.

This is a ugly "simple" way of solving it. Further down is an regex example which is cleaner.

bags = {}
q = []
for l in lines:
    # Clean line from unnecessary words.
    l = l.replace('bags', '').replace('bag', '').replace('.','')

    bag, contains = l.split('contain')
    bag = bag.strip()

    if 'no other' in contains: # If bag doesn't contain any bag
        bags[bag] = {}
        continue

    contains = contains.split(',')
    contain_dict = {}
    for c in [c.strip() for c in contains]:
        amount = c[:2]
        color = c[2:]
        contain_dict[color] = int(amount)
    bags[bag] = contain_dict

Solution

A good data structure enables us to do a clean solution. Here is an example with
a recursive function to count all bags:

def recursive_count(bag, bags):
    count = 1  # Count the bag itself

    contained_bags = bags[bag]
    for c in contained_bags:
        multiplier = contained_bags[c]
        count += multiplier * recursive_count(c, bags)
    return count

# Minus one to not count the shiny gold bag itself
result = recursive_count('shiny gold', bags) - 1
print "Result part 2: %d " % result

Alternative: With regex

A lot easier way to parse the data is using regex. Here is an example:

import re
from collections import defaultdict

bags = defaultdict(dict)
for l in lines:
    bag = re.match(r'(.*) bags contain', l).groups()[0]
    for count, b in re.findall(r'(\d+) (\w+ \w+) bag', l):
        bags[bag][b] = int(count)

def part1():
    answer = set()
    def search(color):
        for b in bags:
            if color in bags[b]:
                answer.add(b)
                search(b)
    search('shiny gold')
    return len(answer)

def part2():
    def search(bag):
        count = 1
        for s in bags[bag]:
            multiplier = bags[bag][s]
            count += multiplier * search(s)
        return count
    return search('shiny gold' ) - 1  # Rm one for shiny gold itself

Alternative: Network X

NetworkX is a good library for handling graphs. Haven't used it myself so much, but it appears quite often in the solutions megathread once there is a graph-problem.

There is a lot of documentation, but by reading solutions on the megathread you'll learn what common features to use.

Make sure you understand the above example with regex first before reading this. Here is an example solution with networkx:

import re
from collections import defaultdict
import networkx as nx

# Parse data
bags = defaultdict(dict)
for l in lines:
    bag = re.match(r'(.*) bags contain', l).groups()[0]
    for count, b in re.findall(r'(\d+) (\w+ \w+) bag', l):
        bags[bag][b] = { 'weight': int(count)}

# Create a graph in networkx
G = nx.DiGraph(bags)

def part1():
    # Reverse edges
    RG = G.reverse()
    # Get predecessors
    predecessors = nx.dfs_predecessors(RG, 'shiny gold')
    # Count predecessors
    return len(predecessors)
print(part1())

def part2():
    def search(node):
        count = 1
        # Iterate neighbors for node
        for n in G.neighbors(node):
            # Multiply weight with recursive search
            count += G[node][n]['weight'] * search(n)
        return count
    return search('shiny gold') - 1
print(part2())

Some benefit in using networkx is that it gives us more powerful tools than just traversing the graph. If we would like to find the shortest path, then this one-liner would help:

print(nx.shortest_path(G, source='bright red', target='shiny gold'))

Links:

networkx.org/documentation/networkx-2.3/reference/classes/digraph.html#
[networkx.org/documentation/networkx-2.3/reference/algorithms/generated/networkx.algorithms.shortest_paths.generic.shortest_path.html](https://networkx.org/documentation/networkx-2.3/reference/algorithms/generated/networkx.algorithms.shortest_paths.generic.shortest_path.html#networkx.algorithms.shortest_paths.generic.shortest_path

More python tools

These are tools worth having in back of your head while doing advent of code.

defaultdict: Useful when you don't want to think about initiating all values first.

deque: Useful when you want to pop left and right from a queue.

heapq: Amazing for tool or solving problems that involve extremes such as largest, smallest, optimal, etc.

Thanks for reading!

I hope these solutions were helpful for you.

Complete code can be found at: github.com/cNille/AdventOfCode/blob/master/2020/07.py

Top comments (1)

Levi Guo • Dec 13 '20

This is an awesome post! Thanks for sharing your insights!

DEV Community

Advent of Code: 2020 Day 07

TOC

Key learning

Puzzle

Part 1

Parsing input

Solution: Breadth first

Solution: Depth first

Part 2

Parsing input

Solution

Alternative: With regex

Alternative: Network X

More python tools

Thanks for reading!

Top comments (1)

Read next

7 Powerful Python Metaprogramming Techniques for Dynamic Code

BCEWithLogitsLoss in PyTorch

🚀 Validating User Input with FastAPI: An Example with Custom Validators

Python beats Javascript, Next.js Leap & the AI Coding Wars