⚠️ SPOILER ALERT ⚠️
This is a post with my solutions and learning from the puzzle. Don't continue reading if you haven't tried the puzzle on your own yet.
If you want to do the puzzle, visit adventofcode.com/2020/day/7.
My programming language of choice is python
and all examples below are in python.
TOC
Key learning
- Depth first search
- Possibly on part 1: Breadth first search
This puzzle can be viewed as a "graph" problem with nodes and edges. A simple way for beginners to start is utilizing a queue or recursive function.
Puzzle
The puzzle describes relationship between bags in different colors. A bag in one color can contain a certain amount of bags in other colors.
Example input:
light red bags contain 1 bright white bag, 2 muted yellow bags.
dark orange bags contain 3 bright white bags, 4 muted yellow bags.
bright white bags contain 1 shiny gold bag.
muted yellow bags contain 2 shiny gold bags, 9 faded blue bags.
shiny gold bags contain 1 dark olive bag, 2 vibrant plum bags.
dark olive bags contain 3 faded blue bags, 4 dotted black bags.
vibrant plum bags contain 5 faded blue bags, 6 dotted black bags.
faded blue bags contain no other bags.
dotted black bags contain no other bags.
Part 1
The puzzle is to find all bags that can directly or indirectly contain a shiny gold bag
. This can be viewed as a graph where bags are nodes and their relationships are edges.
An visualisation with example data:
-> green bag -> grey bag
/ \
blue bag - -> shiny gold bag
\ /
-> purple bag /
/
white bag -------------------
Our goal is to traverse from shiny gold
to the left and counting all nodes it finds. In this visualisation the answer is: white bag, blue, green, grey
Parsing input
A big part of the problem is deciding on data structure and how to parse the input into that structure.
For part 1 I aimed for this data-structure:
bags = {
"light red": "1 bright white bag, 2 muted yellow bags.",
"dark orange": "3 bright white bags, 4 muted yellow bags.",
"bright white": "1 shiny gold bag.",
"muted yellow": "2 shiny gold bags, 9 faded blue bags.",
...
}
To keep track of which bags that contain a shiny gold
bag I used a set
to avoid duplicates. Below is the code to parse in the data. This approach does the simple parsing for bare minimum for part 1:
# Extract a list of lines from file
lines = [x.strip() for x in open('07.input', 'r').readlines()]
bags = {}
for l in lines:
bag, contains = l.split('contain')
bag = bag.replace(' bags','')
bags[bag] = contains
Solution: Breadth first
Some bags will directly contain shiny gold
. Those bags can in turn be contained in other bags, which will indirectly contain shiny gold
.
This solution uses a work queue to traverse the "graph". Once we find a bag that can contain shiny gold
we add it to the queue to check if it can be contained by another bag. We keep on doing so until the work queue is empty. For every bag we find we store it in the answer
-set
This is a so called breadth first search. In this case I use a queue
.
answer = set() # Bags containing 'shiny gold'
q = ['shiny gold'] # Work queue for traversing bags
while len(q) != 0:
current = q.pop(0)
for bag in bags:
if bag in answer: # Skip if already counted.
continue
if current in bags[bag]: # If bag contains current-bag,
q.append(bag) # add to queue and answer
answer.add(bag)
print "Answer part 1: %d" % len(answer)
Solution: Depth first
Another solution is the depth first search. Both algorithms do fine in this puzzle and input-size. But in coming advent-of-code-puzzles, it is possible that only one will be sufficient enough. Therefore we practice both now.
This example is a depth first
where I use a recursive function
:
bags = {}
for l in lines:
bag, contains = l.split('contain')
bag = bag.replace(' bags','')
bags[bag] = contains
def deep_search(bag, bags):
contains = []
for b in bags:
if bag in bags[b]:
# Add b to our list
contains.append(b)
# Add all children to b in our list
contains.extend(deep_search(b, bags))
# Remove duplicates
return set(contains)
print "Answer: ", len(deep_search('shiny gold', bags))
Part 2
Parsing input
For part 2 I aimed for this data-structure:
bags = {
"light red": { "bright white": 1, "muted yellow": 2 },
"dark orange": { "bright white": 3, "muted yellow": 4 },
"bright white": { "shiny gold": 1 },
"muted yellow": { "shiny gold": 2, "faded blue": 9 },
...
}
In this way I can call bags['muted yellow']['shiny gold']
to find out how many shiny gold
bags muted yellow
contain.
This is a ugly "simple" way of solving it. Further down is an regex example which is cleaner.
bags = {}
q = []
for l in lines:
# Clean line from unnecessary words.
l = l.replace('bags', '').replace('bag', '').replace('.','')
bag, contains = l.split('contain')
bag = bag.strip()
if 'no other' in contains: # If bag doesn't contain any bag
bags[bag] = {}
continue
contains = contains.split(',')
contain_dict = {}
for c in [c.strip() for c in contains]:
amount = c[:2]
color = c[2:]
contain_dict[color] = int(amount)
bags[bag] = contain_dict
Solution
A good data structure enables us to do a clean solution. Here is an example with
a recursive function to count all bags:
def recursive_count(bag, bags):
count = 1 # Count the bag itself
contained_bags = bags[bag]
for c in contained_bags:
multiplier = contained_bags[c]
count += multiplier * recursive_count(c, bags)
return count
# Minus one to not count the shiny gold bag itself
result = recursive_count('shiny gold', bags) - 1
print "Result part 2: %d " % result
Alternative: With regex
A lot easier way to parse the data is using regex. Here is an example:
import re
from collections import defaultdict
bags = defaultdict(dict)
for l in lines:
bag = re.match(r'(.*) bags contain', l).groups()[0]
for count, b in re.findall(r'(\d+) (\w+ \w+) bag', l):
bags[bag][b] = int(count)
def part1():
answer = set()
def search(color):
for b in bags:
if color in bags[b]:
answer.add(b)
search(b)
search('shiny gold')
return len(answer)
def part2():
def search(bag):
count = 1
for s in bags[bag]:
multiplier = bags[bag][s]
count += multiplier * search(s)
return count
return search('shiny gold' ) - 1 # Rm one for shiny gold itself
Alternative: Network X
NetworkX is a good library for handling graphs. Haven't used it myself so much, but it appears quite often in the solutions megathread once there is a graph-problem.
There is a lot of documentation, but by reading solutions on the megathread you'll learn what common features to use.
Make sure you understand the above example with regex first before reading this. Here is an example solution with networkx:
import re
from collections import defaultdict
import networkx as nx
# Parse data
bags = defaultdict(dict)
for l in lines:
bag = re.match(r'(.*) bags contain', l).groups()[0]
for count, b in re.findall(r'(\d+) (\w+ \w+) bag', l):
bags[bag][b] = { 'weight': int(count)}
# Create a graph in networkx
G = nx.DiGraph(bags)
def part1():
# Reverse edges
RG = G.reverse()
# Get predecessors
predecessors = nx.dfs_predecessors(RG, 'shiny gold')
# Count predecessors
return len(predecessors)
print(part1())
def part2():
def search(node):
count = 1
# Iterate neighbors for node
for n in G.neighbors(node):
# Multiply weight with recursive search
count += G[node][n]['weight'] * search(n)
return count
return search('shiny gold') - 1
print(part2())
Some benefit in using networkx is that it gives us more powerful tools than just traversing the graph. If we would like to find the shortest path, then this one-liner would help:
print(nx.shortest_path(G, source='bright red', target='shiny gold'))
Links:
- networkx.org/documentation/networkx-2.3/reference/classes/digraph.html#
- [networkx.org/documentation/networkx-2.3/reference/algorithms/generated/networkx.algorithms.shortest_paths.generic.shortest_path.html](https://networkx.org/documentation/networkx-2.3/reference/algorithms/generated/networkx.algorithms.shortest_paths.generic.shortest_path.html#networkx.algorithms.shortest_paths.generic.shortest_path
More python tools
These are tools worth having in back of your head while doing advent of code.
defaultdict: Useful when you don't want to think about initiating all values first.
deque: Useful when you want to pop left and right from a queue.
heapq: Amazing for tool or solving problems that involve extremes such as largest, smallest, optimal, etc.
Thanks for reading!
I hope these solutions were helpful for you.
Complete code can be found at: github.com/cNille/AdventOfCode/blob/master/2020/07.py
Top comments (1)
This is an awesome post! Thanks for sharing your insights!