Advent of Code 2016 Day 7
Part 1
- I've come a long way!
- Crafting the supernet
regular expression
- Crafting the hypernet
regular expression
- Altogether now:
reduce()
,control flow
®ex
- Hunting for the culprit
- Identifying the culprit
- Updating my hypernet sequence
regex
- Updating and re-testing my algorithm
I've come a long way!
- Before attempting my first Advent of Code puzzle, I was terrified of
regular expressions
- I couldn't read them
- I couldn't write them
- I knew I'd have to resort to manual string inspection (by way of splitting into an array, counting characters, tracking indices, etc.) if I wanted to solve any puzzles that required finding patterns within strings
- But here I am - several years into AoC - and I got giddy with excitement when I noticed this puzzle would most certainly involve
regular expressions
!
That's because I've had a lot of practice as part of this series:
- The simpler exercises of extracting each line's integer values
- Knowing just enough about the syntax to craft overly-specific regular expressions, such as for one among a limited set of possible character combinations
- Becoming more familiar with
capture groups
,character classes
,escape characters
,backreferences
andlookaheads
/lookbehinds
- especially when attempting to find the same character so many times in a row
I'm so glad I attempting these puzzles backwards.
Because today's seems like it will require all of my acquired regular expression
skills to solve.
Crafting the supernet regular expression
What I need to find in a string:
any four-character sequence which consists of a pair of two different characters followed by the reverse of that pair, such as
xyyx
orabba
That seems like a relatively simple regular expression
involving two capture groups
and two backreferences
, one to each capture group
:
/(.)(.)\2\1/
-
(.)
matches nearly any character - I'll call A -
(.)
matches nearly any character - I'll call B -
\2
matches only B -
\1
matches only A
Thus, finding ABBA
...right?
Not so fast:
aaaa is invalid; the interior characters must be different
Oh, yeah! I forgot about that!
Hmm, how could I update my regex
to eliminate AAAA
or BBBB
from matching?
I need to ensure that whatever comes after \2
is not another \2
.
So, just after my \2
, I need to look ahead at the next character, and confirm it is not the same character.
A negative lookahead
!
That looks like this:
/(.)(.)\2(?!\2)\1/
-
(.)
matches nearly any character - I'll call A -
(.)
matches nearly any character - I'll call B -
\2
matches only B -
(?!\2)
looks ahead and confirms no B, at least in the next position -
\1
matches only A
Passing some tests:
-
abba
match: passes -
aaaa
no match: passes -
abab
no match: passes -
bbbb
no match: passes -
aaabbaaa
match: passes
Crafting the hypernet regular expression
What I need to avoid being in a string:
must not have an ABBA within any hypernet sequences, which are contained by square brackets
In other words:
-
abba
passes -
[abba]
fails
Hmm. Ok, I still got this, I think.
I already know how to find abba
:
/(.)(.)\2(?!\2)\1/
Now I need to account for square brackets on either side.
/\[.*(.)(.)\2(?!\2)\1.*\]/
-
\[
matches an open square bracket -
.*
matches zero or more characters -
(.)
matches nearly any character - I'll call A -
(.)
matches nearly any character - I'll call B -
\2
matches only B -
(?!\2)
looks ahead and confirms no B, at least in the next position -
\1
matches only A -
.*
matches zero or more characters -
\]
matches a closing square bracket
Passing some tests:
-
abcd[abba]abcd
match: passes -
abcd[abba]abba
match: passes (I'll handle there being an outer match in the control flow of my algorithm) -
abcd[abcd]abcd[abba]
match: passes
I think I've got both working regex
to build my algorithm!
Altogether now: reduce()
, control flow
& regex
My algorithm as pseudocode:
Split my puzzle input at each newline character into an array of strings
For each string, accumulate a sum - starting at 0
Test for a match of an ABBA in a hypernet sequence
If there was no match in this string
Test for a match of an ABBA anywhere
If there was a match
Increment the sum by 1
Return the sum
- I ran it on the examples in the instructions: correct answer!
- I ran it on my puzzle input: wrong answer!
I wasn't expecting that.
What am I missing?
Hunting for the culprit
- My algorithm found 77 valid IP addresses from my input list of 2000
- I was only told that was not the correct answer; no guidance about too low or too high
- First, I need to know whether my 77 valid IP addresses are indeed valid
- Second, I need to test more IP addresses in my input on both of my
regex
es to ensure they are working as expected - Hopefully somewhere in this litany of troubleshooting tactics I'll find the reason why my algorithm isn't generating the correct answer!
Step one, complete:
After logging the IP address and the ABBA
found in it, I feel confident that my 77 valid IP addresses are indeed valid.
Step two, epiphany!
- I copy-pasted several IP addresses from my puzzle input into regexr.com
- Many highlighted the match or lack thereof that I was expecting
But one caught my attention.
Identifying the culprit
It had this pattern:
abcd[abcd]abba[abcd]
- It was mistakenly being flagged as a match for my hypernet sequence
regex
!
That regex
again:
/\[.*(.)(.)\2(?!\2)\1.*\]/
- Open square bracket
- Zero or more of any character
- ABBA
- Zero or more of any character
- Closing square bracket
Wow. Ya. It matched abba
:
[abcd]abba[abcd]
Well, that's not good!
How do I correct my regex
to not mistakenly match ABBA
s that are outside of - but in between - two hypernet sequences?
Updating my hypernet sequence regex
The problem seems to be my regex
missing the ABBA
's closest pair of closing and opening square brackets.
It knows to look for an opening one.
But not to discard a match if it finds a closing one between the opening one and the ABBA
.
Therefore, my .
character class is too general.
I need to look for any character that isn't an opposing square bracket at each end.
Like this:
/\[[^\]]*(?:(.)(.)\2(?!\2)\1)[^\[]*\]/
-
\[
matches an open square bracket -
[^\]]*
matches zero or more characters that are not closing square brackets -
(.)
matches nearly any character - I'll call A -
(.)
matches nearly any character - I'll call B -
\2
matches only B -
(?!\2)
looks ahead and confirms no B, at least in the next position -
\1
matches only A -
[^\[]*
matches zero or more characters that are not opening square brackets -
\]
matches a closing square bracket
Voila!
Running the test again:
-
abcd[abcd]abba[abcd]
no match! -
[abba]
still matches!
Updating and re-testing my algorithm
- I replaced my hypernet
regex
with this new, more specific one - I re-ran my program
- I now had 110 valid IP addresses
- That is the correct answer!
My working, eloquent but cryptic, JavaScript algorithm:
input.reduce((valids, ip) => {
return valids += (
!ip.match(/\[[^\]]*(?:(.)(.)\2(?!\2)\1)[^\[]*\]/)
&& ip.match(/(.)(.)\2(?!\2)\1/)
) ? 1 : 0
}, 0)
Part 2
- Same strategy. Different
regex
. - A ton of trial and error
Same strategy. Different regex
.
In Part 1, I generated the correct answer using two subsequent regular expressions
:
- Flag any strings with
ABBA
inside square brackets - Count any unflagged strings with
ABBA
As long as the first test fails and the second passes, I had a match!
For Part 2, I intend to generate the correct answer using two independent new regular expressions
:
- A
supernet
followed by ahypernet
- A
hypernet
followed by asupernet
Examples of 1:
ABA[BAB]
ABC[ABC]ABA[ABC]ABC[BAB]
Examples of 2:
[BAB]ABA
ABC[BAB]ABC[ABC]ABA[ABC]
A ton of trial and error
- I used regexr.com to iterate several times toward working
regex
s - I thought I found both
I updated my ternary
operation from:
return valids += !match1 && match2 ? 1 : 0
To:
return valids += match1 || match2 ? 1 : 0
- I ran my program
- Generated 272 valid IPs: Too high
- Bummer! But at least some guidance!
- Assumption: my
regex
s were finding too many matches!
After a lot of manual analysis of the results I was matching, I noticed this pattern being incorrectly matched:
ABC[ABA]ABC[BAB]ABC
It appeared that my regex
was missing the occurrence of an open square bracket - thereby failing to detect that the matches were both in hypernet
s.
After plenty more trial and error using regexr.com, I felt confident I wrote a superior regex
.
- I ran my program
- Generated 241 valid IPs: Too low
- Bummer! But at least some more guidance!
- Assumption: my other
regex
wasn't finding enough matches!
After even more trial and error, I shifting one section of my regex
to the left.
It didn't affect matches I was already finding.
Maybe this was it?
- I ran my program
- Generated 242 valid IPs: Correct answer!
This animation combines both parts, but highlights my struggle to write both regex
s that would solve Part 2:
My working, eloquent but cryptic, JavaScript algorithm:
input.reduce((valids, ip) => {
return valids += (
ip.match(/(.)(.)(?!\2)\1[^\]]*\[.*\2\1\2[^\[]*\]/) ||
ip.match(/\[[^\]\[]*(.)(.)(?!\2)\1.*?\][^\[]*\2\1\2/)
) ? 1 : 0
}, 0)
I did it!!
- I solved both parts!
- I solved Part 1 rather quickly!
- I solved Part 2 after hours of analysis, trial and error!
- I wrote my most complicated
regular expressions
yet! - I wrote some of the shortest algorithms to date throughout AoC!
- I made a GIF that attempts to catalog each of my failed and successful
regex
s!
Wow. What a regex
gauntlet that was.
Top comments (0)