Robert Mion

Posted on Aug 29, 2022

Internet Protocol Version 7

#adventofcode #algorithms #programming #regex

Advent of Code 2016 Day 7

Part 1

I've come a long way!
Crafting the supernet regular expression
Crafting the hypernet regular expression
Altogether now: reduce(), control flow & regex
Hunting for the culprit
Identifying the culprit
Updating my hypernet sequence regex
Updating and re-testing my algorithm

I've come a long way!

Before attempting my first Advent of Code puzzle, I was terrified of regular expressions
I couldn't read them
I couldn't write them
I knew I'd have to resort to manual string inspection (by way of splitting into an array, counting characters, tracking indices, etc.) if I wanted to solve any puzzles that required finding patterns within strings
But here I am - several years into AoC - and I got giddy with excitement when I noticed this puzzle would most certainly involve regular expressions!

That's because I've had a lot of practice as part of this series:

The simpler exercises of extracting each line's integer values
Knowing just enough about the syntax to craft overly-specific regular expressions, such as for one among a limited set of possible character combinations
Becoming more familiar with capture groups, character classes, escape characters, backreferences and lookaheads/lookbehinds - especially when attempting to find the same character so many times in a row

I'm so glad I attempting these puzzles backwards.

Because today's seems like it will require all of my acquired regular expression skills to solve.

Crafting the supernet `regular expression`

What I need to find in a string:

any four-character sequence which consists of a pair of two different characters followed by the reverse of that pair, such as xyyx or abba

That seems like a relatively simple regular expression involving two capture groups and two backreferences, one to each capture group:

/(.)(.)\2\1/

(.) matches nearly any character - I'll call A
(.) matches nearly any character - I'll call B
\2 matches only B
\1 matches only A

Thus, finding ABBA...right?

Not so fast:

aaaa is invalid; the interior characters must be different

Oh, yeah! I forgot about that!

Hmm, how could I update my regex to eliminate AAAA or BBBB from matching?

I need to ensure that whatever comes after \2 is not another \2.

So, just after my \2, I need to look ahead at the next character, and confirm it is not the same character.

A negative lookahead!

That looks like this:

/(.)(.)\2(?!\2)\1/

(.) matches nearly any character - I'll call A
(.) matches nearly any character - I'll call B
\2 matches only B
(?!\2) looks ahead and confirms no B, at least in the next position
\1 matches only A

Passing some tests:

abba match: passes
aaaa no match: passes
abab no match: passes
bbbb no match: passes
aaabbaaa match: passes

Crafting the hypernet `regular expression`

What I need to avoid being in a string:

must not have an ABBA within any hypernet sequences, which are contained by square brackets

In other words:

abba passes
[abba] fails

Hmm. Ok, I still got this, I think.

I already know how to find abba:

/(.)(.)\2(?!\2)\1/

Now I need to account for square brackets on either side.

/\[.*(.)(.)\2(?!\2)\1.*\]/

\[ matches an open square bracket
.* matches zero or more characters
(.) matches nearly any character - I'll call A
(.) matches nearly any character - I'll call B
\2 matches only B
(?!\2) looks ahead and confirms no B, at least in the next position
\1 matches only A
.* matches zero or more characters
\] matches a closing square bracket

Passing some tests:

abcd[abba]abcd match: passes
abcd[abba]abba match: passes (I'll handle there being an outer match in the control flow of my algorithm)
abcd[abcd]abcd[abba] match: passes

I think I've got both working regex to build my algorithm!

Altogether now: `reduce()`, `control flow` & `regex`

My algorithm as pseudocode:

Split my puzzle input at each newline character into an array of strings

For each string, accumulate a sum - starting at 0
  Test for a match of an ABBA in a hypernet sequence
  If there was no match in this string
    Test for a match of an ABBA anywhere
    If there was a match
      Increment the sum by 1

Return the sum

I ran it on the examples in the instructions: correct answer!
I ran it on my puzzle input: wrong answer!

I wasn't expecting that.

What am I missing?

Hunting for the culprit

My algorithm found 77 valid IP addresses from my input list of 2000
I was only told that was not the correct answer; no guidance about too low or too high
First, I need to know whether my 77 valid IP addresses are indeed valid
Second, I need to test more IP addresses in my input on both of my regexes to ensure they are working as expected
Hopefully somewhere in this litany of troubleshooting tactics I'll find the reason why my algorithm isn't generating the correct answer!

Step one, complete:
After logging the IP address and the ABBA found in it, I feel confident that my 77 valid IP addresses are indeed valid.

Step two, epiphany!

I copy-pasted several IP addresses from my puzzle input into regexr.com
Many highlighted the match or lack thereof that I was expecting

But one caught my attention.

Identifying the culprit

It had this pattern:

abcd[abcd]abba[abcd]

It was mistakenly being flagged as a match for my hypernet sequence regex!

That regex again:

/\[.*(.)(.)\2(?!\2)\1.*\]/

Open square bracket
Zero or more of any character
ABBA
Zero or more of any character
Closing square bracket

Wow. Ya. It matched abba:

    [abcd]abba[abcd]

Well, that's not good!

How do I correct my regex to not mistakenly match ABBAs that are outside of - but in between - two hypernet sequences?

Updating my hypernet sequence `regex`

The problem seems to be my regex missing the ABBA's closest pair of closing and opening square brackets.

It knows to look for an opening one.

But not to discard a match if it finds a closing one between the opening one and the ABBA.

Therefore, my . character class is too general.

I need to look for any character that isn't an opposing square bracket at each end.

Like this:

/\[[^\]]*(?:(.)(.)\2(?!\2)\1)[^\[]*\]/

\[ matches an open square bracket
[^\]]* matches zero or more characters that are not closing square brackets
(.) matches nearly any character - I'll call A
(.) matches nearly any character - I'll call B
\2 matches only B
(?!\2) looks ahead and confirms no B, at least in the next position
\1 matches only A
[^\[]* matches zero or more characters that are not opening square brackets
\] matches a closing square bracket

Voila!

Running the test again:

abcd[abcd]abba[abcd] no match!
[abba] still matches!

Updating and re-testing my algorithm

I replaced my hypernet regex with this new, more specific one
I re-ran my program
I now had 110 valid IP addresses
That is the correct answer!

My working, eloquent but cryptic, JavaScript algorithm:

input.reduce((valids, ip) => {
    return valids += (
      !ip.match(/\[[^\]]*(?:(.)(.)\2(?!\2)\1)[^\[]*\]/) 
      && ip.match(/(.)(.)\2(?!\2)\1/)
    ) ? 1 : 0
  }, 0)

Part 2

Same strategy. Different regex.
A ton of trial and error

Same strategy. Different `regex`.

In Part 1, I generated the correct answer using two subsequent regular expressions:

Flag any strings with ABBA inside square brackets
Count any unflagged strings with ABBA

As long as the first test fails and the second passes, I had a match!

For Part 2, I intend to generate the correct answer using two independent new regular expressions:

A supernet followed by a hypernet
A hypernet followed by a supernet

Examples of 1:

ABA[BAB]
ABC[ABC]ABA[ABC]ABC[BAB]

Examples of 2:

[BAB]ABA
ABC[BAB]ABC[ABC]ABA[ABC]

A ton of trial and error

I used regexr.com to iterate several times toward working regexs
I thought I found both

I updated my ternary operation from:

return valids += !match1 && match2 ? 1 : 0

To:

return valids += match1 || match2 ? 1 : 0

I ran my program
Generated 272 valid IPs: Too high
Bummer! But at least some guidance!
Assumption: my regexs were finding too many matches!

After a lot of manual analysis of the results I was matching, I noticed this pattern being incorrectly matched:

ABC[ABA]ABC[BAB]ABC

It appeared that my regex was missing the occurrence of an open square bracket - thereby failing to detect that the matches were both in hypernets.

After plenty more trial and error using regexr.com, I felt confident I wrote a superior regex.

I ran my program
Generated 241 valid IPs: Too low
Bummer! But at least some more guidance!
Assumption: my other regex wasn't finding enough matches!

After even more trial and error, I shifting one section of my regex to the left.

It didn't affect matches I was already finding.

Maybe this was it?

I ran my program
Generated 242 valid IPs: Correct answer!

This animation combines both parts, but highlights my struggle to write both regexs that would solve Part 2:

My working, eloquent but cryptic, JavaScript algorithm:

input.reduce((valids, ip) => {
    return valids += (
      ip.match(/(.)(.)(?!\2)\1[^\]]*\[.*\2\1\2[^\[]*\]/) || 
      ip.match(/\[[^\]\[]*(.)(.)(?!\2)\1.*?\][^\[]*\2\1\2/)
    ) ? 1 : 0
  }, 0)

I did it!!

I solved both parts!
I solved Part 1 rather quickly!
I solved Part 2 after hours of analysis, trial and error!
I wrote my most complicated regular expressions yet!
I wrote some of the shortest algorithms to date throughout AoC!
I made a GIF that attempts to catalog each of my failed and successful regexs!

Wow. What a regex gauntlet that was.

DEV Community

Internet Protocol Version 7

Advent of Code 2016 Day 7

Part 1

I've come a long way!

Crafting the supernet `regular expression`

Crafting the hypernet `regular expression`

Altogether now: `reduce()`, `control flow` & `regex`

Hunting for the culprit

Identifying the culprit

Updating my hypernet sequence `regex`

Updating and re-testing my algorithm

Part 2

Same strategy. Different `regex`.

A ton of trial and error

I did it!!

Top comments (0)

Read next

React.js Explained: A Comprehensive Guide to Building Modern Web Applications

How to Update Java JDK [A Simple Guide]

AoC 24 - Day 4: Ceres Search

JavaScript Best Practices

Advent of Code 2016 Day 7

Part 1

I've come a long way!

Crafting the supernet regular expression

Crafting the hypernet regular expression

Altogether now: reduce(), control flow & regex

Hunting for the culprit

Identifying the culprit

Updating my hypernet sequence regex

Updating and re-testing my algorithm

Part 2

Same strategy. Different regex.

A ton of trial and error

I did it!!

Read next

React.js Explained: A Comprehensive Guide to Building Modern Web Applications

How to Update Java JDK [A Simple Guide]

AoC 24 - Day 4: Ceres Search

JavaScript Best Practices

Crafting the supernet `regular expression`

Crafting the hypernet `regular expression`

Altogether now: `reduce()`, `control flow` & `regex`

Updating my hypernet sequence `regex`

Same strategy. Different `regex`.