So recently I came across the fact that Elixir is full of macros, and that it allows you to interact with its internal syntax tree pretty easily (using quote), and I thought that was a pretty cool thing.
In the simplest of terms macros are special functions designed to return a quoted expression that will be inserted into our application code. -- Elixir Lang
One example that I came across was the following -- say we have an expression that we want to evaluate:
if true do
  "hello world"
end
But sometimes you want to write a single-line expression. No problemo amigo, Elixir's got you covered!
if true, do: "hello world"
This statement is a single line long, but it excites me enough to write a post about it. Let's dive into why!
There are three things to note here in this one, simple expression.
- There's a comma before do
- There's a colon after do
- There's no end keyword
These three are hints to a reveal that blew my mind.
The if statement you just saw? It's not really a language keyword. It looks like a normal function call, but it's actually a macro.
The do block you just saw? Not special syntax either: it ends up as a keyword list (well, technically it's a keyword list keyword according to the source code). If you don't know what a keyword list is, keep reading, I explain it a little further down.
How did the hints help?
Let's revisit the hints and see how the hints help us uncover the truth which has been hiding in plain sight all along!
1) Comma
If you paid attention earlier, the first hint points out that true and the do statement are separated by a comma.
Do you know what else is separated? Function arguments.
The if statement is really just a macro that is called like a function with an arity of 2: the first argument is the condition to evaluate, the second is the action to take, and the comma separates the two.
This means that we can translate our earlier statement to look like this, and it'll still work.
iex(2)> if(true, do: "hello world")
"hello world"
All I did was add parentheses around the arguments - don't they look a lot more familiar now? ;)
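If you'd like to check the "it's a macro, not a function" claim yourself, here's a minimal sketch using Kernel's own introspection helpers (macro_exported?/3 and function_exported?/3). It only assumes that if/2 lives in Kernel, which is what the AST metadata later in this post suggests.

```elixir
# Ask the runtime how Kernel exposes if/2.
IO.inspect(macro_exported?(Kernel, :if, 2))    # true: it is exported as a macro
IO.inspect(function_exported?(Kernel, :if, 2)) # false: there is no plain function if/2
```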
2) Colon
The second and third hints are actually very closely related, because they both point to one thing: keyword list.
Let's take a refresher on what a keyword list is, or what it looks like.
Official definition on Keyword List
A keyword is a list of two-element tuples where the first element of the tuple is an atom and the second element can be any value. - Elixir Lang
We know what a list looks like -> []
We know what a tuple looks like -> {}
So, a keyword list looks like this:
[{:atom, "any_value"}]
Alright, that doesn't look very nice/easy to type out, so Elixir provides syntactic sugar to write a keyword list like this:
[atom: "any_value"]
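A quick way to convince yourself that the two spellings are the same data is to compare them directly. This is just a small sketch; the variable names are mine, and Keyword.keyword?/1 and Keyword.get/2 come from Elixir's standard Keyword module.

```elixir
long_form = [{:atom, "any_value"}]
short_form = [atom: "any_value"]

# Both spellings build exactly the same list of {atom, value} tuples.
IO.inspect(long_form == short_form)        # true
IO.inspect(Keyword.keyword?(short_form))   # true
IO.inspect(Keyword.get(long_form, :atom))  # "any_value"
```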
At this point, we can apply this newfound logic back to our previous if statement, and that translates to this:
iex(2)> if(true, [do: "hello world"])
"hello world"
Again, all I did here was add square brackets around the do keyword, and it's becoming clearer what it really is.
In fact, we could get a little crazier and use the original keyword list syntax:
iex(2)> if(true, [{:do, "hello world"}])
"hello world"
Hooray, still works!
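The same trick extends to else, which is simply a second key in the same keyword list. Here's a small sketch (the strings are made up for illustration) showing the three spellings side by side:

```elixir
# else is just another entry in the keyword list passed as the second argument.
IO.inspect(if(false, do: "hello", else: "goodbye"))            # "goodbye"
IO.inspect(if(false, [do: "hello", else: "goodbye"]))          # "goodbye"
IO.inspect(if(false, [{:do, "hello"}, {:else, "goodbye"}]))    # "goodbye"
```

Note that the list still has to be written out literally at the call site, since the macro inspects it at compile time.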
3) Where's the end? (and proofy time!)
Alright, so I've told you that they're the same thing and that they all work exactly the same, but as far as you're concerned I could be blabbering random shits (Although I promise I'm not). So let's prove that what I said was right!
I have also yet to tell you about the 3rd hint, which is - where did the end keyword go?
So, the end never makes it into the syntax tree, because there's no real need to represent the end of a statement in an Abstract Syntax Tree -- you should be able to infer it from the structure of the tree itself. I also show evidence of this in Elixir's source code later on.
I won't talk too much about ASTs, partly because it's a whole other topic on its own, but also because I honestly don't know enough about them yet (😬). Here's a quick primer on what I understand so far:
An AST is how the machine understands the flow of code. It is a tree built from the programming language's syntax that describes how execution should flow.
In essence, you write a block of text (the code), the language tries to understand it (parses it), and then it creates a mental model of what it understood (the abstract syntax tree).
The reason I'm running the different expressions through the AST to prove that they're the same is that the same-ness of the code doesn't depend on any syntactic sugar, but rather on what the language really understands from it, and in our case that's the language's mental model (the AST).
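To make "text in, tree out" a bit more concrete, here's a tiny sketch using Code.string_to_quoted!/1 from the standard library on a made-up arithmetic expression. Elixir represents each call as a three-element tuple of name, metadata and arguments; I pattern-match on the result rather than print the metadata, since its exact contents can vary.

```elixir
# Parse a string of source code into Elixir's AST.
ast = Code.string_to_quoted!("1 + 2 * 3")

# The outer node is the +, and its second argument is the nested * node,
# so operator precedence is already encoded in the shape of the tree.
{:+, _meta, [1, {:*, _inner_meta, [2, 3]}]} = ast

IO.inspect(ast)
```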
Time to verify!
first = quote do
  if true do
    "hello world"
  end
end
# {:if, [context: Elixir, import: Kernel], [true, [do: "hello world"]]}
second = quote do
  if true, do: "hello world"
end
# {:if, [context: Elixir, import: Kernel], [true, [do: "hello world"]]}
third = quote do
  if(true, do: "hello world")
end
# {:if, [context: Elixir, import: Kernel], [true, [do: "hello world"]]}
fourth = quote do
  if(true, [do: "hello world"])
end
# {:if, [context: Elixir, import: Kernel], [true, [do: "hello world"]]}
fifth = quote do
  if(true, [{:do, "hello world"}])
end
# {:if, [context: Elixir, import: Kernel], [true, [do: "hello world"]]}
We can run a final check to see if they are all equal to each other!
first == second #=> true
second == third #=> true
third == fourth #=> true
fourth == fifth #=> true
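If equal trees aren't convincing enough, you can also run them. Here's a small sketch that evaluates two of the quoted forms with Code.eval_quoted/1, which returns a {result, bindings} tuple; the variable names are my own.

```elixir
block_form = quote do
  if true do
    "hello world"
  end
end

tuple_form = quote do
  if(true, [{:do, "hello world"}])
end

# Evaluate each AST and keep only the result, ignoring the bindings.
{block_result, _bindings} = Code.eval_quoted(block_form)
{tuple_result, _bindings} = Code.eval_quoted(tuple_form)

IO.inspect(block_result)                  # "hello world"
IO.inspect(block_result == tuple_result)  # true
```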
I'll be frank, I don't know how Elixir actually decides to continue reading/parsing the statement until an end is met. I feel like there's a whole other topic of lexing, parsing and tokenizing that I'll need to understand first (along with ASTs, which I will, but not yet!)
BUT! What I can add to support this article is the evidence that when Elixir outputs the AST back as a string (with Macro.to_string/1), it has to explicitly append the word end to the output string, as seen here!
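You can watch this happen from the outside too. The sketch below quotes the one-line form (which has no end anywhere), turns it back into source with Macro.to_string/1, and then re-evaluates that string. I'm deliberately not showing the printed source, because the exact formatting differs between Elixir versions.

```elixir
ast = quote do
  if true, do: "hello world"
end

# Turn the AST back into source code. The end in the output is generated
# by Elixir, since nothing in the AST stores it.
source = Macro.to_string(ast)
IO.puts(source)

# The regenerated source still evaluates to the same value.
{value, _bindings} = Code.eval_string(source)
IO.inspect(value)  # "hello world"
```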
Takeaway
if true, do: "hello world"
There are two things that I learned from just this one simple line of code:
- if is just a language macro (it expands down to a case statement), and in fact, there's a whole new world of metaprogramming/macros in Elixir!
- do is nothing more than a keyword list.
I had thought that basic things like if statements or declaring a function should be core language features with special meanings, but Elixir is able to beautifully build upon fundamental features like AST/keyword list and use that to build out other features.
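If you're curious about the "expands down to a case" part, Macro.expand_once/2 lets you peel off one layer of macro. This is only a sketch: the exact expanded code is an implementation detail of Kernel and may differ between Elixir versions, so I only look at the name of the outer node.

```elixir
ast = quote do
  if true, do: "hello world"
end

# Expand the outermost macro call exactly once, in the current environment.
expanded = Macro.expand_once(ast, __ENV__)

# The outer node is no longer :if.
{name, _meta, _args} = expanded
IO.inspect(name)

# Print the expanded code as source to see what if turned into.
IO.puts(Macro.to_string(expanded))
```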
This also means you can build any sort of macro that you want! Want to build an unless feature like in Ruby? You can do that (as shown in the official Elixir guide). Want to build a while loop? You can do that too (Chris McCord shows this in his introductory video to Metaprogramming Elixir).
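Here's roughly what a home-made unless could look like. Treat it as a minimal sketch and not the guide's exact code: the names MyMacros, my_unless and Demo are hypothetical, and this version only handles a do block (no else).

```elixir
defmodule MyMacros do
  # Hypothetical macro: runs the block only when the condition is falsy.
  # The second argument is pattern-matched as a keyword list with a :do key.
  defmacro my_unless(condition, do: block) do
    quote do
      if !unquote(condition), do: unquote(block)
    end
  end
end

defmodule Demo do
  # Macros must be required before they can be used.
  require MyMacros

  def run(value) do
    MyMacros.my_unless value do
      "ran"
    end
  end
end

IO.inspect(Demo.run(false))  # "ran"
IO.inspect(Demo.run(true))   # nil
```

Notice that the macro receives the do block exactly the way this post describes it: as a keyword list in the last argument.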
In fact.... let me just leave you with one thing. def itself is a macro, defined using defmacro. And defmacro itself is a macro, defined using defmacro. Mind blown yet? If not, feel free to re-read that sentence until it is. Read more here.
Author's note
I've probably glossed over a lot of details (particularly about AST) because frankly, I do not understand them well enough yet (and the point of the article is not really on AST, they're just a means to proving they're all equal without the syntactic sugars).
But this means I know exactly what I'll be diving into next, and I can't wait to learn more about lexers, parsers, tokenizers and ASTs :) If you have any good resources on them, comment down below ⬇️, I'd really appreciate them!
P.S Credit where credit's due, I learned about this from the Elixir Slack channel over here, Martin Svalin was explaining that they're all equivalent, and I decided to dive a little deeper and explore the AST to verify what he's saying. Spoiler: he's right!
Top comments (4)
You can take it a bit farther, too, with something like:
This makes it even more obviously a keyword list.
Great post.
Can I suggest adding a definition of “macro” somewhere in the beginning? Not everybody knows what that word means and you use it a lot.
Very good point! Thanks so much for pointing out, I've clarified a little what it means and linked to relevant docs :) Thanks!
You also called if a function once, but it's especially not a function