I was trying to understand how some syntax in Groovy worked, and it led me to a better understanding of how closures can be used. I'm sure it's all in the documentation but... well, I'm the kind of person who really needs to see it in place to understand...
List and map "comprehensions"
Borrowing from Python, I was trying to understand how to perform "list comprehension" and "dict comprehension", but in Groovy. An article I found did give me the answer, and I noticed something a little odd:
// List comprehension to double the values of each item
my_list = my_list.collect({ item ->
item * 2 // the last statement of the block is the item that gets "returned" for each item
})
// Map comprehension to add ".txt" suffix to each entry
// EDIT - from some _very_ old (pre-2012) example code prior to Groovy 1.7.9
my_map = my_map.inject([:]) { new_map, item ->
new_map[item.getKey()] = "${item.getValue()}.txt"
new_map // the returnable must be the final statement
}
// EDIT - there is a better way to do map comprehension, available since Groovy 1.7.9:
my_map = my_map.collectEntries { item -> [item.getKey(), "${item.getValue()}.txt"] }
(for details, do see the original post)
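To convince myself the three forms behave as described, here is a small self-contained sketch. The sample data and the variable names (`doubled`, `via_inject`, `via_collect`) are made up for illustration, and I use plain string concatenation instead of the "${...}" form so that the two resulting maps hold ordinary Strings and compare equal.

```groovy
// Sample data, invented for this sketch
def my_list = [1, 2, 3]
def my_map = [notes: "a", todo: "b"]

// "List comprehension": collect() builds a new list from the closure's results
def doubled = my_list.collect { item -> item * 2 }

// "Map comprehension", the old way: thread an accumulator map through inject()
def via_inject = my_map.inject([:]) { new_map, item ->
    new_map[item.key] = item.value + ".txt"
    new_map // the accumulator must be the closure's final expression
}

// "Map comprehension" with collectEntries(): return a [key, value] pair per entry
def via_collect = my_map.collectEntries { key, value -> [key, value + ".txt"] }

println doubled
println via_inject
println via_collect
```

Both map forms should produce the same result; the `collectEntries` one just has less ceremony.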
Two things that I noticed:
- In the list comprehension, the curly block is a direct argument to .collect(), but in the map comprehension, the curly block is placed after the call to .inject()
- In both cases, the block looks suspiciously like the .each { name -> operations; } notation
So I did what any system-dismantling child would do... and tried something ...
def myfunc(message, thing) { println thing }
myfunc("Hello") { println "other action" }
This showed me that thing did actually get populated... with a closure!
basic_closure$_run_closure1@76f2bbc1
And thus my journey down to Wonderland began ...
.each { } is not a true loop
Notationally, it looks like using the each { } operation on an iterable item results in a loop that steps through each item.
Well, not really: each() is a function (yes, it is) which in fact takes one argument, a function handle, and calls it once for each item in the iterable it operates on.
This is why this will not work:
def error_on_null(my_sequence) {
my_sequence.each { item ->
if(item == null)
break // Groovy will tell you that "break" must be used in a loop!
}
}
If we re-write this in more familiar constructs, this is what is actually happening:
def error_on_null(my_sequence) {
for(item in my_sequence) {
null_check(item)
}
}
def null_check(item) {
if(item == null)
{ break } // Obviously wrong!
}
We can see more explicitly in this way what is happening: the code that checks for null-ness is actually in a block and scope of its own. It does not contain a loop of its own, and so it is a syntax error to try to use break there.
When using curly braces for a code block, we are actually defining an anonymous function, and passing it along to the each() function, which itself implements the loop. This anonymous function is what is known in Groovy as a closure: a piece of code declared in one scope, and executed anywhere else, probably at a deferred time.
Similarly, with .collect( {} ) we are passing a closure that can then be called by .collect's internal logic.
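To make that concrete, here is a rough sketch of what an each()-like function could look like. `my_each` is a hypothetical name of my own, not the real implementation: the point is only that the loop lives inside the function, and the curly block is just a closure being called from it.

```groovy
// Hypothetical stand-in for each(): the loop is here, not in the caller's block
def my_each(sequence, Closure action) {
    for (item in sequence) {
        action(item) // one call of the closure per item
    }
    return sequence
}

def seen = []
my_each([1, null, 3]) { item ->
    if (item == null)
        return // leaves only this one closure call, so it acts like "continue"
    seen << item
}

println seen

// To stop early, a method designed for it (such as any) is an option
def has_null = [1, null, 3].any { item -> item == null }
println has_null
```

So inside the closure, return is the nearest thing to continue, and there is no equivalent of break at all.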
Closure parameters
Closures can have parameters too.
def greet = { greeting, name ->
println "$greeting , $name"
}
greet("Hello", "Tam")
And that's how you get the name -> notation in the .each { } call we are so much more familiar with.
A closure can be passed in with several syntactic shortcuts:
// Demo function
def call_me(clos) { clos(); }
// All of these are equivalent !
// Explicit param
call_me({ println "Hi";})
// Suffix notation
call_me() { println "Hi";}
// parenthesis omission, since it's the only argument
call_me { println "Hi";}
Function calling
In fact, function calling in Groovy can take many different forms as well.
// These are NOT the same
my_func("alpha", "beta") // (A)
my_func(a="alpha", b="beta") // (B)
my_func a: "alpha", b: "beta" // (C)
Cases A and B require the function to be defined as:
def my_func(a, b) { println "$a -> $b" ; }
However case C requires a mapping:
def my_func(Map args) {
    def a = args.get("a")
    def b = args.get("b") // you can also use `.getOrDefault(key, default_value)`
    println "$a -> $b"
}
Also this does not behave as you would expect it:
def my_func(a, b) { println "$a -> $b" ; }
my_func(a="alpha", b="beta")
my_func(b="beta", a="alpha")
The output of this is...
alpha -> beta // OK
beta -> alpha // ??
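The second instance should not have been reversed, right? Well, it is: the arguments continue to be passed in by position and not by name. It certainly feels like a bug, but it is not one. Groovy has no name=value argument syntax, so a="alpha" is just an ordinary assignment expression: it sets a script-level variable a as a side effect, and its value is then passed along positionally. It means that if you need named options (for example, to give some of them default values regardless of order), you MUST use the Map form of the arguments.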
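A small sketch of both behaviours, to run as a script. The functions `show` and `connect`, and the option names `host` and `port`, are invented for illustration.

```groovy
// Positional function: an "a=..." style call does NOT bind by name
def show(first, second) { "$first -> $second".toString() }

// Each argument is an assignment expression; its value is passed by position
def reversed = show(b = "beta", a = "alpha")
println reversed

// Side effect: the assignments created script-level variables a and b
println "a is $a, b is $b"

// Map form: options really are looked up by name, with defaults
def connect(Map args) {
    def host = args.getOrDefault("host", "localhost")
    def port = args.getOrDefault("port", 8080)
    return "$host:$port".toString()
}

def full = connect(port: 9000, host: "example.org") // order does not matter
def partial = connect(host: "example.org")          // port falls back to its default
println full
println partial
```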
Domain Specific Language: Jenkinsfile
I always did wonder how Jenkinsfile pipelines declared their own code blocks like
stage("Build stuff") {
sh "make clean && make && make install"
}
It turns out, the stage() function is defined something like this
def stage(stage_name, operation) {
org.hudson.etc.setStageName(stage_name) // for example
operation()
}
When stage() is called in my Jenkinsfile, it receives the closure I supply after it as an operation to perform. My closure (the build steps) has access to the variables and namespace in the rest of the file, and so can be passed along to the Jenkins-level stage() function, which then proceeds to call it (probably wrapped in some more complex error-handling logic).
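Here is a toy version that runs in plain Groovy, outside Jenkins. This `stage` is my own stand-in and certainly not how Jenkins implements it; the try/finally is only a guess at the kind of wrapping a real implementation might do. The variable `target` is there to show the closure seeing names from the surrounding script.

```groovy
// Toy stand-in for a Jenkins-style stage(); not the real implementation
def stage(String stage_name, Closure operation) {
    println "[stage: $stage_name] starting"
    try {
        return operation() // run the caller's block and hand back its result
    } finally {
        println "[stage: $stage_name] finished"
    }
}

def target = "install"

// The trailing block is passed in as the second argument
def command = stage("Build stuff") {
    // The closure can read variables declared around it
    "make clean && make && make $target".toString()
}

println command
```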
The Closure Gotcha
Previously I posted about a behaviour I did not understand where variables were seemingly interpolated at the very last possible moment. I originally thought it had something to do with interpolation of GStrings -- but nay!
The reason was, of course, closures!
I managed to distil the issue down to the following snippet:
operations = []
for(x in [0,1,2]) {
println "Setting operation $x"
operations[x] = { println "Running operation $x"; }
}
println "===="
for(op in operations) { op(); }
The output:
Setting operation 0
Setting operation 1
Setting operation 2
====
Running operation 2
Running operation 2
Running operation 2
The reason is that the closure captures the variable x itself, not the value it held when the closure was created. The closure runs at a deferred time, after the loop has completed, and so in every case it sees the last value x took in the loop...!
In my solution in my complaint post, I moved the closure out into a function, which resulted in it taking the value with which the function was called - which remains constant for each call, and is not affected by the loop.
Finally, a mystery solved!
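For reference, the function-based workaround looks roughly like this. `make_operation` is a name I picked for the sketch; each call gets its own `value` parameter, so each closure holds on to a different variable.

```groovy
// Each call to this function has its own "value", independent of the loop variable
def make_operation(value) {
    return { "Running operation $value".toString() }
}

def operations = []
for (x in [0, 1, 2]) {
    operations << make_operation(x)
}

// Run the closures after the loop has finished
def results = operations.collect { op -> op() }
results.each { println it }
```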
Currying the closure
Another way around the problem of deferred resolution would be to curry the closure - call its .curry() method to produce a new copy of the closure.
for(x in [0,1,2]) {
operations[x] = { y -> // take an argument
println "Running operation $y"
}.curry(x) // immediately feed the outer variable into the argument
}
What we are doing here is defining a closure with 1 argument, and calling the curry function with the intended value for it. This returns a new closure, which is effectively a pre-seeded copy of the original.
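Currying is not specific to loops. As a minimal sketch, here it is applied to a two-argument greeting closure (the names `greet` and `hello` are just for illustration): curry() fixes the leftmost argument and returns a closure that takes the remaining one.

```groovy
def greet = { greeting, name -> "$greeting, $name".toString() }

// Pre-seed the first argument; the result takes only the remaining one
def hello = greet.curry("Hello")

def message = hello("Tam")
println message
```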
Extra Gotcha: the top-level run method
Another item I found whilst trawling for answers recently is how "global" variables work in a Groovy script file, and how code not encapsulated in a function relates to variables equally not encapsulated.
Code that looks like it all lives in one and the same scope actually does not...
In essence:
// Runtime.groovy
implicit_property = "Available everywhere"
def implicit_local = "Not available inside functions"
println "Top level"
println implicit_property
println implicit_local
def a_func() {
println "In function"
println implicit_property // OK
println implicit_local // Fail - it is "local" to the main "run()"
}
a_func()
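Groovy compiles to a Java-equivalent layout and so the above ends up looking roughly like this (a simplification: the implicit class really extends groovy.lang.Script, and undeclared variables live in the script's Binding rather than in a true static field):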
// equivalent Runtime.java
class Runtime { // from the filename - "Runtime.groovy"
static String implicit_property; // Declaration separate from assignment
public void run() { // Called at runtime by Groovy evaluator
Runtime.implicit_property = "Available everywhere";
String implicit_local = "Not available inside functions";
System.out.println("Top level");
System.out.println(Runtime.implicit_property);
System.out.println(implicit_local);
a_func();
}
public void a_func() {
System.out.println("In function");
System.out.println(Runtime.implicit_property); // OK
System.out.println(implicit_local); // Fail - it is "local" to the main "run()"
}
}
This outputs
Top level
Available everywhere
Not available inside functions
In function
Available everywhere
Caught: groovy.lang.MissingPropertyException: No such property: implicit_local for class: runtime
groovy.lang.MissingPropertyException: No such property: implicit_local for class: runtime
at runtime.a_func(runtime.groovy:11)
at runtime.run(runtime.groovy:14)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
.... see what happened...? 😱
The Groovy interpreter first compiles the script to JVM bytecode, which puts all the "naked code" (not encapsulated in a function) in an instance run() method of an implicit class. It then instantiates the implicit class, and calls this run() method.
Note that
- the implicit_local only has scope within the run() method and is not available to the other methods
- the implicit_property is only populated if the run() method executes (it is possible in normal Groovy to load a file and not execute run(), with a new GroovyShell().parse() action)
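The second point can be checked directly with GroovyShell. In this sketch the script source is an inline string of my own rather than a file, and I look at the script's binding (where undeclared variables are kept) before and after calling run().

```groovy
// Script source held in a string for the sake of a self-contained example
def source = '''
implicit_property = "Available everywhere"
def read_property() { implicit_property }
'''

// parse() only compiles and instantiates; the body has not run yet
def script = new GroovyShell().parse(source)
def before = script.binding.hasVariable("implicit_property")

// Only run() executes the top-level code and populates the variable
script.run()
def after = script.binding.hasVariable("implicit_property")
def value = script.read_property()

println "before run(): $before"
println "after run(): $after"
println value
```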
Loading files: return this
When using the Jenkinsfile load instruction, you need the return this final line to be able to call the defined methods
// hello.groovy
def say_hi(name) { println name }
return this
// main.groovy
greeter = load "hello.groovy"
greeter.say_hi("Sam")
This is again due to the compile step pushing the body code into a run() function. When load executes the file, the body code (run()) is executed immediately, and its return value becomes the result of the load call.
This is, in a nutshell, roughly how Jenkinsfile's load() function works (as actual Groovy code - try it!):
// Behaves pretty much like Jenkins's own `load` operation
def load(path) {
// This will simply compile the file and create an instance
// of the implicit class
def loaded_file = new GroovyShell().parse(new File(path))
// And then the <instance>.run() function is actually called separately
// hence why we need "return this" at the bottom of the file
return loaded_file.run()
}
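If you want to try the idea without creating files, the same thing can be sketched with the script source in a string. The content mirrors hello.groovy above, except that say_hi returns its text instead of printing it, purely so the result is easy to inspect.

```groovy
// Stand-in for the content of hello.groovy, held in a string
def source = '''
def say_hi(name) { "Hi " + name }
return this
'''

// parse() creates the script instance; run() executes the body,
// whose last statement hands the instance itself back to us
def greeter = new GroovyShell().parse(source).run()

def greeting = greeter.say_hi("Sam")
println greeting
```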
Conclusion
A dive down this rabbit-hole finally allowed me to make sense of a core part of the Groovy language, and identify a very interesting gotcha.
This feels like more of a symptom of closures being very implicit in Groovy, and possibly of the false similarities in notation between function declarations and objects: that closure in my Jenkinsfile looked to me like an "object" in JavaScript (think JSON notation), and as such I had not expected it to be the source of the deferred execution.
Another day, another lesson.