reduce (inject) is one of the most powerful methods on the Enumerable module, which means it is available on instances of any class that includes this module, including Array, Hash, Set and Range.
reduce can be used in a MapReduce process, is often the basis for comprehensions, and is a great way to group values or to reduce a set of values to a single value.
This article quickly shows you how to skip values / conditionally return values during a reduce iteration and how to break early / return a different value and stop iteration.
Recap 💬
From the documentation, given an instance enum (an Enumerable), calling enum.reduce:
# Combines all elements of <i>enum</i> by applying a binary
# operation, specified by a block or a symbol that names a
# method or operator.
An example of using reduce would be to write a function that sums all the elements in a collection:
##
# Sums each item in the enumerable (naive)
#
# @param [Enumerable] enum the enumeration of items to sum
# @return [Numeric] the sum
#
def summation(enum)
  sum = 0
  enum.each do |item|
    sum += item
  end
  sum
end
##
# Sums each item in the enumerable (reduce block)
#
# On each iteration, the block's result is passed in as previous_result for the next iteration.
#
# @param [Enumerable] enum the enumeration of items to sum
# @return [Numeric] the sum
#
def summation(enum)
  enum.reduce do |previous_result, item|
    previous_result + item
  end
end
##
# Sums each item in the enumerable (reduce method)
#
# On each iteration, the :+ symbol is sent as a message to the current result
# with the next value as argument. The result is the new current result.
#
# @param [Enumerable] enum the enumeration of items to sum
# @return [Numeric] the sum
#
def summation(enum)
  enum.reduce(:+)
end
##
# Sums each item in the enumerable (built-in Enumerable#sum)
#
def summation(enum)
  enum.sum
end
reduce takes an optional initial value, which, when given, is used instead of the first item of the collection.
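A minimal sketch of the difference the initial value makes. The helper names sum_without_seed and sum_with_seed are made up for this illustration:

```ruby
# Hypothetical helpers, only here to compare both call styles.

# No initial value: the first element seeds the accumulator and the
# block runs for the remaining elements only.
def sum_without_seed(enum)
  enum.reduce { |sum, n| sum + n }
end

# Initial value given: the block runs for every element, starting from seed.
def sum_with_seed(enum, seed)
  enum.reduce(seed) { |sum, n| sum + n }
end

sum_without_seed([1, 2, 3])   # => 6
sum_with_seed([1, 2, 3], 10)  # => 16

# The difference matters most for empty collections:
sum_without_seed([])          # => nil
sum_with_seed([], 0)          # => 0
```

Passing an initial value is the safer default whenever the collection might be empty.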
How to control the flow?
When working with reduce you might find yourself in one of two situations:
- you want to conditionally return a different value for the iteration (which is used as base value for the next iteration)
- you want to break out early (stop iteration altogether)
next ⏭
The next keyword allows you to return early from a block; the value you pass to next becomes the block's result for that iteration. This works in any enumeration.
Let’s say you want the sum of a set of numbers, but with half of any even number and double of any odd number:
def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    result + i * (i.even? ? 0.5 : 2)
  end
end
Not too bad. But now another business requirement comes in to skip any number
under 5:
def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    if i < 5
      result
    else
      result + i * (i.even? ? 0.5 : 2)
    end
  end
end
Ugh. That’s not very nice Ruby code. Using next, it could look like:
def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    next result if i < 5
    next result + i * 0.5 if i.even?
    result + i * 2
  end
end
next works in any enumeration, so if you’re just processing items using .each, you can use it too:
(1..10).each do |num|
  next if num.odd?
  puts num
end
# 2
# 4
# 6
# 8
# 10
# => 1..10
break 🛑
Instead of skipping to the next item, you can completely stop iteration of an enumerator using break.
If we have the same business requirements as before, but we have to return the
number 42 if the item is exactly 7, this is what it would look like:
def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    break 42 if i == 7
    next result if i < 5
    next result + i * 0.5 if i.even?
    result + i * 2
  end
end
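To see both keywords at work, here is the same method again with a few calls. The input ranges are picked just for illustration:

```ruby
def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    break 42 if i == 7                 # stop iterating; reduce returns 42
    next result if i < 5               # skip: carry the result over unchanged
    next result + i * 0.5 if i.even?   # half of any even number
    result + i * 2                     # double of any odd number
  end
end

# 1..4 are skipped, 5 is doubled (10), 6 is halved (3.0).
halfly_even_doubly_odd(1..6)   # => 13.0

# The 7 is reached, so whatever was accumulated so far is discarded.
halfly_even_doubly_odd(1..10)  # => 42

# No 7 in the input: 8 is halved (4.0), 9 is doubled (18).
halfly_even_doubly_odd([8, 9]) # => 22.0
```

Note that break replaces the return value of the whole reduce call, whereas next only replaces the value handed to the following iteration.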
Again, this works in any loop. So if you’re using find to try to find an item in your enumeration and want to change the return value of that find, you can do so using break:
def find_my_red_item(enum)
  enum.find do |item|
    # The items are hashes, so read their values with [] (not method calls)
    break item[:name] if item[:color] == 'red'
  end
end
find_my_red_item([
  { name: "umbrella", color: "black" },
  { name: "shoe", color: "red" },
  { name: "pen", color: "blue" }
])
# => "shoe"
StopIteration
You might have heard about or seen raise StopIteration. It is a special exception that you can use to stop iteration of an enumeration, as it is caught by Kernel#loop, but its use cases are limited as you should not try to control flow using raise or fail. The airbrake blog has a good article about this use case.
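For reference, a small sketch of the one place where StopIteration shows up naturally: an external enumerator inside Kernel#loop. The method name drain is made up for this example:

```ruby
# Hypothetical helper: pulls every value out of an external enumerator.
def drain(enumerator)
  collected = []
  loop do
    # Enumerator#next raises StopIteration once the enumerator is exhausted;
    # Kernel#loop rescues that exception and simply ends the loop.
    collected << enumerator.next
  end
  collected
end

drain([1, 2, 3].each) # => [1, 2, 3]
```

Here the exception is raised by Ruby itself, not by your own code, which is quite different from raising it by hand to steer a reduce.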
When to use reduce
If you need a guideline for when to use reduce, look no further. I use the following four rules to determine whether I need reduce, each_with_object, or something else.
I use reduce when:
- reducing a collection of values to a smaller result (e.g. 1 value)
- grouping a collection of values (use group_by if possible)
- changing immutable primitives / value objects (returning a new value)
- you need a new value (e.g. new Array or Hash)
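As a rough illustration of the grouping rule, here is a sketch that groups words by their first letter with reduce, returning a new Hash on every iteration, next to the group_by equivalent. The method name and sample words are invented for this example:

```ruby
# Hypothetical example: group words by their first letter.
def group_by_first_letter(words)
  words.reduce({}) do |groups, word|
    key = word[0]
    # merge returns a new Hash, so the previous accumulator is never mutated
    groups.merge(key => groups.fetch(key, []) + [word])
  end
end

words = %w[apple avocado banana blueberry cherry]

group_by_first_letter(words)
# => {"a"=>["apple", "avocado"], "b"=>["banana", "blueberry"], "c"=>["cherry"]}

# When the grouping is this simple, group_by says the same thing in one line:
words.group_by { |word| word[0] }
```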
Alternatives 🔀
When the use case does not match the guidelines above, most of the time I actually need each_with_object, which has a similar signature but does not build a new value from the return value of the block. Instead, it iterates the collection with a predefined “object”, making it much easier to use logic inside the block:
doubles = (1..10).each_with_object([]) do |num, result|
  result << num * 2
  # same as result.push(num * 2)
end
# => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
doubles_over_ten = (1..10).each_with_object([]) do |num, result|
  result << num * 2 if num > 5
end
# => [12, 14, 16, 18, 20]
Use each_with_object when:
- building a new container (e.g. Array or Hash). Note that you’re not really reducing the current collection to a smaller result, but instead conditionally or unconditionally mapping values.
- you want logic in your block without repeating the result value (because you must provide a return value when using reduce)
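The second point is easiest to see side by side. A small sketch, with method names invented for this comparison, that builds the same Hash both ways:

```ruby
# Hypothetical helpers: map each word to its length.

# With reduce, the block's return value becomes the next accumulator,
# so the Hash has to be returned explicitly on every iteration.
def lengths_with_reduce(words)
  words.reduce({}) do |result, word|
    result[word] = word.length
    result
  end
end

# With each_with_object, the block's return value is ignored and the same
# object is passed along. Note the swapped block arguments: item first.
def lengths_with_each_with_object(words)
  words.each_with_object({}) do |word, result|
    result[word] = word.length
  end
end

lengths_with_reduce(%w[pen shoe])           # => {"pen"=>3, "shoe"=>4}
lengths_with_each_with_object(%w[pen shoe]) # => {"pen"=>3, "shoe"=>4}
```

Forgetting the trailing result in the reduce version would hand the Integer returned by the assignment to the next iteration, which is exactly the kind of mistake each_with_object avoids.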
My use case
The reason I looked into control flow using reduce is that I was iterating through a list of value objects that represented a migration path. Without using lazy, I wanted an elegant way of representing when these migrations should run, so I used semantic versioning. The migrations enumerable is a sorted list of migrations with a semantic version attached.
migrations.reduce(input) do |migrated, (version, migration)|
  migrated = migration.call(migrated)
  next migrated unless current_version.in_range?(version)
  break migrated
end
The function in_range? determines if a migration is executed, based on the current “input” version and the semantic version of the migration. This will execute migrations until the “current” version becomes in-range, at which point it should execute the final migration and stop.
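To make the flow concrete, here is a self-contained sketch under heavy assumptions: versions are plain integers instead of semantic versions, version_class and its in_range? rule are stand-ins I made up, and the migrations are simple lambdas that append a label. None of this is the real implementation.

```ruby
# Hypothetical stand-in for the version object: a migration version counts
# as "in range" once it has reached the current version number.
version_class = Struct.new(:number) do
  def in_range?(version)
    version >= number
  end
end

# Runs migrations in order and stops right after the first in-range one.
def run_migrations(migrations, input, current_version)
  migrations.reduce(input) do |migrated, (version, migration)|
    migrated = migration.call(migrated)
    next migrated unless current_version.in_range?(version)
    break migrated
  end
end

# Sorted list of [version, migration] pairs (assumed shape).
migrations = [
  [1, ->(data) { data + ["v1"] }],
  [2, ->(data) { data + ["v2"] }],
  [3, ->(data) { data + ["v3"] }]
]

run_migrations(migrations, [], version_class.new(2))
# => ["v1", "v2"]
```

The third migration never runs: the second one is the first whose version is in range, so break returns the value migrated so far.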
The alternatives were less favourable:
- take_while, select and friends are able to filter the list, but that requires multiple iterations of the migrations collection (filter, then “execute”);
- find would be a good candidate, but I needed to change the input, so that would require a bookkeeping variable keeping track of “migrated”. Bookkeeping variables are almost never necessary in Ruby.