A few years ago, I participated in a public workshop organized by Wix. We practiced TDD by implementing the Game of Life kata under different constraints. One constraint was to write code without loops. I was shocked and stuck; I could not grasp how that was possible. Eventually, I figured it out: map, filter, and reduce! I already knew the concept, but it wasn’t part of my daily programming (back then I wrote in C++). When I returned to C#, I started using it more and more and came to understand its benefits.
In this post, I show how to write clean code by replacing foreach loops with map, filter, and reduce. This approach is relevant to any language that offers these functions.
TL;DR
- Declarative code written with map, filter, and reduce is more readable and concise than imperative code written with foreach loops.
- It reduces boilerplate code and helps avoid careless bugs.
- It leads to clean code by separating the logic of which items to operate on from the operation itself.
- The resulting code adheres better to the Single Responsibility Principle.
What Are Map, Filter & Reduce?
Map, filter, and reduce are functions that work on collections. Each receives a collection and a function. Map and filter return a new collection based on the results of that function, while reduce returns a single value.
- Map — maps each object in a collection into a new object. It returns a new collection with the same size as the given collection.
- Filter — filters a collection according to a given condition. The new collection includes only the elements that pass the condition.
- Reduce — reduces a collection down to a single value.
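In C# LINQ these three functions go by the names Select, Where, and Aggregate. The sketch below is only an illustration of the three definitions above; the class and method names (MapFilterReduceDemo, Squares, Evens, Sum) are made up for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MapFilterReduceDemo
{
    // Map: every number becomes its square; the result has the same size as the input.
    public static List<int> Squares(IEnumerable<int> numbers) =>
        numbers.Select(n => n * n).ToList();

    // Filter: keep only the numbers that pass the condition.
    public static List<int> Evens(IEnumerable<int> numbers) =>
        numbers.Where(n => n % 2 == 0).ToList();

    // Reduce: fold the whole collection into a single value, starting from the seed 0.
    public static int Sum(IEnumerable<int> numbers) =>
        numbers.Aggregate(0, (total, n) => total + n);
}
```

For the input 1, 2, 3, 4 these return the squares 1, 4, 9, 16, the even numbers 2, 4, and the sum 10.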
I wrote the code examples with C# LINQ and simplified them for the post. In real-life systems, the code is rarely divided so clearly into functions; instead, most of it sits inside the loop.
Replace foreach with map
Sometimes we see code with this structure:
Now, let’s rewrite it using LINQ Select, which is the LINQ version of map:
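A minimal sketch of what such a loop and its Select counterpart might look like. The Order and Invoice types and the ToInvoice function are hypothetical stand-ins for whatever a real loop transforms; CountCalls is included only to make LINQ's laziness visible.

```csharp
using System.Collections.Generic;
using System.Linq;

public record Order(int Id, decimal Price, int Quantity);  // hypothetical input type
public record Invoice(int OrderId, decimal Total);          // hypothetical output type

public static class MapExample
{
    // Hypothetical per-item operation.
    public static Invoice ToInvoice(Order order) =>
        new Invoice(order.Id, order.Price * order.Quantity);

    // Loop version: create a list, transform each item, add it, return the list.
    public static List<Invoice> WithForeach(IEnumerable<Order> orders)
    {
        var invoices = new List<Invoice>();
        foreach (var order in orders)
        {
            invoices.Add(ToInvoice(order));
        }
        return invoices;
    }

    // LINQ version: Select maps each order to an invoice.
    public static List<Invoice> WithSelect(IEnumerable<Order> orders) =>
        orders.Select(order => ToInvoice(order)).ToList();

    // Counts how often the lambda ran before and after ToList().
    public static (int Before, int After) CountCalls()
    {
        var calls = 0;
        var query = new[] { 1, 2, 3 }.Select(n => { calls++; return n * 2; });
        var before = calls;   // the query has not been enumerated yet
        _ = query.ToList();   // enumerating the query runs the lambda once per item
        return (before, calls);
    }
}
```

Both versions return the same invoices. CountCalls returns (0, 3): nothing runs until the query is enumerated.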
* I call ToList() because LINQ is lazy: without it, the query would not run until something enumerates the result.
This code is shorter — a single line!
Replace if with filter
Sometimes we see code with these structures:
The loops mix two pieces of logic:
- Which items to operate on.
- The operation itself.
We can replace if with LINQ Where, which is the LINQ version of filter. When we replace if (someCondition) continue; with Where, we write the negated form of the condition.
The two pieces of logic are now separated, and each has a single responsibility.
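As an illustration, here is a sketch of the continue pattern and its Where counterpart. The Ticket type and the Notify operation are hypothetical, and Notify returns a string instead of performing a side effect so the result is easy to inspect.

```csharp
using System.Collections.Generic;
using System.Linq;

public record Ticket(string Title, bool IsClosed);  // hypothetical item type

public static class FilterExample
{
    // Hypothetical operation performed on each selected item.
    public static string Notify(Ticket ticket) => $"Reminder: {ticket.Title}";

    // Loop version: the guard clause and the operation share one loop body.
    public static List<string> WithContinue(IEnumerable<Ticket> tickets)
    {
        var notifications = new List<string>();
        foreach (var ticket in tickets)
        {
            if (ticket.IsClosed) continue;
            notifications.Add(Notify(ticket));
        }
        return notifications;
    }

    // LINQ version: Where keeps the items the loop did NOT skip,
    // so the condition appears in its negated form.
    public static List<string> WithWhere(IEnumerable<Ticket> tickets) =>
        tickets.Where(ticket => !ticket.IsClosed)
               .Select(ticket => Notify(ticket))
               .ToList();
}
```

The Where line answers "which items?" and the Select line answers "what do we do with them?", so each can change without touching the other.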
Replace Boilerplate Code with Any
Sometimes we see code with these structures:
These are patterns for “return true if at least one of the items returns true for a given condition” aka the Any function. LINQ Any determines whether any element of a sequence exists or satisfies a condition. Any is a specific form of a reduce function.
Even though we are used to writing these boilerplate patterns, there is always a chance of introducing bugs by initializing the flag incorrectly or by confusing the AND operator with the OR operator.
Let’s rewrite the code by using LINQ Any:
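A sketch of one such flag-based loop next to the Any version, using a hypothetical "is there a negative number?" check. PredicateCallsUntilFound is added only to show how many items Any inspects before it stops.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AnyExample
{
    // Boilerplate version: a flag, a loop, a condition, and an early exit.
    public static bool HasNegativeLoop(IEnumerable<int> numbers)
    {
        var found = false;
        foreach (var n in numbers)
        {
            if (n < 0)
            {
                found = true;
                break;
            }
        }
        return found;
    }

    // LINQ version: the intent is stated directly.
    public static bool HasNegative(IEnumerable<int> numbers) =>
        numbers.Any(n => n < 0);

    // Counts how many items Any inspects before it can answer.
    public static int PredicateCallsUntilFound(IEnumerable<int> numbers)
    {
        var calls = 0;
        numbers.Any(n => { calls++; return n < 0; });
        return calls;
    }
}
```

For the input 1, -2, 3, 4 both versions return true, and PredicateCallsUntilFound returns 2 because the remaining items are never inspected.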
This code is clearer, shorter (single line!), and emphasizes the code’s meaning. It can also be more efficient than a loop that lacks an early exit, because Any stops as soon as the result can be determined.
Replace Boilerplate Code with All
Sometimes we see code with these structures:
These patterns mean “return true if ALL the items return true for a given condition” aka the All function. LINQ All determines whether all elements of a sequence satisfy a condition. All is a specific form of a reduce function.
Similar to Any, there is a chance of miswriting this boilerplate code and causing bugs.
Let’s rewrite the code by using LINQ All.
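A sketch of the flag-based loop and its All counterpart, using a hypothetical "are all numbers positive?" check. Note that the flag must start as true here, the opposite of the Any pattern, which is exactly the kind of detail that is easy to get wrong.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AllExample
{
    // Boilerplate version: the flag starts as true and flips on the first failure.
    public static bool AllPositiveLoop(IEnumerable<int> numbers)
    {
        var allPositive = true;
        foreach (var n in numbers)
        {
            if (n <= 0)
            {
                allPositive = false;
                break;
            }
        }
        return allPositive;
    }

    // LINQ version: the intent is stated directly.
    public static bool AllPositive(IEnumerable<int> numbers) =>
        numbers.All(n => n > 0);
}
```

Both versions return true for an empty collection, since no item fails the condition.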
This code is clearer, shorter (single line!), and emphasizes the code’s meaning.
Code Evolution
In this section, I show how writing code with foreach loops can lead to spaghetti code while writing code with map, filter, and reduce results in clear, readable, and maintainable code.
Let’s imagine we are building a fun system for organizing fun days. 💃 🎉 🍻
Each employee should receive a welcome email for the fun day they registered for. Here is the code:
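A loop-based version of this feature might look roughly like the sketch below. The names IsPastEmployee, IsFutureEmployee, IsEmployeeRegistered, and SendWelcomeEmail come from the discussion later in the post; everything else (the Employee and Funday shapes, the today parameter, and returning the email text instead of sending it) is an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data shapes.
public record Employee(int Id, string Name, DateTime StartDate, DateTime? EndDate);

public record Funday(string Name, HashSet<int> RegisteredIds)
{
    public bool IsEmployeeRegistered(int employeeId) => RegisteredIds.Contains(employeeId);
}

public static class WelcomeEmailLoop
{
    public static bool IsPastEmployee(Employee e, DateTime today) =>
        e.EndDate.HasValue && e.EndDate.Value < today;

    public static bool IsFutureEmployee(Employee e, DateTime today) =>
        e.StartDate > today;

    // Stand-in for really sending an email: returns the text that would be sent.
    public static string SendWelcomeEmail(Employee employee, Funday funday) =>
        $"Welcome to {funday.Name}, {employee.Name}!";

    public static List<string> SendWelcomeEmails(
        IEnumerable<Employee> employees, IEnumerable<Funday> fundays, DateTime today)
    {
        var sent = new List<string>();
        foreach (var employee in employees)
        {
            if (IsPastEmployee(employee, today)) continue;
            if (IsFutureEmployee(employee, today)) continue;

            foreach (var funday in fundays)
            {
                if (funday.IsEmployeeRegistered(employee.Id))
                {
                    sent.Add(SendWelcomeEmail(employee, funday));
                    break; // implicitly picks the first fun day the employee registered for
                }
            }
        }
        return sent;
    }
}
```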
Now we discover that some people forgot to register. They were unaware of the fun days 😲. What should we do? Let’s send them a reminder email to register for a fun day.
Sending a reminder also operates only on current employees. The current code already finds the fun day for each employee, so this functionality is quite simple to add.
Great! Now people will not miss out on the fun. 😃
Now we discover that some people registered for a fun day but forgot to register for the optional activities. Let’s send them a reminder to register for the optional activities.
The current code already finds the fun day for each employee, so let’s use it to add the new functionality.
Yummy 😋 we got spaghetti… code! 🤦‍♀️
The problems in the above code are:
- Many pieces of logic are mixed together, which violates the Single Responsibility Principle.
- In each row, employee has passed different filters, so it is hard to grasp which filters it has passed.
- The code is quite nested (4 nesting levels).
Let’s Try Again — Rewrite 💣💣💣
Now we start from scratch, writing the code with LINQ.
Let’s rewrite the first feature — welcome email.
The original code:
Rewrite with LINQ:
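A sketch of how the LINQ rewrite might look. GetCurrentEmployees, employee2Funday, SendWelcomeEmail, Where, and FirstOrDefault are the names the post uses; MapEmployeesToFundays, the data shapes, the today parameter, and returning the email text instead of sending it are assumptions made for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data shapes.
public record Employee(int Id, string Name, DateTime StartDate, DateTime? EndDate);

public record Funday(string Name, HashSet<int> RegisteredIds)
{
    public bool IsEmployeeRegistered(int employeeId) => RegisteredIds.Contains(employeeId);
}

public static class WelcomeEmails
{
    public static bool IsPastEmployee(Employee e, DateTime today) =>
        e.EndDate.HasValue && e.EndDate.Value < today;

    public static bool IsFutureEmployee(Employee e, DateTime today) =>
        e.StartDate > today;

    // Filtering lives in its own function; the conditions are negated
    // compared with the "continue" guards of the loop version.
    public static IEnumerable<Employee> GetCurrentEmployees(
        IEnumerable<Employee> employees, DateTime today) =>
        employees.Where(e => !IsPastEmployee(e, today))
                 .Where(e => !IsFutureEmployee(e, today));

    // Builds employee2Funday: each employee mapped to the first fun day
    // they registered for, or null when they registered for none.
    public static Dictionary<Employee, Funday?> MapEmployeesToFundays(
        IEnumerable<Employee> employees, IReadOnlyList<Funday> fundays) =>
        employees.ToDictionary(
            e => e,
            e => fundays.FirstOrDefault(f => f.IsEmployeeRegistered(e.Id)));

    // Stand-in for really sending an email: returns the text that would be sent.
    public static string SendWelcomeEmail(Employee employee, Funday funday) =>
        $"Welcome to {funday.Name}, {employee.Name}!";

    // Operates only on employees who are registered for a fun day.
    public static List<string> SendWelcomeEmails(
        IReadOnlyDictionary<Employee, Funday?> employee2Funday) =>
        employee2Funday
            .Where(employeeAndFunday => employeeAndFunday.Value != null)
            .Select(employeeAndFunday =>
                SendWelcomeEmail(employeeAndFunday.Key, employeeAndFunday.Value!))
            .ToList();
}
```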
What are the differences?
In both implementations, SendWelcomeEmail operates only on employees who are registered for a fun day. In the LINQ implementation, I separated the logic of whom to operate on from the operation itself.
- I separated the filtering of current employees from the employees' loop. The filtering is done in a separate function called GetCurrentEmployees, which emphasizes the code's meaning. I replaced if (IsPastEmployee(employee)) continue; and if (IsFutureEmployee(employee)) continue; with Where.
- Instead of searching for the fun day inside the employees' loop, I created a dictionary between an employee and a fun day called employee2Funday. I replaced if (funday.IsEmployeeRegistered(employee.Id)) with employee2Funday.Where.
- We assume that an employee is registered for a single fun day. The original code implicitly chooses the first fun day by using the break statement. The new implementation uses LINQ FirstOrDefault, which is more explicit. FirstOrDefault is a specific form of a reduce function.
Now let's implement the next feature — a reminder email to register for a fun day for employees who didn’t register for any fun day.
The original code:
Rewrite with LINQ:
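A sketch of this feature, assuming the employee2Funday dictionary maps an employee to null when they registered for no fun day. SendUnregisteredToFunday is the name the post uses; SendReminderEmail, the simplified data shapes, and returning the email text instead of sending it are assumptions.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified data shapes.
public record Employee(int Id, string Name);
public record Funday(string Name);

public static class FundayReminders
{
    // Stand-in for really sending an email: returns the text that would be sent.
    public static string SendReminderEmail(Employee employee) =>
        $"{employee.Name}, please register for a fun day.";

    // Operates only on employees who are not registered for any fun day.
    public static List<string> SendUnregisteredToFunday(
        IReadOnlyDictionary<Employee, Funday?> employee2Funday) =>
        employee2Funday
            .Where(employeeAndFunday => employeeAndFunday.Value == null)
            .Select(employeeAndFunday => SendReminderEmail(employeeAndFunday.Key))
            .ToList();
}
```

The method needs nothing from the welcome-email code except the shared dictionary, which is what lets the two features live apart.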
What are the differences?
- I separated the logic of whom to operate on from the operation itself. The function SendUnregisteredToFunday operates only on employees who are not registered for any fun day.
- In the original code, the two functionalities are tangled together. In the rewritten code, each functionality has its own method. This is possible because the data this functionality operates on, namely the employees and whether they are registered for a fun day, exists outside of the first functionality's implementation.
Now let's implement the last feature — a reminder email for registering to the optional activities.
The original code:
Rewrite with LINQ:
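A sketch of this feature. SendEmailAboutOptionalActivities and IsUnregisteredToOptionalActivities are the names the post uses; the Activity type with its IsOptional flag, the other data shapes, and returning the email text instead of sending it are assumptions made for illustration.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;

// Hypothetical data shapes.
public record Employee(int Id, string Name);

public record Activity(string Name, bool IsOptional, HashSet<int> RegisteredIds)
{
    public bool IsEmployeeRegistered(int employeeId) => RegisteredIds.Contains(employeeId);
}

public record Funday(string Name, List<Activity> Activities);

public static class OptionalActivityReminders
{
    // Where picks the optional activities; Any checks whether the employee joined one of them.
    public static bool IsUnregisteredToOptionalActivities(Employee employee, Funday funday) =>
        !funday.Activities
               .Where(activity => activity.IsOptional)
               .Any(activity => activity.IsEmployeeRegistered(employee.Id));

    // Operates only on employees who are registered for a fun day
    // but not for any of its optional activities.
    public static List<string> SendEmailAboutOptionalActivities(
        IReadOnlyDictionary<Employee, Funday?> employee2Funday) =>
        employee2Funday
            .Where(pair => pair.Value != null)
            .Where(pair => IsUnregisteredToOptionalActivities(pair.Key, pair.Value!))
            .Select(pair => $"{pair.Key.Name}, {pair.Value!.Name} has optional activities you can still join.")
            .ToList();
}
```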
What are the differences?
- Again, I separated the logic of whom to operate on from the operation itself. The function SendEmailAboutOptionalActivities operates only on employees who are registered for a fun day but not registered for any optional activity.
- In the original code, the functionalities are tangled together. In the rewritten code, each functionality has its own method. This is possible because the data this functionality operates on, namely the employees and whether they are registered for a fun day, exists outside of the other functionalities' implementations.
- IsUnregisteredToOptionalActivities is implemented with Where and Any instead of boilerplate code. It is shorter and clearer.
Map, filter, and reduce come from the functional programming paradigm, which has other concepts that help write clean, readable, and less error-prone code. Some of these concepts are immutable classes, avoiding variable reassignment, and avoiding shared state.
I currently write code in Scala and I barely see foreach loops. Out of curiosity, I gathered some statistics from the codebase I work on: map has ~3500 occurrences, filter has ~3000 occurrences, and foreach has only ~500 occurrences.
Top comments (6)
Nice post!
I'm a big fan of clean code myself.
One small comment: I'd use shorter argument names for lambda expressions.
For instance, instead of "employeAndFunday" I'd write "emp" or something similar.
For those who are already familiar with lambda expressions, I think that the shorter argument name is more readable.
However, it's just a style thingie :)
I wouldn't.
While I agree with the post, I don't think it's easy to read lines with employee2Funday, employeeAndFunday, employee and funday all thrown in across 200 columns of wrapped text.
So I see where you're going. But a variable called "emp" is harder to parse for someone arriving new to the code and doesn't tell you what it is. You might guess it was an "employee" object of some kind. You'd also be worried that other parts of the code would use other abbreviations and searching for words would become a chore.
Naming things is half the battle!
I see your point, and I'm a huge fan of proper names!
My point is that when I got used to lambda syntax, I found that long argument names somehow stand in the way of understanding the lambda's body itself.
We can agree to disagree about this specific issue...
Cheers!
Big fan of LINQ and am using it a lot to achieve a cleaner and easier to understand code. Sometimes, though, for loops are more descriptive, so... Golden Hammer.
However, I often see the comparison between LINQ and Javascript array functions. They are NOT the same. LINQ is optimized to work only on the items you indicate, while Javascript map/filter/reduce just apply a function on each of the items!
With new Javascript functionality we can do that in Javascript as well. Shameful self promotion here: siderite.dev/blog/linq-in-javascri...
I would love if you share an example of when foreach loops are more descriptive.
I didn't compare LINQ to js functions. Map, filter, and reduce are known from the functional programming paradigm and are implemented in a lot of languages.
LINQ is lazy, which makes it more efficient in some cases. I deleted the explanation about this from the post because it is long enough and I want to keep the readers' focus.
One example off the top of my head is finding the minimum and maximum of a list. One can use list.min() and list.max(), but that means iterating twice, even if the code is clearer. One can use a reduce/Aggregate with a seed that is a tuple, but it's difficult to read. In the end, a good old for or foreach loop is the simpler option.
Maybe that's not the best example, but it's what I came up with on short notice.