As I review many Kotlin PRs, I've seen people create collections (Lists, Sets) in many different ways; and while they all work, I think each has its own pros and cons depending on what you are doing. So I decided to put together this article to better explain my very philosophical views on this matter.
What do I mean by “creating collections”? In this article, I am specifically referring to creating a list/set given its size and a formula for its elements. So not things like listOf, setOfNotNull, or buildList, which are all great for their own use cases.
The for loop
The most basic way that comes to mind for solving this problem is using a simple for loop.
As an overarching example in this article, say we wanted to create a list of n random Users to use in a test case. One could do:
val users = mutableListOf<User>()
for (index in 0..n) {
    users.add(User("John"))
}
The problems (admittedly, “nits”) I see with this approach are as follows:
Using a mutable list
While not necessarily bad, this is something we want to avoid when the list doesn't need to be mutable. This is similar to using val (instead of var) when you are not modifying a reference: it reduces the cognitive load necessary to understand your code.
While we could hide this mutability as an implementation detail inside a buildUserList function that returns a List, and that would be perfectly valid, all the alternatives I will propose bypass this problem entirely.
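A minimal sketch of that encapsulation, assuming a hypothetical User data class with a single name property (the article does not show the real class). The loop shape is only illustrative; the point is that callers never see the mutable list:

```kotlin
// Hypothetical model, assumed for illustration only.
data class User(val name: String)

// The mutable list never leaves this function: callers only get a read-only List.
fun buildUserList(n: Int): List<User> {
    val users = mutableListOf<User>()
    while (users.size < n) {
        users.add(User("John"))
    }
    return users
}

fun main() {
    val users: List<User> = buildUserList(3)
    println(users.size) // prints 3
}
```

The mutability is still there, just contained, which is why the alternatives that avoid it altogether tend to read better.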
Non-functional style
Using for loops, while sometimes clearer (or unavoidable), often means we are writing "implementation details" rather than expressing what we want to happen, especially if you are using manual indexing.
More often than not, a functional style puts the focus on the "intent" rather than on the "how".
For example, compare a simple map operation done with the for loop:
val results = mutableListOf<String>()
for (element in original) {
    results.add(transform(element))
}
with
val results = original.map { transform(it) }
Again, we are reducing, bit by bit, the amount of information you need to parse in order to understand the code.
We don't care about the index
Sometimes you might care about the index, sometimes you might not. We will explore solutions below for each. If you don't, the for loop forces you to declare a new variable (the infamous i) and, again, adds a meaningless name to the context you have to keep in mind.
In all of the alternative proposals, you will be able to optionally access the index via it, while not having to declare it explicitly if you don't care about it.
We don't care about the range (specific start and end)
We just set out to create n users. Not only do we not care about the index; we don't care about the range either. We just want the size to be n; the start and end values are irrelevant.
In fact, this can lead to quite the confusion. While the other points can be absolutely trivial in isolation and only matter when taken as a philosophy for the entire project, this one can actually lead to immediate bugs, due to the confusion between the inclusive and exclusive range operators.
You can argue that they are clear and easy to distinguish and, in fact, IntelliJ IDEA will helpfully annotate them, leaving no room for confusion.
But it is easy to miss in a quick code review. In fact, as an example of that, did you notice that I intentionally misused them in this example? 0..n is the inclusive range, so the loop above runs n + 1 times. It should be either 1..n or 0 until n. This is an easy mistake to overlook in a code review, which doesn't mean we shouldn't use ranges, but definitely means there is no reason to introduce this complexity if we don't care about it.
Note: I am aware that, in the time it took me to write this article, the gods of Kotlin went ahead and fixed the non-intuitive ranges issue, but I believe my argument still stands that there is no reason to introduce a composite concept (range) if we only care about the size.
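To make the off-by-one concrete, here is a small sketch counting the elements of each range form. The function name rangeSizes is made up for this example, and the last line assumes the fix mentioned in the note is the ..< operator, which needs Kotlin 1.9 or later:

```kotlin
// Number of elements produced by each way of writing "a range for n items".
fun rangeSizes(n: Int): List<Int> = listOf(
    (0..n).count(),      // closed range: includes both 0 and n
    (1..n).count(),      // closed range starting at 1
    (0 until n).count(), // half-open range: excludes n
    (0..<n).count(),     // same as until (Kotlin 1.9+)
)

fun main() {
    println(rangeSizes(3)) // prints [4, 3, 3, 3]
}
```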
Alternatives
While the for loop is far from terrible, and the problems I pointed out are very small nits, Kotlin has such a range (pun definitely intended) of other options that one of them may be much more suitable, depending on the situation.
repeat function
The first (and by far the simplest) alternative to consider is the repeat function.
Use it if you care about neither the resulting list nor the range:
repeat(n) { saveUser("John $it") }
List constructor
Unknown to many people, List actually has a constructor-like factory function, List(size, init)! No need to map over a range if you don't care about the range but do care about the generated list:
val users = List(n) { User("John $it") }
This is simple and clear and allows you to specify a size directly (which is what you care about here).
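A self-contained sketch of the same idea, assuming a hypothetical User data class and a helper named createUsers (neither comes from a real codebase). The lambda receives the index, which runs from 0 up to size - 1:

```kotlin
// Hypothetical model, assumed for illustration only.
data class User(val name: String)

// List(size) { index -> ... } returns a read-only List; no mutable list or range is declared.
fun createUsers(n: Int): List<User> = List(n) { index -> User("John $index") }

fun main() {
    println(createUsers(3).map { it.name }) // prints [John 0, John 1, John 2]
}
```

If the index is irrelevant, the lambda can simply ignore it, as in List(n) { User("John") }.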
map over a range
If you care about the resulting list and also care about the range, this is the place where you should really use one. The cool trick is that you can map the range directly into your result list:
val users = (a..b).map { User("John $it") }
Here, a..b is an example, but you can use until, downTo, etc. Build the range you need.
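As a quick illustration of a few range shapes fed into map, here is a sketch with a made-up helper called labels. It accepts IntProgression, the common supertype of the values these range-building expressions produce:

```kotlin
// Maps every value of the given range or progression to a label.
fun labels(range: IntProgression): List<String> = range.map { "John $it" }

fun main() {
    println(labels(1..3))        // prints [John 1, John 2, John 3]
    println(labels(0 until 3))   // prints [John 0, John 1, John 2]
    println(labels(3 downTo 1))  // prints [John 3, John 2, John 1]
    println(labels(0..6 step 3)) // prints [John 0, John 3, John 6]
}
```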
forEach over a range
This is the final thing to call out: if you do not care about the generated collection, do not use map. I see this so often in PRs, and it is semantically wrong: map builds and returns a list that is then thrown away.
Kotlin already offers a perfect replacement for map when you don't care about the result: forEach.
(a..b).forEach { saveUser("John $it") }
Or, if you are using the index variable anyway, you can go even simpler and just use the plain old for loop since, in this last scenario, it doesn't have any of the disadvantages mentioned in the previous section:
for (index in a..b) {
    saveUser("John $index")
}
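To see why map is the wrong tool for side effects, consider this sketch. Here saveUser is a stand-in that records names in an in-memory list (an assumption for the example; the real function is not shown in this article), and the two wrapper functions are made up as well:

```kotlin
// Stand-in for a real persistence call: records the names it receives.
val saved = mutableListOf<String>()
fun saveUser(name: String) {
    saved.add(name)
}

// map allocates and returns a List<Unit> that nobody needs.
fun saveWithMap(range: IntRange): List<Unit> = range.map { saveUser("John $it") }

// forEach performs the same side effects and returns Unit.
fun saveWithForEach(range: IntRange) = range.forEach { saveUser("John $it") }

fun main() {
    val leftover = saveWithMap(1..2)
    saveWithForEach(3..4)
    println(leftover.size) // prints 2
    println(saved)         // prints [John 1, John 2, John 3, John 4]
}
```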
Conclusion
I hope this article illustrated my point about why each slightly different pattern is better for slightly different circumstances, and that you were able to learn the rationale behind each, so you can apply such reasoning on your own code (rather than memorize instructions).
I know all of this might seem absolutely trivial. On every isolated example, reading the code is always super easy. However, when reading an entire function, every little bit contributes to the difference between an immediate understanding and a minute of thinking. When reading a code base, well, you see my point. These kinds of things littering a codebase add up to cause substantial cost.
I strongly believe that, on huge projects, maintained by multiple people at the same time, keeping the code clean, even if by addressing such seemingly tiny nits, is paramount for velocity, maintainability, and developer happiness over the years.
And finally, I encourage my fellow PR reviewers to refer to and share this article whenever you want to quickly express all these points but don't have the time to write an article-length comment on such philosophical matters.