I wanted to find a groupBy: a way to iterate over a list in groups (a list of lists). In that search I came across this article about split, apply, merge for data tables. It looked like what I wanted, but because it was specific to data science it left me confused.
In D these functions are chunkBy, map, and joiner. The pattern of consistency continues: once our list is sorted, we just need to specify what to group on.
import std.algorithm;
auto data = [1,1,2,2];
assert(data.chunkBy!((a, b) => a==b)
    .equal!equal([[1,1],[2,2]]));
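To connect the three names to split, apply, merge, here is a minimal sketch of them chained together. The helper name doubleByGroup and the doubling step are assumptions made up for illustration; any per-group transformation could sit in the middle.

```d
import std.algorithm : chunkBy, joiner, map;
import std.array : array;

// Hypothetical helper, not from the article: split the input into runs of
// equal values, double every element of each run, then flatten the runs.
int[] doubleByGroup(int[] data)
{
    return data
        .chunkBy!((a, b) => a == b)             // split: runs of equal values
        .map!(g => g.map!(x => x * 2).array)    // apply: placeholder transformation
        .joiner                                 // merge: flatten the groups
        .array;
}

void main()
{
    assert(doubleByGroup([1, 1, 2, 2, 3]) == [2, 2, 4, 4, 6]);
}
```

Each group is copied into an array inside map so that it is only walked once; that is a cautious choice for the sketch rather than a requirement.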
Unlike previous lambdas, the one passed to chunkBy takes two arguments, which allows elements to be grouped in interesting ways.
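import std.algorithm;
auto data = [1,1,2,2,3,3];
// Order even numbers before odd ones, ascending within each parity.
auto evenGrouping(int a, int b) {
    if(a%2 == b%2)
        return a < b;
    return a%2 < b%2;
}
// Each chunk is itself a range, so the chunks are compared with equal!equal.
assert(data.sort!evenGrouping
    .chunkBy!((a,b) => a%2==b%2)
    .equal!equal([[2,2],[1,1,3,3]]));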
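Both examples so far hand chunkBy data in which matching elements already sit next to each other. A minimal sketch of what happens otherwise, assuming a hypothetical helper equalRuns that only collects the chunks into arrays so they are easy to compare:

```d
import std.algorithm : chunkBy, map, sort;
import std.array : array;

// Hypothetical helper, not from the article: collect the runs of equal
// adjacent values into an array of arrays.
int[][] equalRuns(int[] data)
{
    return data
        .chunkBy!((a, b) => a == b)
        .map!(g => g.array)
        .array;
}

void main()
{
    auto data = [1, 2, 1, 2];

    // Unsorted: equal values are never adjacent, so every element
    // ends up in a chunk of its own.
    assert(equalRuns(data) == [[1], [2], [1], [2]]);

    // sort() reorders data in place to [1, 1, 2, 2]; now the runs line up.
    data.sort();
    assert(equalRuns(data) == [[1, 1], [2, 2]]);
}
```

chunkBy only ever compares neighbouring elements, which is why the grouping depends on the order of the input.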
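As mentioned, sorting needs to happen first.
import std.algorithm;
import std.range;
auto data = [3,3,1,1,2,2];
// Sorting on parity alone puts evens first; sort is unstable by default,
// so the order inside each parity group is not guaranteed.
assert(data.sort!((a, b) => a%2 < b%2)
    .chunkBy!((a,b) => a%2==b%2)
    // Sort each chunk so the comparison below is deterministic.
    .map!(x => x.array.sort)
    .equal!equal([[2,2],[1,1,3,3]]));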
In this contrived example I decided it best to run it through a compiler. It was a good thing, as I found a difference in behavior. I'll save map for another day.
Two kinds of lambda are supplied to these functions. One takes a single argument and is referred to as a unary predicate; the other takes two and is referred to as a binary predicate.
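When a unary predicate is supplied to chunkBy, each element of the result is a tuple of the value the predicate returned and the chunk of elements that share it. This is an interesting optimization, but this overload should live with group, which already returns tuples (each element paired with the length of its run).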
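A minimal sketch of the unary form next to group, to make the tuple shapes concrete. The helper name parityChunks is an assumption for illustration; it copies each chunk into an array so the result can be compared directly.

```d
import std.algorithm : chunkBy, equal, group;
import std.array : array;
import std.typecons : Tuple, tuple;

// Hypothetical helper, not from the article: chunk on parity with a unary
// predicate and collect each (key, chunk) pair into an array of tuples.
Tuple!(int, int[])[] parityChunks(int[] data)
{
    Tuple!(int, int[])[] result;
    foreach (entry; data.chunkBy!(a => a % 2))
        result ~= tuple(entry[0], entry[1].array); // entry[0]: key, entry[1]: chunk
    return result;
}

void main()
{
    auto data = [1, 1, 2, 2, 3];

    // The key 1 appears twice because the input is not sorted by parity.
    assert(parityChunks(data) ==
        [tuple(1, [1, 1]), tuple(0, [2, 2]), tuple(1, [3])]);

    // group, for comparison, yields (element, run length) tuples.
    assert(data.group.equal([tuple(1, 2u), tuple(2, 2u), tuple(3, 1u)]));
}
```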