The power of reduce

reduce (aka fold aka inject aka lfold) is a very powerful, flexible, and at the same time an unintuitive and controversial function. In this post I'll talk about what makes it both so flexible and unintuitive, and I'll present how other iteration functions like map or filter can be implemented on top of reduce. I'll use JS definition of reduce as a reference and I'll show what other languages do better in implementing this function.

Basics of reduce

reduce is a function that works on collections. It usually accepts 2 arguments: a reducer function and an optional initial value. reduce iterates over the collection, calling the reducer function for every element and passing the output of reducer to the next iteration (with one exception mentioned later). A simple example is calculating a product of all elements of the array:

// returns 2 * 4 * 6 * 8 = 384
[2,4,6,8].reduce((accumulator, el) => accumulator * el, 1);

The reducer function can accept up to 4 arguments:

  • accumulator - the output of previous iteration (in the first iteration it takes the default value, or if not provided, the first element of the array)
  • element - the current element of the array
  • index - the index of the current element of the array
  • originalArray - the whole array on which reduce is being called.

In the following example, the execution will look like this:

1st iteration: acc = 1 * 2 (output: 2)
2nd iteration: acc = 2 * 4 (output: 8)
3rd iteration: acc = 8 * 6 (output: 48)
4rd iteration: acc = 48 * 8 (output: 384)

If you want to understand it better and see more advanced examples, check the tutorial I recorded:

Use cases

reduce has traditionally been a part of functional languages, where it acts as kind of equivalent of for loops. It became more common thanks to a MapReduce framework which allows to easily parallelize operations that aggregate some data. MapReduce splits the work to be done in 2 parts - map part performs some kind of operation on each piece of data (this part can be done in parallel) and reduce then gathers all the output from map and combines the filan result (this part is done sequentially).

Let's say we want to count number of occurences of each word in a piece of text. We can split the text into sentences, and fo each sentence we can calculate number of occurences of each word in parallel. Then we end up with multiple dictionaries, let's say:

{ "dog": 2, "is": 2, "animal": 1, "and": 1, "mammal": 1},
{ "fish": 1, "is": 1, "animal": 1, "too": 1}

Then reduce function can merge these 2 dictionaries and calculate final output:

{ "dog": 2, "is": 3, "animal": 2, "and": 1, "mammal": 1, "fish": 1, "too": 1 }

Interestingly, reduce does not need map to achieve the result above - it's only needed in order to have the first part run in parallel.

Another common use case is to calculate some number that's based on a list of numbers. A good example is sum of squares that has a number of uses in mathematics like in linear regression.

I personally often use reduce in order to transform one dictionary into another (e.g. I might need to normalize keys, or update values). This is not possible in JavaScript though - I explain it a bit later in the article.

The controversy

For a number of reasons, reduce is a controversial function among programmers. In JS it gets quite a bad rep, like in the widely retweeted example below:

It's not the only example, though. In Python, reduce was removed from the standard library and moved to functools library. It's still shipped as part of the Python language distribution, but in order to use it, you need to explicitly import it.

There are a number of reasons why reduce gets a bad reputation, the main of them being: for every use of reduce there's at least one more intuitive and more readable alternative.

For loops and other options

First argument for not using reduce is that many languages (mainly imperative/OO) there are always more idiomatic and intuitive ways to write code than useing reduce. The main solution is to use for loop, forEach function, or some kind of equivalent. Let's take the example from the previous section:

[2,4,6,8].reduce((accumulator, el) => accumulator * el, 1);

Another way to write is

const product = 1;
for (const el in [2,4,6,8]) {
    product *= el;
}

For programmers coming from other imperative languages, the latter version certainly feels more familiar. Is it clearly better though? I'm not so sure.

Readability

The second argument is quite similar, but focuses on reduce function itself - a lot of people say the function is hard to read. I partially agree with this. Most of the time I have little problem understanding what is the goal of reduce just by having a quick look, but because it can return anything, it's not as meaningful and intuitive as map or filter. What's more, if you want to use reduce in multiple programming languages, you'll have to remember that each of them has a different number and order of arguments!

There's one more thing that adds to the problem - the initial value, which is an optional parameter in reduce and which changes a lot about how the function works. If you have a collection of 10 elements, you can expect that it'll trigger 10 iterations, however if you don't pass the initial value to the function, there'll be only 9 iterations. That's because the first element of the collection will become the initial value. In a lot of cases, like when calculating a sum or a product, it doesn't matter, but when you want to calculate sum of squares, that missing initial value will break the function!

function sumSquares(ary) {
    return ary.reduce((acc, el) => acc + el * el);
}

sumSquares([1,2,3,4]); // => 30, works!
sumSquares([4,3,2,1]); // => 18, broken!

Limitations

The last reason applies to some specific langauges, for example JavaScript - reduce was added to JS as a half-baked thing, working only on arrays. The same function in other languages can be used on other types of collections. In Ruby as long as a class includes the Enumerable module, it gets reduce function. In Python, where reduce is used very rarely, you still can use it with dictionaries. I believe reduce would be way more useful in JavaScript if only it was possible to call it on other types of collections.

Write everything in reduce!

While I agree with the arguments I presented above, I still believe that understanding reduce can be very helpful, especially if you ever consider learning functional languages. It's really a powerful function. Actually, reduce is so flexible, that a lot of collection functions can be rewritten using reduce. Let's try it out!

Warning: don't try to do it in your apps. The original implementations of the functions below are certainly better (and probably much, much faster).

forEach

First, something easy: forEach is a reduce that calls a passed callback and does not return any value.

function foreach(array, cb) {
    array.reduce((_acc, el) => cb(el));
}

map

map is reduce where we start with empty array and in every iteration we add result of the callback function to the accumulator.

function map(array, cb) {
    return array.reduce((acc, el) => [...acc, cb(el)], []);
}

A slightly more readable (and faster, I guess) version, with 2 statements, would look like this:

function map(array, cb) {
    return array.reduce((acc, el) => {
        acc.push(cb(el));
        return acc;
    }
}

flatMap

This one's quite complicated! flatMap behaves similarly to map except that it always returns a flat (1-dimensional) array. If the provided callback returns an array, map returns an array of arrays, while flatMap, as the name suggests, flattens the output. It could be implemented this way:

function flatMap(array, cb) {
    return array.reduce((acc, el) => [...acc, ...cb(el)], []);
}

However, if the cb does not return an array (we can't guarantee it does), we need to add something more. There are a few different ways to deal with it, the most trivial is to just flatten the outer array. It's not a pretty solution (and oh it is SO slow), but it will do.

function flatMap(array, cb) {
    return array.reduce((acc, el) => [...acc, ...cb(el)].flatten(), []);
}

filter

Next, filter returns elemets of orignal array, but only those that meet the provided expectation (read: where cb(el) returns truthy value). First, let me implement it using 2 statements to make it easier to read.

function filter(array, cb) {
    return array.reduce((acc, el) => {
        if (cb(el)) acc.push(el);
        return acc;
    }, []);
 }

Now the same can be rewritten with a single statement, though it's less intuitive.

function filter(array, cb) {
    return array.reduce((acc, el) => {
        return cb(el) ? [...acc, el] : acc;
    }, []);
 }

some

some returns true if callback function returns true (or any truthy value) for any of the elements in the array. It can be written in pseudocode as cb(array[0]) || cb(array[1]) || ... || cb(array[n-1]). In order to implement it with reduce I'll be carrying on the boolean value over each iteration.

function some(array, cb) {
    return array.reduce((acc, el) => acc || Boolean(cb(el)), false);
}

every

every is a sibling function to some and returns true if the callback function returns true for every element of the array. It can be written as fun(array[0]) && fun(array[1]) && ... && fun(array[n-1]). Similarly I'll carry a boolean value as acc.

function every(array, cb) {
    return array.reduce((acc, el) => acc && Boolean(cb(el)), true);
}

includes

includes could actually be implemented using some. For the sake of consistency I'll just keep using the reduce directly though. In this case we don't have a callback to use, instead we need to check if any element is equal to provided value.

function includes(array, value) {
    return array.reduce((acc, el) => acc && (el === value), false);
}

As a side note, the 3 functions above are examples where using reduce introduces a performance penalty (they'll iterate over the whole array even if they could stop earlier). One more reason not to use this code in any serious application.

find

find returns the first element that meets a criteria specified by the callback function. In terms of implementation, it's similar to some with a twist. Just like with some we're going to pass a certain falsy value and as soon as it becomes truthy, we're going to pass it till the end of the iteration process. The twist is that the value we need to pass is not the output of the callback function, but the element on which the function is called.

function find(array, cb) {
    return array.reduce((acc, el) => {
        if (acc) return acc;
        if (cb(el)) return el;
    }, null);
}

Earlier in this post I said I'd try to write the reduce with only a single expression. It's possible in this case as well, though just as before it's harder to understand:

function find(array, cb) {
    return array.reduce((acc, el) => acc || (cb(el) && el)), null);
}

The cb(el) && el part will return false if the element does not meet provided requirement, or it will return the value of el if it does. Then the first part, acc || ...will either return acc (output of previous iteration), unless it's a falsy value, in which case it'll return the 2nd part explained above.

findIndex

findIndex initially seemed more challenging to implement, because somehow I need to keep track of the index together with the element. Then I remembered that the reducer function takes 4 arguments, and not only 2! The 3rd argument is the current index, and 4th one is the array on which the reduce is called (I'm still thinking how to use it in practice). So findIndex will be almost identical to find.

function findIndex(array, cb) {
    array.reduce((acc, el, i) => {
        if (acc) return acc;
        if (cb(el)) return i;
    }, null);
}

lastIndexOf

lastIndexOf is almost the same, except that first we check whether the current element meets the expectation, and only if it doesn't, then we return the last on that did. In short: we swap the order.

function lastIndexOf(array, cb) {
    array.reduce((acc, el, i) => {
        if (cb(el)) return i;
        if (acc) return acc;
    }, null);
}

Similarly to find, the findIndex and lastIndexOf functions (why isn't it called findLastIndex by the way? and why there's no findLast function?) could be rewritten using a single expression, the only difference is the order and the logical operators used.

Can reduce do everything?

Looking at the list of array functions in JS and I was wondering if there's anything that can't be implemented with reduce. Initially I had 3 ideas:

  1. Functions that modify the original array - reduce comes from languages with immutable data structures, so modifying original array (with functions like copyWithin) was a long shot, but because the reducer accepts original array as a parameter, it is possible (I am 99.99% sure it's always bad idea, though - don't do it at home!)
  2. Sorting - ok, when that idea came to my mind I thought it was really stupid, but maybe it's possible to implement some kind of bubble sort with reduce? Well, it seems I was not the only person who wondered about it!
  3. Finally, I found something - Array class has methods like keys and entries, and those functions return iterators. I tried to implement them with reduce, but I failed miserably, so I assume it can't be done (correct me if I'm wrong!).

What's the point?

This was a fun exercise, but my point here is that each function has its place. reduce gets a lot of bad rep in JS and for good reasons. It's limiting yet overcomplicated and I still don't remember the order of parameters in reducer, although I used it a number of times. Still, it's good to understand it, so that you can use it from time to time.

Oh, and of course - check out other languages where reduce work also for dictionaries, sets, or other collection types. Languages like Elixir, Haskell or Ruby make reduce more powerful and intuitive at the same time!

26