Using array map, filter and reduce in MongoDB aggregation pipeline

If you work with JavaScript, chances are high that you already use array methods like map, filter and reduce.

The simplicity of JavaScript's higher-order functions makes our code more readable and concise, especially when we transform array data.

Let's remember these methods:

const numbers = [2, 8, 15];

const greaterThanFive = (num) => num > 5;
const multiplyBy2 = (num) => num * 2;
const sum = (acc, num) => acc + num;

const filtered = numbers.filter(greaterThanFive);
const mapped = numbers.map(multiplyBy2);
const reduced = numbers.reduce(sum);

console.log(filtered); // [8, 15]
console.log(mapped); // [4, 16, 30]
console.log(reduced); // 25

That's really amazing!

However, in the database world, querying data this simply is unusual, unless that database is MongoDB.

Because MongoDB is a NoSQL database with a JSON-based document model, some JavaScript array methods have similar expression operators in the MongoDB Aggregation Pipeline.

About its JSON nature, the official website says:

MongoDB’s document data model naturally supports JSON and its expressive query language is simple for developers to learn and use.

And that makes all the difference, folks...

Let's take the numbers array used in the JavaScript example and create a new document in a generic collection. To make things easier to follow, I will use MongoDB Playground to test our queries:

[
  {
    "numbers": [
      2,
      8,
      15
    ]
  }
]

Good! Our collection is ready to receive queries now :)

$filter

To start, let's use the $filter aggregation pipeline operator.

Query

db.collection.aggregate([
  {
    $project: {
      _id: 0,
      filtered: {
        $filter: {
          input: "$numbers",
          as: "num",
          cond: {
            $gt: [
              "$$num",
              5
            ]
          }
        }
      }
    }
  }
])
  • Start with the aggregate method to submit the query. That method runs the pipeline through the aggregation framework;
  • The pipeline starts with the $project aggregation pipeline stage. The fields specified inside it can be existing fields from the input documents or newly computed fields. In our case, the filtered field is created and added to the response;
  • The computed value of the filtered field is given by the $filter aggregation pipeline operator;
  • Inside the $filter operator, set input to $numbers. That's the array to iterate over;
  • Set as to num to name the variable that holds each array value tested by the filter condition. You could use any name here, just like the callback parameter in the JavaScript filter method;
  • Then, set the filter condition in cond using the $gt expression, which returns true when the current array value $$num is greater than 5.
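
To connect the steps above back to JavaScript, here is a minimal sketch of what this $project stage computes for each document. It runs in plain JavaScript without a database, and the names docs, projectFiltered and filterResponse are hypothetical, chosen only for this illustration:

```javascript
// In-memory stand-in for the collection (assumption: one document, as above).
const docs = [{ numbers: [2, 8, 15] }];

// Mirrors $project with $filter: keep only the computed "filtered" field.
// The callback parameter "num" plays the role of as: "num" / "$$num".
const projectFiltered = (doc) => ({
  filtered: doc.numbers.filter((num) => num > 5),
});

// A pipeline stage is applied to every document in the collection.
const filterResponse = docs.map(projectFiltered);

console.log(JSON.stringify(filterResponse)); // [{"filtered":[8,15]}]
```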

Response

[
  {
    "filtered": [
      8,
      15
    ]
  }
]

$map

Query

db.collection.aggregate([
  {
    $project: {
      _id: 0,
      mapped: {
        $map: {
          input: "$numbers",
          as: "num",
          in: {
            $multiply: [
              "$$num",
              2
            ]
          }
        }
      }
    }
  }
])

Here, the $multiply expression returns every array value multiplied by 2.

Response

[
  {
    "mapped": [
      4,
      16,
      30
    ]
  }
]

$reduce

Query

db.collection.aggregate([
  {
    $project: {
      _id: 0,
      reduced: {
        $reduce: {
          input: "$numbers",
          initialValue: 0,
          in: {
            $sum: [
              "$$value",
              "$$this"
            ]
          }
        }
      }
    }
  }
])
  • Again, set the $numbers array as the input to iterate over;
  • initialValue is the cumulative value set before in is applied to the first element of the input array. Here it is set to 0;
  • Finally, the in expression gives us two special variables: $$value represents the cumulative value of the expression (acc in the JavaScript example) and $$this refers to the element being processed (num in the JavaScript example). Here, the $sum expression returns the new accumulated value.
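
The JavaScript example at the top called reduce without an initial value, while $reduce always takes initialValue. A closer JavaScript counterpart passes the initial value as the second argument to reduce, as in this small sketch (reducedWithInitial and reducedEmpty are names made up for the illustration):

```javascript
const numbers = [2, 8, 15];
const sum = (acc, num) => acc + num;

// The second argument of reduce plays the role of initialValue: 0.
const reducedWithInitial = numbers.reduce(sum, 0);
console.log(reducedWithInitial); // 25

// With an initial value, an empty array reduces to that value.
// Without one, [].reduce(sum) throws a TypeError.
const reducedEmpty = [].reduce(sum, 0);
console.log(reducedEmpty); // 0
```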

Response

[
  {
    "reduced": 25
  }
]

All in one

In the previous examples, we used each operator in a separate query, but a single query can use all of them at once.

Query

db.collection.aggregate([
  {
    $project: {
      _id: 0,
      filtered: {
        $filter: {
          input: "$numbers",
          as: "num",
          cond: {
            $gt: [
              "$$num",
              5
            ]
          }
        }
      },
      mapped: {
        $map: {
          input: "$numbers",
          as: "num",
          in: {
            $multiply: [
              "$$num",
              2
            ]
          }
        }
      },
      reduced: {
        $reduce: {
          input: "$numbers",
          initialValue: 0,
          in: {
            $sum: [
              "$$value",
              "$$this"
            ]
          }
        }
      }
    }
  }
])

Response

[
  {
    "filtered": [
      8,
      15
    ],
    "mapped": [
      4,
      16,
      30
    ],
    "reduced": 25
  }
]

Going further, if you add more documents to the collection, this same query computes the data for each of them. Let's query a collection with 3 documents now:

Collection

[
  {
    "numbers": [
      2,
      8,
      15
    ]
  },
  {
    "numbers": [
      4,
      8,
      9,
      13
    ]
  },
  {
    "numbers": [
      1,
      3,
      7
    ]
  }
]

Response

[
  {
    "filtered": [
      8,
      15
    ],
    "mapped": [
      4,
      16,
      30
    ],
    "reduced": 25
  },
  {
    "filtered": [
      8,
      9,
      13
    ],
    "mapped": [
      8,
      16,
      18,
      26
    ],
    "reduced": 34
  },
  {
    "filtered": [
      7
    ],
    "mapped": [
      2,
      6,
      14
    ],
    "reduced": 11
  }
]
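
For comparison, the same per-document computation can be sketched in plain JavaScript. This is only an in-memory illustration, not how MongoDB executes the pipeline; collection, project and response are hypothetical names, and the filter keeps values greater than 5 as in the $filter section:

```javascript
// In-memory copy of the three documents above.
const collection = [
  { numbers: [2, 8, 15] },
  { numbers: [4, 8, 9, 13] },
  { numbers: [1, 3, 7] },
];

// One function per document, combining the three array operations.
const project = (doc) => ({
  filtered: doc.numbers.filter((num) => num > 5),
  mapped: doc.numbers.map((num) => num * 2),
  // "value" and "current" stand in for $$value and $$this.
  reduced: doc.numbers.reduce((value, current) => value + current, 0),
});

const response = collection.map(project);

console.log(JSON.stringify(response[1]));
// {"filtered":[8,9,13],"mapped":[8,16,18,26],"reduced":34}
```

The difference is where the work happens: this sketch runs in the application after fetching every document, while the aggregation pipeline computes the same fields on the database server.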

Conclusion

MongoDB is intuitive by nature for JavaScript developers! The aggregation framework does the hard work directly on the database server, using many features we already know, and the data can be delivered ready to use, which normally decreases the workload of the application server.

See also the complete list of Array Expression Operators on the official MongoDB website.
