Measure the performance of the aggregation $merge stage against the BestBuy Developer API data, specifically stressing the exchange logic. On older branches (version 4.4 and earlier), the results are also compared against the mapReduce command; on newer branches (>= 4.5), only aggregation pipelines are exercised. Each operation computes the same thing: a histogram of the words in the 'name' field of each game or software product in the database, producing documents like {_id: "word", count: 32}. The results are written to a collection using either $merge or the 'out' option to mapReduce. To stress the exchange optimization in a sharded deployment, that collection is expected to be set up as a sharded collection (unless shard_collections is false), though the test still works in unsharded deployments.
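As a rough illustration of the computation described above, the following Python sketch builds one possible word-count pipeline ending in $merge, alongside a plain-Python reference for the expected output shape. The function names, the choice to split on single spaces, and the whenMatched/whenNotMatched settings are assumptions made for illustration and are not taken from the workload itself. The filter that restricts the input to game or software products is omitted because the source does not state which field it uses.

```python
from collections import Counter


def build_word_count_pipeline(target_collection="target_range_id"):
    """Return a pipeline (list of stage documents) that writes a word histogram."""
    return [
        # Split each product name into an array of words on single spaces.
        {"$project": {"_id": 0, "word": {"$split": ["$name", " "]}}},
        # Emit one document per word in the array.
        {"$unwind": "$word"},
        # Count how many times each word occurs across all names.
        {"$group": {"_id": "$word", "count": {"$sum": 1}}},
        # Write the histogram into the target collection (settings assumed).
        {
            "$merge": {
                "into": target_collection,
                "whenMatched": "replace",
                "whenNotMatched": "insert",
            }
        },
    ]


def word_histogram(names):
    """Plain-Python reference: the documents the pipeline is meant to produce."""
    counts = Counter()
    for name in names:
        counts.update(name.split(" "))
    return [{"_id": word, "count": count} for word, count in sorted(counts.items())]
```

With a driver such as pymongo, the returned list could be passed to a collection's aggregate method; the reference function is only useful for checking the output shape on a small sample.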
Pre-requisite
The dataset must be installed on the target cluster before running the test. The data can be downloaded from here and installed using mongorestore (mongorestore --gzip --archive=bestbuyproducts.bson.gz).
In a sharded cluster (if shard_collections is not false), the target collection ('target_range_id') is expected to be sharded by the key {_id: 1} and have chunks distributed amongst the shards.
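The sharding prerequisite can be sketched as the two admin commands below, built as plain Python dictionaries. The database name "bestbuy" and both function names are hypothetical; the source names only the collection and the shard key. Splitting and distributing chunks across shards is not covered here.

```python
def build_sharding_commands(database="bestbuy", collection="target_range_id"):
    """Return the admin commands that shard the target collection by {_id: 1}."""
    namespace = f"{database}.{collection}"
    return [
        # Enable sharding on the database (database name is an assumption).
        {"enableSharding": database},
        # Range-shard the target collection on _id.
        {"shardCollection": namespace, "key": {"_id": 1}},
    ]


def apply_sharding_commands(client, commands):
    """Run each command against the admin database of a pymongo-style client."""
    for command in commands:
        client.admin.command(command)
```

Keeping command construction separate from execution makes it possible to inspect the commands without a running cluster.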
Setup
None
Test
Each test runs a loop for 2 minutes that repeatedly executes a query computing the word count in the names of products. The computation is performed in up to three ways: once with $merge with the exchange optimization enabled, once with $merge with the exchange optimization disabled, and, on older branches, once with the mapReduce command. Each run reports its throughput in documents processed per second.
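The timed loop and throughput figure can be pictured with a minimal Python sketch like the one below. It is not the harness the test actually uses. The name measure_throughput, the run_once callback (assumed to return the number of documents processed by one execution), and the injectable clock are all assumptions for illustration.

```python
import time


def measure_throughput(run_once, duration_seconds=120.0, clock=time.monotonic):
    """Call run_once repeatedly for duration_seconds and report documents per second."""
    start = clock()
    documents = 0
    iterations = 0
    # Keep starting new executions until the time budget is used up.
    while clock() - start < duration_seconds:
        documents += run_once()
        iterations += 1
    elapsed = clock() - start
    return {
        "iterations": iterations,
        "documents": documents,
        "docs_per_second": documents / elapsed if elapsed > 0 else 0.0,
    }
```

Because the elapsed time is read after the last execution finishes, a run that overshoots the 2-minute budget is still divided by its real duration rather than by the nominal one.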
Owning-team
mongodb/product-query
- Source: