A simple test of basic Map/Reduce functionality
Test
Executes an aggregation pipeline that sums the amounts grouped by uid. ~numJobs~ (default 200)
output documents are generated.
The number of input documents equals the product of:
numJobs * batches * batchSize * statusRange
The default case is:
200 * 40 * 1000 * 5 = 40,000,000
Results are reported as docs processed per second.
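The aggregation described above can be sketched as follows. This is an illustrative shape only, not the workload's actual code; the collection name (`mr`) and field names are assumptions.

```javascript
// Hypothetical sketch of the pipeline: sum `amount` grouped by `uid`,
// producing one output document per distinct uid (numJobs of them).
const pipeline = [
  { $group: { _id: "$uid", total: { $sum: "$amount" } } }
];

// In the mongo shell this would be run as:
//   db.mr.aggregate(pipeline)
```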
Setup
All variants (standalone, replica, sharded)
Notes
- This test stage evenly distributes documents over 200 UIDs; the aggregation pipeline then calculates the sum of amount per uid.
Owning-team
mongodb/product-query
Members
(inner) batches
The number of batches. The default is 40.
The actual value in use is injected by run_workloads.py, which reads it from a config file; see the hello world example.
(inner) batchSize
The unorderedBulkOp batch size to use when generating the documents. The default is 1000.
The actual value in use is injected by run_workloads.py, which reads it from a config file; see the hello world example.
(inner) db_name
The destination database name.
(inner) numJobs
The range of uids to generate. The default is 200.
The actual value in use is injected by run_workloads.py, which reads it from a config file; see the hello world example.
(inner) poolSize
The thread pool size to use when generating the documents. The default is 32.
The actual value in use is injected by run_workloads.py, which reads it from a config file; see the hello world example.
(inner) statusRange
The range of status values to use when generating documents. It defaults to 5, so values 0 through 4 are generated in that case.
The actual value in use is injected by run_workloads.py, which reads it from a config file; see the hello world example.
Methods
(inner) createJobs(numJobs, func, db_name, batches, batchSize, statusRange) → {array}
Creates an array of jobs to insert the documents for the map/reduce test. In this instance, the job parameters are fixed except for the uid. Each job generates a set of documents for a given uid in the desired range (0 to numJobs - 1).
Parameters:

| Name | Type | Description |
|---|---|---|
| numJobs | integer | the range of uids to generate |
| func | function | the staging data function |
| db_name | string | the mr database name |
| batches | integer | the number of batches to invoke |
| batchSize | integer | the size of a batch |
| statusRange | integer | the range of status values (0 to statusRange - 1) |
Returns:
An array of jobs that can be passed to runJobsInPool. A single job is an array whose first element is the function to call and whose remaining elements are the parameters to that function.
- Type
- array
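The job-array format described above can be sketched as follows. This is a hedged illustration of the documented behavior, not the workload's actual source: the parameter order passed to each job is assumed to match staging_data's signature.

```javascript
// Hypothetical sketch of createJobs: every parameter is fixed except uid,
// which ranges over 0 .. numJobs - 1. Each job is an array whose first
// element is the function to call and whose remaining elements are its
// arguments, the format runJobsInPool expects.
function createJobs(numJobs, func, db_name, batches, batchSize, statusRange) {
  const jobs = [];
  for (let uid = 0; uid < numJobs; uid++) {
    jobs.push([func, db_name, batches, batchSize, uid, statusRange]);
  }
  return jobs;
}

// Example: 3 jobs, each carrying a distinct uid.
const jobs = createJobs(3, function stage() {}, "mr_db", 40, 1000, 5);
```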
(inner) staging_data(db_name, batches, batchSize, uid, statusRange) → {object}
Creates a range of documents for the map/reduce test.
Parameters:

| Name | Type | Description |
|---|---|---|
| db_name | string | The database name. |
| batches | integer | The number of batches to insert. |
| batchSize | integer | The number of documents per batch. Note: if this value is greater than 1000, the bulk operator will transparently create batches of 1000. |
| uid | integer | The value of the uid field to insert. |
| statusRange | integer | The range of status values (zero based). |
Returns:
A JSON document with the following fields:
- ok: if 1, then all the insert batches were successful and nInserted is the expected value
- nInserted: the number of documents inserted
- results: an empty array if ok is 1; otherwise it contains all the batch results
- Type
- object
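The document shape staging_data generates can be sketched without a live MongoDB connection. This is a minimal illustration under stated assumptions: the field names `uid`, `status`, and `amount` and the amount generation are assumed for illustration; the real function inserts via an unordered bulk operation and returns the `{ ok, nInserted, results }` document described above.

```javascript
// Hypothetical sketch of one batch of staging documents for a single uid.
// status cycles through 0 .. statusRange - 1; amount is illustrative.
function generateBatch(batchSize, uid, statusRange) {
  const docs = [];
  for (let i = 0; i < batchSize; i++) {
    docs.push({
      uid: uid,                        // fixed per job
      status: i % statusRange,         // values 0 .. statusRange - 1
      amount: Math.floor(Math.random() * 100)
    });
  }
  return docs;
}

// One job would insert `batches` such batches for its uid.
const batch = generateBatch(1000, 7, 5);
```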