A simple test of basic Map/Reduce functionality.
Test
Execute a Map/Reduce task that sums the incidence of each word across a set of documents, each containing a 10-word sentence.
Results are reported as docs processed / sec.
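The word-count Map/Reduce the test runs server-side can be sketched in plain JavaScript. The function names (`mapWords`, `reduceCounts`, `runWordCount`) are illustrative, not the workload's actual identifiers, and the in-process driver below only emulates what the server does:

```javascript
// Map phase: emit each word in the document's sentence with a count of 1.
function mapWords(doc, emit) {
    doc.sentence.split(" ").forEach(function (word) {
        emit(word, 1);
    });
}

// Reduce phase: sum the per-word counts emitted by map.
function reduceCounts(key, values) {
    return values.reduce(function (a, b) { return a + b; }, 0);
}

// Drive the pipeline in-process over a toy document set.
function runWordCount(docs) {
    var grouped = {};
    docs.forEach(function (doc) {
        mapWords(doc, function (key, value) {
            (grouped[key] = grouped[key] || []).push(value);
        });
    });
    var out = {};
    Object.keys(grouped).forEach(function (key) {
        out[key] = reduceCounts(key, grouped[key]);
    });
    return out;
}
```

With up to 1,000 distinct words, the server-side output collection holds at most 1,000 result documents, one per word.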
Setup
All variants (standalone, replica, sharded)
Notes
- Each document contains a 10-word 'sentence'.
- A word is generated from a random number between 0 and 999, weighted so that the incidence of any given number clusters towards the center of the range. If the WORDS array is defined, the word at that index is used; otherwise the index value itself is used.
- Up to 1,000 documents will be output (as the words are between 0 and 999).
- To emphasize JavaScript performance, the job runs in jsMode. As a result of SERVER-5448, jsMode only works for non-sharded workloads, so it is disabled for the sharded variant.
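One way to get the clustering described above is to average several uniform draws, which biases results towards the middle of the 0-999 range. This is a hypothetical sketch; the workload's exact distribution may differ:

```javascript
// Hypothetical word-index generator: averaging several uniform draws
// clusters values around 500 (the workload's actual distribution may differ).
function generateWordIndex() {
    var draws = 3; // more draws => stronger clustering around the center
    var sum = 0;
    for (var i = 0; i < draws; i++) {
        sum += Math.random();
    }
    return Math.floor((sum / draws) * 1000); // always in 0..999
}

// If a WORDS array is defined, the index selects a word from it;
// otherwise the index value itself is used as the word.
function indexToWord(index, words) {
    return (words && words.length > 0) ? words[index] : index;
}
```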
Owning-team
mongodb/product-query
Members
(inner) batches
The number of batches. The default is 150.
The actual values in use are injected by run_workloads.py, which reads them from the config file (see the hello world example).
(inner) batchSize
The unorderedBulkOp batch size to use when generating the documents. The default is 1000.
The actual values in use are injected by run_workloads.py, which reads them from the config file (see the hello world example).
(inner) numJobs
The number of insertion jobs to schedule. The default is 100.
The actual values in use are injected by run_workloads.py, which reads them from the config file (see the hello world example).
(inner) poolSize
The thread pool size to use when generating the documents. The default is 32.
The actual values in use are injected by run_workloads.py, which reads them from the config file (see the hello world example).
(inner) wordsPerSentence
A constant (10) containing the number of words per sentence.
Methods
(inner) createJobs(staging_data, numJobs, db_name, batches, batchSize, wordsPerSentence, words) → {array}
Create an array of jobs to insert the documents for the word count test.
Parameters:
Name | Type | Description
---|---|---
staging_data | function | the staging data function
numJobs | integer | the number of jobs to create
db_name | string | the mr database name
batches | integer | the number of batches to invoke
batchSize | integer | the size of a batch
wordsPerSentence | integer | the number of words per sentence
words | array | an array of words to select from; if empty, the index is used
Returns:
Returns an array of jobs that can be passed to runJobsInPool. A single job is an array whose first element is the function to call and whose remaining elements are the parameters to that function.
- Type
- array
(inner) staging_data(db_name, batches, batchSize, wordsPerSentence) → {object}
Create a range of documents for the word count test.
Parameters:
Name | Type | Description
---|---|---
db_name | string | The database name.
batches | integer | The number of batches to insert.
batchSize | integer | The number of documents per batch. Note: if this value is greater than 1000, the bulk operator will transparently create batches of 1000.
wordsPerSentence | integer | The number of words in a sentence.
Returns:
A JSON document with the following fields: ok: 1 if all the insert batches were successful and nInserted equals the expected value; nInserted: the number of documents inserted; results: an empty array if ok is 1, otherwise the individual batch results.
- Type
- object
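Assembling that return document from per-batch bulk-insert results can be sketched as follows. The batch-result shape (`ok`, `nInserted` per batch) is assumed from the description above, and `summarizeBatches` is a hypothetical helper, not the workload's actual function:

```javascript
// Build the staging_data-style summary document from per-batch results
// (batch-result shape assumed: { ok: 0|1, nInserted: <count> }).
function summarizeBatches(batchResults, expectedInserted) {
    var nInserted = batchResults.reduce(function (sum, r) {
        return sum + r.nInserted;
    }, 0);
    var ok = (batchResults.every(function (r) { return r.ok === 1; }) &&
              nInserted === expectedInserted) ? 1 : 0;
    return {
        ok: ok,
        nInserted: nInserted,
        results: ok === 1 ? [] : batchResults // keep details only on failure
    };
}
```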