Module: workloads/word_count

A simple test of basic Map/Reduce functionality.

Test

Execute a Map/Reduce task to sum the incidence of all words across a set of documents, each containing a 10-word sentence.

Results are reported as docs processed / sec.

Setup

All variants (standalone, replica, sharded)

Notes

  • Each document contains a 10-word 'sentence'.
  • A word is generated from a random number between 0 and 999, weighted so that values cluster towards the center of the range. If the WORDS array is defined, then the word at that index is used; otherwise the index value itself is used.
  • Up to 1000 documents will be output (as the words are between 0 and 999).
  • To emphasize JavaScript performance, the job runs with jsMode enabled. As a result of SERVER-5448, jsMode only works for non-sharded workloads, so it is disabled for the sharded variant.
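The center-weighted word generation described above can be sketched as follows. The workload's actual generator is not reproduced here; averaging several uniform draws is one common way to cluster values towards the middle of a range, and is an assumption in this sketch.

```javascript
// Sketch (assumption): average three uniform draws so the result clusters
// around the center of [0, 999] rather than being uniformly distributed.
function generateWordIndex() {
    var sum = 0;
    for (var i = 0; i < 3; i++) {
        sum += Math.random() * 1000;
    }
    return Math.floor(sum / 3); // integer in [0, 999], clustered near 500
}

// If a WORDS array is defined, the word at the generated index is used;
// otherwise the index value itself serves as the word.
function generateWord(WORDS) {
    var idx = generateWordIndex();
    return (WORDS && WORDS.length > idx) ? WORDS[idx] : idx;
}
```

Because at most 1000 distinct indices exist, the reduce phase can emit at most 1000 result documents, matching the note above.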

Owning-team

mongodb/product-query

Members

(inner) batches

The number of batches. The default is 150.

The actual values in use are injected by run_workloads.py, which reads them from the config file; see this hello world example.

(inner) batchSize

The unorderedBulkOp batch size to use when generating the documents. The default is 1000.

The actual values in use are injected by run_workloads.py, which reads them from the config file; see this hello world example.

(inner) numJobs

The number of insertion jobs to schedule. The default is 100.

The actual values in use are injected by run_workloads.py, which reads them from the config file; see this hello world example.

(inner) poolSize

The thread pool size to use when generating the documents. The default is 32.

The actual values in use are injected by run_workloads.py, which reads them from the config file; see this hello world example.

(inner) wordsPerSentence

A constant (10) containing the number of words per sentence.

Methods

(inner) createJobs(staging_data, numJobs, db_name, batches, batchSize, wordsPerSentence, words) → {array}

Create an array of jobs to insert the documents for the word count test.

Parameters:

  staging_data (function): the staging data function.
  numJobs (integer): the number of jobs to create.
  db_name (string): the map/reduce database name.
  batches (integer): the number of batches to invoke.
  batchSize (integer): the size of a batch.
  wordsPerSentence (integer): the number of words per sentence.
  words (array): an array of words to select from. If empty, the index is used instead.

Returns:

Returns an array of jobs that can be passed to runJobsInPool. A single job is an array whose first element is the function to call and whose remaining elements are the parameters to that function.

Type
array
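The job format described above can be illustrated with a minimal sketch. runJobsInPool itself is not reproduced here; runJobs below is a hypothetical sequential stand-in that only shows how a stored job is unpacked and invoked.

```javascript
// A job is [fn, arg1, arg2, ...]: the function to call first, then its
// parameters, matching the format returned by createJobs.
function makeJob(fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    return [fn].concat(args);
}

// Hypothetical stand-in for runJobsInPool: run each job by applying the
// stored arguments to the stored function (the real pool would run the
// jobs across a pool of threads rather than sequentially).
function runJobs(jobs) {
    return jobs.map(function (job) {
        return job[0].apply(null, job.slice(1));
    });
}
```

For example, `runJobs([makeJob(Math.max, 1, 5), makeJob(Math.min, 2, 3)])` applies Math.max to (1, 5) and Math.min to (2, 3), returning [5, 2].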

(inner) staging_data(db_name, batches, batchSize, wordsPerSentence) → {object}

Create a range of documents for the word count test.

Parameters:

  db_name (string): the database name.
  batches (integer): the number of batches to insert.
  batchSize (integer): the number of documents per batch. Note: if this value is greater than 1000, then the bulk operator will transparently create batches of 1000.
  wordsPerSentence (integer): the number of words in a sentence.

Returns:

A JSON document with the following fields:

  ok: 1 if all the insert batches were successful and nInserted matches the expected value; 0 otherwise.
  nInserted: the number of documents inserted.
  results: an empty array if ok is 1; otherwise it contains all the batch results.

Type
object
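The return document above can be illustrated with a short sketch. summarizeBatches is a hypothetical helper, not the workload's actual code; the nInserted field on each per-batch result follows the mongo shell's bulk-write result format.

```javascript
// Hypothetical sketch: aggregate per-batch bulk-insert results into the
// {ok, nInserted, results} document described above.
function summarizeBatches(batchResults, expectedInserted) {
    var nInserted = batchResults.reduce(function (sum, r) {
        return sum + r.nInserted;
    }, 0);
    var ok = (nInserted === expectedInserted) ? 1 : 0;
    return {
        ok: ok,
        nInserted: nInserted,
        results: ok === 1 ? [] : batchResults // keep details only on failure
    };
}
```

With batches = 150 and batchSize = 1000 (the defaults), a fully successful run would report ok: 1 and nInserted: 150000.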