MongoDB Cursors with PHP
London, UK
Tuesday, May 22nd 2012, 09:15 BST
Recently I was asked to improve the MongoCursor::batchSize documentation. This began an in-depth investigation into how the PHP driver for MongoDB pulls queried data from the MongoDB server. Here are my findings.
A MongoCursor is created as soon as you run the find() method on a MongoCollection object, like in:
$m = new Mongo();
$collection = $m->demoDb->demoCollection;
$cursor = $collection->find();
Just calling find() only creates a cursor object; it does not immediately send the query to the server for processing. That happens only when you start reading from the cursor for the first time. Because of this, you can call additional methods on the newly created cursor object that still influence how the query is run on the server. One such example is the sort() method, which sorts the result according to its arguments (in this example, by name):
$cursor->sort( array( 'name' => 1 ) );
$result = $cursor->getNext();
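To make the deferred execution concrete, here is a minimal sketch in plain PHP. LazyCursorSketch is a hypothetical stand-in written for this illustration, not the driver's MongoCursor, and it assumes the documents live in a local array instead of on a server. It only records the sort field when sort() is called and runs the pretend query on the first getNext():

```php
<?php
// Hypothetical model of a lazily executed cursor; NOT the driver's code.
class LazyCursorSketch
{
    private $docs;             // stands in for the documents on the server
    private $sortField = null; // recorded by sort(), applied on first read
    private $results = null;   // null until the pretend query has run
    public $queriesSent = 0;   // how many times the pretend query was "sent"

    public function __construct(array $docs)
    {
        $this->docs = $docs;
    }

    // Only records the field to sort on; nothing is executed here.
    // The sort direction is ignored in this model (always ascending).
    public function sort(array $fields)
    {
        $this->sortField = key($fields);
        return $this;
    }

    // The first call runs the pretend query; later calls read the local copy.
    public function getNext()
    {
        if ($this->results === null) {
            $this->results = $this->docs;
            if ($this->sortField !== null) {
                $field = $this->sortField;
                usort($this->results, function ($a, $b) use ($field) {
                    return strcmp($a[$field], $b[$field]);
                });
            }
            $this->queriesSent++;
        }
        return array_shift($this->results);
    }
}

$sketch = new LazyCursorSketch(array(
    array('name' => 'Derick'),
    array('name' => 'Alice'),
));
$sketch->sort(array('name' => 1));
$sentBeforeRead = $sketch->queriesSent; // nothing sent yet
$first = $sketch->getNext();            // triggers the pretend query
$sentAfterRead = $sketch->queriesSent;
```

In this model the counter only changes on the first read, which mirrors why sort() can still influence the query after find() has returned.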
When you then call getNext() on $cursor, the driver sends the query to the server and requests a default number of documents in the first batch. The default batch size is 101. Looking at what gets sent on the wire for our simple query for all documents, sorted by name, the Number to Return is 0, which means to use the default. So even though we only want to fetch one result (getNext() asks the cursor for the next document only), the server returns 101 documents.
The driver stores all 101 documents locally, and during the next 100 calls to getNext() it simply returns documents from local memory. Once getNext() gets called for the 102nd time, the driver goes back to the server to request more documents:
// skip the other 100 docs
for ($i = 0; $i < 100; $i++) { $cursor->getNext(); }
// request document 102:
$result = $cursor->getNext();
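The counting above can be checked with a small model in plain PHP. BatchingCursorSketch is a hypothetical class made up for this illustration, not the driver's implementation: it assumes a pretend server holding 500 documents, a first batch of 101, and an arbitrary placeholder of 1000 documents for later batches. It only counts documents and round trips:

```php
<?php
// Hypothetical model of client-side batching; NOT the driver's actual code.
class BatchingCursorSketch
{
    private $remaining;     // documents the pretend server has not sent yet
    private $firstBatch;    // size of the first reply
    private $laterBatch;    // size of later replies (placeholder assumption)
    private $buffered = 0;  // documents currently held in local memory
    public $roundTrips = 0; // number of requests made to the pretend server

    public function __construct($totalDocs, $firstBatch = 101, $laterBatch = 1000)
    {
        $this->remaining = $totalDocs;
        $this->firstBatch = $firstBatch;
        $this->laterBatch = $laterBatch;
    }

    // Returns true if a document was delivered, false once exhausted.
    public function getNext()
    {
        if ($this->buffered === 0) {
            if ($this->remaining === 0) {
                return false;
            }
            // An empty local buffer means a round trip: the query the
            // first time, a "get more" every time after that.
            $size = ($this->roundTrips === 0) ? $this->firstBatch : $this->laterBatch;
            $fetched = min($size, $this->remaining);
            $this->remaining -= $fetched;
            $this->buffered = $fetched;
            $this->roundTrips++;
        }
        $this->buffered--;
        return true;
    }
}

$model = new BatchingCursorSketch(500);
$model->getNext();                                   // call 1: sends the query
$tripsAfterFirst = $model->roundTrips;
for ($i = 0; $i < 100; $i++) { $model->getNext(); }  // calls 2 to 101: local only
$tripsAfter101 = $model->roundTrips;
$model->getNext();                                   // call 102: needs more data
$tripsAfter102 = $model->roundTrips;
```

In this model the round-trip counter stays at 1 for the first 101 calls and only moves to 2 on call 102, matching the behaviour described above.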
When the driver asks for more documents separately (i.e., not at the same time as it issues a query) without a specific batch size, the server fills the reply with about 4MB of documents. On the wire, the reply to this Get More request shows that the returned data is 4194378 bytes, and the Number Returned is 34673.
Setting your own batch size
You can instruct the driver to use different batch sizes by using the batchSize() method on the $cursor. In this new example, we use the batchSize() method to request 25 documents per round trip to the server:
$cursor = $collection->find()->sort( array( 'name' => 1 ) );
$cursor->batchSize(25);
$result = $cursor->getNext();
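As a rough way to reason about what a batch size costs, the sketch below estimates how many replies are needed to deliver a result set. roundTripsFor() is a hypothetical helper written for this illustration, not part of the driver, and it assumes that every reply is completely full except possibly the last one:

```php
<?php
// Hypothetical helper: number of replies needed to deliver $totalDocs
// documents when each reply carries at most $batchSize documents.
// Assumes every reply except possibly the last one is full.
function roundTripsFor($totalDocs, $batchSize)
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('batch size must be at least 1 in this model');
    }
    return (int) ceil($totalDocs / $batchSize);
}

$tripsFor1000 = roundTripsFor(1000, 25); // 1000 documents in batches of 25
$tripsFor110 = roundTripsFor(110, 25);   // the last batch is only partly filled
```

Under these assumptions, smaller batches mean more round trips but fewer documents held in local memory at once. The helper makes no network requests, so nothing appears on the wire; for that, the batchSize(25) script above is the one to run.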
When we run the script with batchSize(25), the wire shows that, as expected, the Number to Return is now 25. During iteration, all query results are returned from the server to the driver in batches of 25 documents:
// retrieve another 25 documents to trigger the getMore
for ($i = 0; $i < 25; $i++) { $cursor->getNext(); }
Which creates this query:
And this r
Truncated by Planet PHP, read more at the original (another 7922 bytes)