Wednesday, May 23, 2012

MongoDB Cursors with PHP - Derick Rethans


London, UK


Tuesday, May 22nd 2012, 09:15 BST


Recently I was asked to improve the MongoCursor::batchSize documentation. This began an in-depth investigation into how the PHP driver for MongoDB handles pulling data that has been queried from the MongoDB server. Here are my findings.
A MongoCursor is created as soon as you run the find() method on a MongoCollection object, as in:
$m = new Mongo();
$collection = $m->demoDb->demoCollection;
$cursor = $collection->find();


Just calling find() only creates a cursor object; it does not immediately send the query to the server for processing. That happens only when you start reading from the cursor for the first time. Because of this, you can call additional methods on the newly created cursor object that still influence how the query is run on the server. One such example is the sort() method, which sorts the results according to its arguments (in this example, by name):
$cursor->sort( array( 'name' => 1 ) );
$result = $cursor->getNext();
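
The deferred sending described above can be illustrated with a small toy model. The ToyCursor class below is hypothetical and is not how the driver is implemented; it only sketches the idea that options are collected first and the query goes out on the first read. Refusing sort() after the first read is a choice made for this sketch, and the toy honours a single sort key only.

```php
<?php
// Toy model only: ToyCursor is a hypothetical class, not the driver's MongoCursor.
class ToyCursor
{
    private $docs;             // stands in for the data held by the server
    private $sort = array();   // options collected before the query is sent
    private $sent = false;     // becomes true once the "query" has gone out
    private $pos = 0;

    public function __construct(array $docs)
    {
        $this->docs = $docs;
    }

    // Options can only change while the query has not been sent yet.
    public function sort(array $fields)
    {
        if ($this->sent) {
            throw new LogicException('query already sent; cannot change sort');
        }
        $this->sort = $fields;
        return $this;
    }

    public function hasBeenSent()
    {
        return $this->sent;
    }

    // The first read "sends" the query, applying the options collected so far.
    public function getNext()
    {
        if (!$this->sent) {
            $this->sent = true;
            foreach ($this->sort as $field => $direction) {
                usort($this->docs, function ($a, $b) use ($field, $direction) {
                    return $direction * strcmp($a[$field], $b[$field]);
                });
                break; // the toy only honours the first sort key
            }
        }
        return isset($this->docs[$this->pos]) ? $this->docs[$this->pos++] : null;
    }
}

$cursor = new ToyCursor(array(array('name' => 'Zoe'), array('name' => 'Adam')));
$cursor->sort(array('name' => 1));     // still only configuring
$sentBeforeRead = $cursor->hasBeenSent();
$first = $cursor->getNext();           // the "query" goes out here
$sentAfterRead = $cursor->hasBeenSent();
```

In this sketch $sentBeforeRead is false and $sentAfterRead is true, which mirrors why sort() can still affect the query after find() has returned.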


When you then call getNext() on $cursor, the driver sends the query to the server and asks it to return a default number of documents in the first batch. The default batch size is 101. Let's have a look at what gets sent on the wire for our simple query for all documents, sorted by name:
[Figure: 01-find-sort-name.jpg]
The Number to Return is 0, which means the default is used. So even though we only want to fetch one result (getNext() asks the cursor for the next document only), the server returns 101 documents:
[Figure: 02-find-sort-name.jpg]
The driver stores all 101 documents locally, and during the next 100 calls to getNext() it simply returns documents from local memory. Once getNext() is called for the 102nd time, the driver goes back to the server to request more documents:
// skip the other 100 docs
for ($i = 0; $i < 100; $i++) { $cursor->getNext(); }
// request document 102:
$result = $cursor->getNext();
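
The buffering just described can be sketched as a toy that only counts round trips. BufferedToyCursor is a hypothetical class, not the driver's code; the first batch of 101 comes from the text above, while the size of later batches (34673) is borrowed from the example reply in this article and in practice depends on how large the documents are.

```php
<?php
// Toy model only: BufferedToyCursor is hypothetical and merely counts round trips.
class BufferedToyCursor
{
    private $remainingOnServer; // documents the "server" still holds
    private $buffer = 0;        // documents held locally by the "driver"
    private $firstBatch;
    private $laterBatch;
    public $roundTrips = 0;

    public function __construct($total, $firstBatch = 101, $laterBatch = 34673)
    {
        $this->remainingOnServer = $total;
        $this->firstBatch = $firstBatch;
        $this->laterBatch = $laterBatch;
    }

    // Returns true if a document was available, false once everything is read.
    public function getNext()
    {
        if ($this->buffer === 0 && $this->remainingOnServer > 0) {
            // Local buffer is empty: go back to the server for another batch.
            $size = ($this->roundTrips === 0) ? $this->firstBatch : $this->laterBatch;
            $fetched = min($size, $this->remainingOnServer);
            $this->remainingOnServer -= $fetched;
            $this->buffer = $fetched;
            $this->roundTrips++;
        }
        if ($this->buffer === 0) {
            return false; // cursor exhausted
        }
        $this->buffer--;  // serve one document from local memory
        return true;
    }
}

$cursor = new BufferedToyCursor(50000);
for ($i = 0; $i < 101; $i++) { $cursor->getNext(); }
$tripsAfter101 = $cursor->roundTrips;  // documents 1 to 101 came from one reply
$cursor->getNext();                    // document 102 needs a second round trip
$tripsAfter102 = $cursor->roundTrips;
```

With these assumptions, the first 101 reads cost a single round trip and the 102nd read causes the second one.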


When the driver asks for more documents separately (i.e., not at the same time as it issues a query) without a specific batch size, the server fills the reply with about 4MB of documents. On the wire, the Get More request looks like:
[Figure: 03-find-sort-name.jpg]
and the reply like:
[Figure: 04-find-sort-name.jpg]
As you can see, the returned data is 4194378 bytes, and the Number Returned is 34673.
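
As a quick sanity check on those two numbers, the arithmetic can be worked through in a few lines. This assumes that 4MB here means 4 × 1024 × 1024 bytes, and it treats the whole reply as document data, so the average is only a rough figure (the reply size may also include protocol overhead).

```php
<?php
$replyBytes = 4194378;        // size of the reply reported above
$numberReturned = 34673;      // documents in that reply
$fourMiB = 4 * 1024 * 1024;   // 4194304 bytes (assumed meaning of "4MB")

// How far the reply goes past the 4 MiB mark.
$overshoot = $replyBytes - $fourMiB;

// Rough average size per document in this particular collection.
$averageBytes = $replyBytes / $numberReturned;

printf("%d bytes over 4 MiB, about %.2f bytes per document\n", $overshoot, $averageBytes);
```

The reply is 74 bytes larger than 4 MiB, and the documents in this example average roughly 121 bytes each, which is why so many of them fit into a single reply.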
Setting your own batch size

You can instruct the driver to use a different batch size by calling the batchSize() method on the $cursor. In this new example, we use batchSize() to request 25 documents per round trip to the server:
$cursor = $collection->find()->sort( array( 'name' => 1 ) );
$cursor->batchSize(25);
$result = $cursor->getNext();


When we run this script, we will see the following on the wire:
[Figure: 05-batch25.jpg]
As expected, the Number to Return is now 25. During iteration, all query results are returned from the server to the driver in batches of 25 documents:
// retrieve another 25 documents to trigger the getMore
for ($i = 0; $i < 25; $i++) { $cursor->getNext(); }


This creates the following query:
[Figure: 06-batch25.jpg]
And this r



Truncated by Planet PHP, read more at the original (another 7922 bytes)
