- MySQL 5.6 is always slower at <= 64 threads
- MySQL 5.6 is a bit faster at >= 128 threads with 1 buffer pool instance.
- MySQL 5.6 is a lot faster at >= 128 threads with 8 buffer pool instances.
In MySQL 5.1 and 5.5 the main background IO thread (srv_master_thread) supported furious flushing when needed. That thread had a loop from which background IO would be scheduled (write back dirty pages, do reads for insert buffer merges). There was a one second sleep at the start of the loop, but the sleep would be skipped when the previous loop iteration flushed many dirty pages. I use furious flushing to describe the InnoDB behavior when the sleep is frequently skipped. Each iteration of the loop would do about innodb_io_capacity disk requests (note that the innodb_io_capacity limit was fuzzy: it might do twice that rate, but probably not 10X the rate). Because the sleep could be skipped, the innodb_io_capacity limit didn't really set the IOPs rate for background IO (despite what the docs state), but this was usually a good thing on servers that can do a lot of IOPs. The alternative would be to make InnoDB respect the limit and then set innodb_io_capacity to a large value, and I think that is a bad idea without support for real AIO from InnoDB -- which is now in MySQL 5.6.
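Here is a minimal sketch of that loop, assuming a hypothetical flush_dirty_pages helper and a guessed skip-sleep threshold; this is not the actual InnoDB source, only the control flow described above:

    #include <chrono>
    #include <thread>

    static const int SRV_IO_CAPACITY = 1000;  // innodb_io_capacity

    // Stand-in: schedules up to ~max_io page writes and insert buffer
    // merge reads, returns the number of pages flushed.
    static int flush_dirty_pages(int max_io) { return max_io; }

    int main() {
        int n_flushed = 0;
        for (;;) {
            // One second sleep at the start of the loop, skipped when the
            // previous iteration flushed many pages ("furious flushing").
            // The threshold used here is a guess, not the real condition.
            if (n_flushed < SRV_IO_CAPACITY) {
                std::this_thread::sleep_for(std::chrono::seconds(1));
            }
            // Each iteration does about innodb_io_capacity requests; the
            // limit is fuzzy, so the real rate can be ~2X but not 10X.
            n_flushed = flush_dirty_pages(SRV_IO_CAPACITY);
        }
    }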
Several things have changed in MySQL 5.6.10:
- flushing of dirty pages from the tail of the LRU used to be done by foreground threads (threads that handle query processing) via buf_flush_free_margin, and a thread would attempt to flush many dirty pages at a time. Note that clean pages at the end of the LRU can quickly be moved to the free list, but dirty pages must first be flushed. In MySQL 5.6 foreground threads try to move one page at a time via buf_flush_single_page_from_LRU and hope that the page cleaner thread does the rest of the work.
- the page cleaner thread (buf_flush_page_cleaner_thread) doesn't do furious flushing. It sleeps so that it won't run more than once per second. In theory this means that the documented behavior for innodb_io_capacity is more likely to be correct. The sleep is done if any of the following are true: 1) the server is not idle, 2) there are pending background reads, 3) no pages were flushed on the previous loop iteration. Note that the first condition is always true on a busy server, so the sleep is never skipped on a busy server.
- when the page cleaner flushes dirty pages from the end of the LRU it does not use innodb_io_capacity to determine how much work to do. Follow the call chain from buf_flush_LRU_tail to buf_flush_LRU. It looks like InnoDB will do up to ~1000 page writes per buffer pool instance. So the trick to getting a higher rate of page flushes from the LRU is to use more buffer pool instances, but that is not the real solution. Both the foreground and page cleaner paths are sketched below.
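Here is a minimal sketch of the 5.6 behavior described in this list; server_is_idle, pending_reads and the flush helpers are simplified stand-ins rather than the actual InnoDB signatures, and the ~1000 page cap is the per-instance limit mentioned above:

    #include <chrono>
    #include <thread>
    #include <vector>

    struct buf_pool_t {};  // one buffer pool instance

    static const int LRU_FLUSH_CAP = 1000;  // approx. per-instance cap in
                                            // buf_flush_LRU_tail -> buf_flush_LRU

    // Stand-ins for state the real code reads.
    static bool server_is_idle() { return false; }  // always busy here
    static int  pending_reads()  { return 0; }

    // Stand-in: flushes up to max_pages dirty pages from the LRU tail.
    static int buf_flush_LRU(buf_pool_t*, int max_pages) { return max_pages; }

    // Stand-in: moves a single page, the 5.6 foreground path.
    static bool buf_flush_single_page_from_LRU(buf_pool_t*) { return true; }

    // Foreground thread that needs a free page: flush one page at a time
    // and hope the page cleaner does the rest. PMP shows threads stuck here.
    static void foreground_get_free_page(buf_pool_t* bp) {
        while (!buf_flush_single_page_from_LRU(bp)) {
            // retry; in the real code this can stall for a long time
        }
    }

    // Page cleaner: innodb_io_capacity is not consulted for the LRU tail,
    // so more buffer pool instances is the only way to raise this rate.
    static void buf_flush_page_cleaner_thread(std::vector<buf_pool_t>& pools) {
        int n_flushed = 0;
        for (;;) {
            // Sleep if the server is not idle, or reads are pending, or the
            // previous iteration flushed nothing. On a busy server the
            // first condition always holds, so there is no furious flushing.
            if (!server_is_idle() || pending_reads() > 0 || n_flushed == 0) {
                std::this_thread::sleep_for(std::chrono::seconds(1));
            }
            n_flushed = 0;
            for (buf_pool_t& bp : pools) {
                n_flushed += buf_flush_LRU(&bp, LRU_FLUSH_CAP);
            }
        }
    }

    int main() {
        std::vector<buf_pool_t> pools(8);  // like bpi=8 in the results below
        foreground_get_free_page(&pools[0]);
        buf_flush_page_cleaner_thread(pools);
    }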
The binaries tested were (iocap = innodb_io_capacity, bpi = innodb_buffer_pool_instances, itc = innodb_thread_concurrency):
- fb5163 - MySQL 5.1.63 + the Facebook patch, iocap=1000, itc=0
- orig5163 - MySQL 5.1.63, iocap=1000, itc=0
- orig5610+hack - MySQL 5.6.10, iocap=1000, bpi=8 and a hack to get the page cleaner thread to do furious flushing when needed
- orig5610+bp8 - MySQL 5.6.10, iocap=1000, bpi=8, itc=0
- orig5610+bp1 - MySQL 5.6.10, iocap=1000, bpi=1, itc=0
QPS at increasing client concurrency, left to right (the first four columns are at <= 64 threads and the last two at >= 128 threads):

    15134  19623  18521  14804   9730   5898   fb5163
    10802  12980  13140  11337  11822   6284   orig5163
    17649  22993  17281  15066  14899  14907   orig5610+hack
     8317   8054   9379  11091  12684  14488   orig5610+bp8
     4695   5393   6436   7378   8632   8951   orig5610+bp1
There are a few obvious problems. QPS falls quickly with concurrency for MySQL 5.1.63 because of mutex contention. QPS is much worse for unmodified MySQL 5.6 at low and mid concurrency because of stalls on LRU flushing, but using more buffer pool instances helps for the reason described above. From PMP (Poor Man's Profiler) stacks I see that foreground threads are all stuck in buf_flush_single_page_from_LRU. Using a larger value for innodb_io_capacity does not help.
I have spent a lot of time working on the LRU flushing code for MySQL 5.1. That includes the innodb_fast_free_list option, which allows MySQL 5.1 + the Facebook patch to almost match MySQL 5.6 on IO-bound & read-only workloads. In MySQL 5.1 & 5.5, pages were moved from the LRU to the free list on demand when foreground threads needed a free page for a disk read and the free list was empty. Unfortunately the code in buf_flush_free_margin that did that work wasn't efficient. MySQL 5.6 might be more efficient given that most of the work will now be done by the page cleaner, a background thread. However, this adds the risk that it won't keep up with demand. For example, it is possible today for the page cleaner thread to be sleeping when the free list is empty and there are many dirty pages at the end of the LRU. That is not a good state for a high-perf server.
With the Facebook patch, using a large amount of memory with xtrabackup required the innodb_fast_free_list option or recovery was much too slow. I wonder if a similar problem exists in MySQL 5.6.
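Here is a minimal sketch of the contrast between the on-demand batch flush in 5.1/5.5 and the single-page flush in 5.6; the function bodies are trivial stand-ins for the real flushing calls, only the control flow matters:

    #include <deque>

    struct page_t {};
    struct buf_pool_t {
        std::deque<page_t*> free_list;
    };

    // Stand-in: 5.1/5.5 flushed a batch of dirty pages from the LRU tail
    // so clean pages could be moved to the free list. Correct, but the
    // real implementation wasn't efficient.
    static void buf_flush_free_margin(buf_pool_t* bp) {
        bp->free_list.push_back(new page_t());
    }

    // Stand-in: 5.6 moves a single page and depends on the page cleaner.
    static void buf_flush_single_page_from_LRU(buf_pool_t* bp) {
        bp->free_list.push_back(new page_t());
    }

    // 5.1/5.5: a foreground thread that needs a free page for a disk read
    // does the batch work itself when the free list is empty.
    static page_t* get_free_page_51(buf_pool_t* bp) {
        if (bp->free_list.empty()) buf_flush_free_margin(bp);
        page_t* p = bp->free_list.front();
        bp->free_list.pop_front();
        return p;
    }

    // 5.6: the foreground thread makes one page of progress per attempt
    // and can spin here while the page cleaner sleeps with an empty free
    // list and many dirty pages at the end of the LRU.
    static page_t* get_free_page_56(buf_pool_t* bp) {
        while (bp->free_list.empty()) buf_flush_single_page_from_LRU(bp);
        page_t* p = bp->free_list.front();
        bp->free_list.pop_front();
        return p;
    }

    int main() {
        buf_pool_t bp;
        delete get_free_page_51(&bp);
        delete get_free_page_56(&bp);
    }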