How to deal with mysqldump and innodb_buffer_pool_size?

Solution 1:

This seems to be an old thread, but it is still a real concern for a DBA. Even a simple SELECT that a developer runs against a production database can, depending on the size of the table and the amount of memory reserved for the Buffer Pool, force InnoDB to pull in pages that will never be accessed again, evicting the very pages being served to client requests coming from the applications.

Reducing the Buffer Pool would lead to the same problem; the real goal is to keep the Buffer Pool holding the "right" set of pages (the ones pushed into it by the applications touching them) even while running a mysqldump of the data. Even when the data being dumped is not archived data, the problem is spelled out in the docs:

By default, pages read by queries are immediately moved into the new sublist, meaning they stay in the buffer pool longer. A table scan, performed for a mysqldump operation or a SELECT statement with no WHERE clause, for example, can bring a large amount of data into the buffer pool and evict an equivalent amount of older data, even if the new data is never used again. Similarly, pages that are loaded by the read-ahead background thread and accessed only once are moved to the head of the new list. These situations can push frequently used pages to the old sublist where they become subject to eviction. https://dev.mysql.com/doc/refman/5.7/en/innodb-buffer-pool.html
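If you want to watch this split between the old and new sublists on your own server, the information_schema.INNODB_BUFFER_POOL_STATS table (available in MySQL 5.6+ and MariaDB 10.0+) exposes it; this is a read-only inspection query, nothing here changes the pool:

SELECT POOL_ID,
       DATABASE_PAGES,       -- total data pages in the pool
       OLD_DATABASE_PAGES,   -- pages currently sitting in the old sublist
       PAGES_MADE_YOUNG,     -- pages promoted from the old to the new sublist
       PAGES_NOT_MADE_YOUNG  -- pages accessed but kept in the old sublist
  FROM information_schema.INNODB_BUFFER_POOL_STATS;

A rising OLD_DATABASE_PAGES together with a flat PAGES_MADE_YOUNG while a dump runs is a sign the scan is staying in the old sublist as intended.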

Put simply: if you run a mysqldump on your master, you can end up with the equivalent of a cold, not warmed-up InnoDB Buffer Pool, full of pages your applications never asked to have in memory.
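One way to recover from that cold-pool effect - assuming MySQL 5.6+ or MariaDB 10.0+, where the buffer pool dump/load feature exists - is to snapshot the list of hot page IDs just before the dump and reload it afterwards; a rough sketch:

bianchi@box01:/# # save the warm page list to disk (ib_buffer_pool by default)
bianchi@box01:/# mariadb -e "SET GLOBAL innodb_buffer_pool_dump_now=ON;"
bianchi@box01:/# # ... run the mysqldump as usual ...
bianchi@box01:/# # reload the saved list, restoring the warm working set
bianchi@box01:/# mariadb -e "SET GLOBAL innodb_buffer_pool_load_now=ON;"
bianchi@box01:/# mariadb -e "SHOW STATUS LIKE 'Innodb_buffer_pool_load_status';"

The load runs in the background, so the applications are not blocked while the working set is brought back.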

As you can also read in the manual, you need to make the buffer pool scan resistant, and I found the following:

In mixed workloads where most of the activity is OLTP type with periodic batch reporting queries, which result in large scans, setting the value of innodb_old_blocks_time during the batch runs can help keep the working set of the normal workload in the buffer pool. When scanning large tables that cannot fit entirely in the buffer pool, setting innodb_old_blocks_pct to a small value keeps the data that is only read once from consuming a significant portion of the buffer pool. For example, setting innodb_old_blocks_pct=5 restricts this data that is only read once to 5% of the buffer pool. https://dev.mysql.com/doc/refman/5.7/en/innodb-performance-midpoint_insertion.html
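If you want to try that second knob as well, here is a minimal sketch - innodb_old_blocks_pct defaults to 37, and DEFAULT restores it:

bianchi@box01:/# mariadb -e "SET GLOBAL innodb_old_blocks_pct=5;"
bianchi@box01:/# mariadb -e "SHOW GLOBAL VARIABLES LIKE 'innodb_old_blocks%';"
bianchi@box01:/# # ... run the dump, then put the default back ...
bianchi@box01:/# mariadb -e "SET GLOBAL innodb_old_blocks_pct=DEFAULT;"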

I'd try this in a lab first: set innodb_old_blocks_time to, e.g., 10000 (ms), which means a page read by the scan must remain in the old sublist for 10000 ms - 10 seconds - before it can be promoted to the new sublist, protecting the buffer pool against full table scans (the greater the value, the longer scanned pages are kept in the old sublist, where they are the first to be evicted). Note that this variable has global scope only, so it affects every session while the dump runs - one more reason to try it in a lab first. Then you set it, run the mysqldump command, and restore the default:

bianchi@box01:/# mariadb -e "SET GLOBAL innodb_old_blocks_time=100;" -vv
--------------
SET GLOBAL innodb_old_blocks_time=10000
--------------

Query OK, 0 rows affected (0.000 sec)

Bye

bianchi@box01:/# time mysqldump --opt \
--hex-blob --routines --triggers --events \
--single-transaction --databases <db-name> > test.dump

bianchi@box01:/# mariadb -e "SET GLOBAL innodb_old_blocks_time=DEFAULT;" -vv
--------------
SET GLOBAL innodb_old_blocks_time=DEFAULT
--------------

Query OK, 0 rows affected (0.000 sec)

Bye
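If this becomes a routine job, the three steps above can be wrapped in one script so the default is restored even if the dump fails half-way. A sketch - the database name mydb and the dump path are assumptions you would adapt:

#!/bin/bash
# Hypothetical wrapper: raise innodb_old_blocks_time, dump, restore the default.
set -euo pipefail

# The trap restores the default even if mysqldump exits with an error.
trap 'mariadb -e "SET GLOBAL innodb_old_blocks_time=DEFAULT;"' EXIT

mariadb -e "SET GLOBAL innodb_old_blocks_time=10000;"

time mysqldump --opt --hex-blob --routines --triggers --events \
    --single-transaction --databases mydb > /backups/mydb.dump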

An alternative is to have a replica available that is not part of the regular load-balancing rotation, so you are free to churn that database server's Buffer Pool without hurting the applications.
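In that setup you just point mysqldump at the replica, so any sublist churn happens there; a minimal sketch, where replica01 is a hypothetical host name:

bianchi@box01:/# time mysqldump -h replica01 --opt \
--hex-blob --routines --triggers --events \
--single-transaction --databases <db-name> > test.dump

--single-transaction still gives you a consistent snapshot on the replica, as long as the tables are InnoDB.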

I hope it helps.

Cheers!