rsync corruption with InnoDB
I skimmed over this thread and I think I can help clarify the InnoDB, rsync, and LVM situation. The basic issue with just running rsync on the files under a running mysqld is that rsync copies different parts of the files at different points in time. mysqld can write to a part of a file both before and after rsync has read it, so the copy mixes pages from different moments and, when restored, will be corrupted. This is the worst-case scenario.
A step up from that is to run FLUSH TABLES WITH READ LOCK (a global read lock) and hold the lock while rsync is running. This works fine for MyISAM tables because all file activity on MyISAM tables stops while a global read lock is held. It isn't guaranteed to work with InnoDB. Internally, FLUSH TABLES WITH READ LOCK only stops queries from acquiring the write locks that let them modify tables. That won't make InnoDB hold still, for a couple of reasons.
First, a SELECT query in InnoDB can actually modify data files. A SELECT on a page with unmerged records in the insert buffer causes InnoDB to merge those records before making the page available to the query. This causes some disk I/O, and if it happens while rsync is running the resulting backup can be corrupted. So even while holding a global read lock and running only SELECTs, InnoDB can write data.
Second, the master thread(s) perform background tasks such as flushing dirty pages, merging the insert buffer, and purging old records whenever InnoDB thinks there is spare I/O capacity. These threads don't know how to hold still during a global read lock and can corrupt a backup taken with rsync.
The safest way to create a backup without using something like XtraBackup is to take a snapshot at the filesystem level or, below that, at the block device level. This is effectively what LVM does. When you create an LVM snapshot it freezes that device at a single point in time. When you restore the backup, InnoDB goes through its recovery procedure as if the power had gone out.
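To make the difference between a file-level copy and a point-in-time snapshot concrete, here is a toy simulation in Python. It is only a sketch and does not model InnoDB's real on-disk format: the "data file" is a plain list of pages, and the rule that the first and last page must carry the same version number is a made-up invariant standing in for "pages that belong together". All function names are hypothetical.

```python
# Toy model only: a "data file" is a list of pages. The invariant (an
# assumption for illustration) is that the first and last page always
# hold the same version number.

def write_transaction(pages, version):
    """Simulate the server updating two related pages together."""
    pages[0] = version
    pages[-1] = version


def file_level_copy(pages, write_after_page):
    """Copy page by page (as a file-level copy does) while the server writes.

    Right after the copier has read page `write_after_page`, the server
    commits a new version, so later pages are read in a newer state.
    """
    copy = []
    for index in range(len(pages)):
        copy.append(pages[index])
        if index == write_after_page:
            write_transaction(pages, version=pages[0] + 1)
    return copy


def snapshot_copy(pages):
    """Freeze every page at one instant (as a snapshot does), then copy."""
    frozen = list(pages)  # single point in time
    write_transaction(pages, version=pages[0] + 1)  # later write is not seen
    return frozen


def is_consistent(pages):
    """Check the toy invariant: first and last page agree."""
    return pages[0] == pages[-1]


live = [1, 0, 0, 1]
torn = file_level_copy(live, write_after_page=0)
snap = snapshot_copy(live)

print(torn, is_consistent(torn))  # [1, 0, 0, 2] False
print(snap, is_consistent(snap))  # [2, 0, 0, 2] True
```

The file-level copy ends up with an old first page and a new last page, a state the live file was never in. The snapshot copy is a state the live file really did pass through, which is why a recovery procedure designed for power loss can deal with it.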
Some possible solutions to this were discussed on the internals list a few years ago. I'm not sure what has been implemented since then. The list thread is here: