QEMU block cache parameter analysis

Virtualization 2017-08-15

Code: git://git.qemu-project.org/qemu.git  v2.9.0
The QEMU documentation describes the block device cache parameter as follows:
cache is "none", "writeback", "unsafe", "directsync" or "writethrough" and controls how the host cache is used to access block data.

cache mode     cache.writeback   cache.direct   cache.no-flush
writeback      on                off            off
none           on                on             off
writethrough   off               off            off
directsync     off               on             off
unsafe         on                off            on

The default mode is cache=writeback.
Each cache mode is implemented as a combination of cache.writeback, cache.direct and cache.no-flush.
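The documentation table can be restated as a small lookup table. This is only an illustrative sketch: struct cache_mode_bits and lookup_cache_mode are hypothetical names invented here, not QEMU code; only the on/off values come from the table above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type for illustration; not a QEMU structure. */
struct cache_mode_bits {
    const char *mode;
    bool writeback;   /* cache.writeback */
    bool direct;      /* cache.direct */
    bool no_flush;    /* cache.no-flush */
};

/* Rows copied from the table in the QEMU documentation. */
static const struct cache_mode_bits cache_modes[] = {
    { "writeback",    true,  false, false },
    { "none",         true,  true,  false },
    { "writethrough", false, false, false },
    { "directsync",   false, true,  false },
    { "unsafe",       true,  false, true  },
};

/* Returns the row for a mode name, or NULL if the name is unknown. */
static const struct cache_mode_bits *lookup_cache_mode(const char *mode)
{
    assert(mode != NULL);
    for (size_t i = 0; i < sizeof(cache_modes) / sizeof(cache_modes[0]); i++) {
        if (strcmp(cache_modes[i].mode, mode) == 0) {
            return &cache_modes[i];
        }
    }
    return NULL;
}
```

For example, lookup_cache_mode("none") returns the row with writeback and direct on and no_flush off, which is why cache=none still relies on guest flushes for durability.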

In drive_new:

value = qemu_opt_get(all_opts, "cache");

bdrv_parse_cache_mode(value, &flags, &writethrough)

// parses the mode string and stores the result in flags and writethrough

int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough)
{
    /* start from a clean state: clear all cache-related bits */
    *flags &= ~BDRV_O_CACHE_MASK;

    if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
        *writethrough = false;
        *flags |= BDRV_O_NOCACHE;
    } else if (!strcmp(mode, "directsync")) {
        *writethrough = true;
        *flags |= BDRV_O_NOCACHE;
    } else if (!strcmp(mode, "writeback")) {
        *writethrough = false;
    } else if (!strcmp(mode, "unsafe")) {
        *writethrough = false;
        *flags |= BDRV_O_NO_FLUSH;
    } else if (!strcmp(mode, "writethrough")) {
        *writethrough = true;
    } else {
        /* unknown cache mode */
        return -1;
    }

    return 0;
}

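The following three statements derive cache.writeback, cache.direct and cache.no-flush from the parsed mode and store them in the QemuOpts that belongs to the -drive argument. Each value is set only if the user has not already given that option explicitly.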

if (!qemu_opt_get(all_opts, BDRV_OPT_CACHE_WB)) {
    qemu_opt_set_bool(all_opts, BDRV_OPT_CACHE_WB,
                      !writethrough, &error_abort);
}

if (!qemu_opt_get(all_opts, BDRV_OPT_CACHE_DIRECT)) {
    qemu_opt_set_bool(all_opts, BDRV_OPT_CACHE_DIRECT,
                      !!(flags & BDRV_O_NOCACHE), &error_abort);
}

if (!qemu_opt_get(all_opts, BDRV_OPT_CACHE_NO_FLUSH)) {
    qemu_opt_set_bool(all_opts, BDRV_OPT_CACHE_NO_FLUSH,
                      !!(flags & BDRV_O_NO_FLUSH), &error_abort);
}
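The "set only if not already present" pattern means the value derived from cache= acts as a default that an explicit cache.* option overrides. A minimal sketch of that behaviour, using a hypothetical struct bool_opt in place of a real QemuOpts entry (set_default is likewise an invented name):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one boolean option in a QemuOpts list. */
struct bool_opt {
    bool is_set;   /* did the user pass this option explicitly? */
    bool value;
};

/*
 * Mirrors "if (!qemu_opt_get(...)) qemu_opt_set_bool(...)":
 * the value derived from cache= is stored only when the option is absent.
 */
static void set_default(struct bool_opt *opt, bool derived)
{
    assert(opt != NULL);
    if (!opt->is_set) {
        opt->value = derived;
        opt->is_set = true;
    }
}
```

With this model, an unset cache.direct picks up the value implied by the cache mode, while an explicitly given cache.direct keeps the user's value.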

blockdev_init reads the value stored under BDRV_OPT_CACHE_WB from the QemuOpts and then calls blk_set_enable_write_cache, which sets blk->enable_write_cache = wce.

writethrough = !qemu_opt_get_bool(opts, BDRV_OPT_CACHE_WB, true);

blk_set_enable_write_cache(blk, !writethrough);

In other words, the value of cache.writeback ends up in blk->enable_write_cache.

On the write path, blk->enable_write_cache is consulted only in blk_co_pwritev: when writeback is not enabled, the BDRV_REQ_FUA flag is added to the request:

if (!blk->enable_write_cache)

    flags |= BDRV_REQ_FUA;

Then, in bdrv_driver_pwritev:

if (ret == 0 && (flags & BDRV_REQ_FUA))  ret = bdrv_co_flush(bs);
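The two snippets above combine into "every write is followed by a flush when the write cache is disabled". The sketch below models only that control flow. All names (fake_backend, fake_blk_pwritev, REQ_FUA and so on) are hypothetical stand-ins, and it assumes the driver does not handle FUA natively, so the flag is emulated with a flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REQ_FUA 0x1   /* stand-in for BDRV_REQ_FUA; the value is arbitrary */

/* Hypothetical backend that just counts operations. */
struct fake_backend {
    bool enable_write_cache;  /* plays the role of blk->enable_write_cache */
    int writes;
    int flushes;
};

static int fake_flush(struct fake_backend *b)
{
    b->flushes++;
    return 0;
}

/* Models bdrv_driver_pwritev: write, then flush if FUA was requested. */
static int fake_driver_pwritev(struct fake_backend *b, int flags)
{
    int ret = 0;

    b->writes++;
    if (ret == 0 && (flags & REQ_FUA)) {
        ret = fake_flush(b);
    }
    return ret;
}

/* Models blk_co_pwritev: add FUA when the write cache is disabled. */
static int fake_blk_pwritev(struct fake_backend *b, int flags)
{
    assert(b != NULL);
    if (!b->enable_write_cache) {
        flags |= REQ_FUA;
    }
    return fake_driver_pwritev(b, flags);
}
```

With enable_write_cache set to false, three writes produce three flushes; with it set to true, the same three writes produce none, and flushing is left to explicit flush requests.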

In bdrv_co_flush, bs->drv->bdrv_co_flush_to_os flushes the driver's internal caches (for qcow2, its metadata caches) to the host OS.

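bs->drv->bdrv_co_flush_to_disk flushes data that has already been written to the host cache all the way down to the host disk. qcow2 does not implement it, so the flush to disk is done by bs->drv->bdrv_aio_flush of the underlying file driver, i.e. raw_aio_flush.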

paio_submit hands the request to aio_worker; when the request type is QEMU_AIO_FLUSH, aio_worker runs handle_aiocb_flush, which is simply qemu_fdatasync(aiocb->aio_fildes).
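At the bottom of this chain is a plain fdatasync on the image file descriptor. As an illustration of what that call does for a single file, here is a minimal POSIX sketch. write_and_sync is a hypothetical helper written for this example and is not taken from QEMU.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes buf to path, then asks the kernel to push
 * the file data to stable storage with fdatasync().
 * Returns 0 on success, -1 on any error (including a short write).
 */
static int write_and_sync(const char *path, const char *buf)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        return -1;
    }

    size_t len = strlen(buf);
    ssize_t n = write(fd, buf, len);
    /* Without fdatasync the data may sit in the host page cache only. */
    int ret = (n == (ssize_t)len) ? fdatasync(fd) : -1;

    if (close(fd) != 0) {
        ret = -1;
    }
    return ret;
}
```

Until fdatasync returns, a successful write only guarantees that the data reached the host page cache, which is exactly the gap that the flush path closes.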

When bs->open_flags & BDRV_O_NO_FLUSH is set, bs->drv->bdrv_aio_flush is never called. That is, when BDRV_OPT_CACHE_NO_FLUSH is 1, the image file's buffered data is not flushed to disk.
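The ordering inside the flush path can be sketched as follows: the flush to the OS always happens, while the flush to disk is skipped when the no-flush flag is set. This is a simplified model with hypothetical names (fake_bs, fake_co_flush, O_NO_FLUSH), not the real bdrv_co_flush, which also handles errors, generations and child nodes.

```c
#include <assert.h>
#include <stddef.h>

#define O_NO_FLUSH 0x1   /* stand-in for BDRV_O_NO_FLUSH; value is arbitrary */

/* Hypothetical block state that only counts the two kinds of flush. */
struct fake_bs {
    int open_flags;
    int flushed_to_os;
    int flushed_to_disk;
};

static int fake_co_flush(struct fake_bs *bs)
{
    assert(bs != NULL);

    /* Internal caches are always written back to the host OS. */
    bs->flushed_to_os++;

    /* With no-flush set, the data is never forced down to the disk. */
    if (bs->open_flags & O_NO_FLUSH) {
        return 0;
    }

    bs->flushed_to_disk++;
    return 0;
}
```

So with the no-flush flag set, data still survives a crash of the QEMU process (it is in the host page cache), but not necessarily a crash of the host.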

As for BDRV_OPT_CACHE_DIRECT, update_flags_from_options adds BDRV_O_NOCACHE to flags when it is set. In raw_open_common, raw_parse_flags then translates these flags into open_flags:

if ((bdrv_flags & BDRV_O_NOCACHE))    *open_flags |= O_DIRECT;

After that, qemu_open(filename, s->open_flags, 0644) is called, so the image file is opened with O_DIRECT.

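Summary:

When cache.writeback==0, every write request is followed by a flush (BDRV_REQ_FUA), so the qcow2 image's cached data is flushed after each request. When cache.writeback==1, writes complete without that per-request flush, and data is flushed only when a flush is explicitly requested.

When cache.direct==1, the image file is opened with O_DIRECT, so qcow2 I/O bypasses the host page cache.

When cache.no-flush==1, QEMU never forces the image file's cached data to disk; writing it back is left entirely to the host.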


This article, "QEMU block cache parameter analysis", comes from OenHan. Link: http://oenhan.com/qemu-block-cache