Message ID | 4BABA2FB.8050505@web.de |
---|---|
State | New |
On 03/25/2010 12:52 PM, Jan Kiszka wrote:
> This adds the "map" subcommand to qemu-img. It is able to expose the raw
> content of a disk image via a FUSE filesystem. The whole disk can
> be accessed, e.g. to run partitioning tools against it, as well as
> individual partitions. This allows creating new filesystems in the
> image or loop-back mounting existing ones. Using the great mountlo tool
> from the FUSE collection [1][2], the latter can even be done by non-root
> users (the former anyway).
>
> There are some dependencies to fulfill to gain all features: Partition
> scanning is done via recent libblkid (I used version 2.17.1). If this
> library is not available, only the disk file is provided. Fortunately,
> mountlo can do partition scanning as well ("-p n") to work around this.
>
> Moreover, libfuse >= 2.8 and a host kernel >= 2.6.29 are required for
> seamless disk access via fdisk. Otherwise, the BLKGETSIZE64 IOCTL cannot
> be provided, and the number of cylinders has to be set explicitly (e.g. via
> "-C n").
>
> This work was inspired by Ashley Saulsbury's qemu-diskp [3].
>
> [1] http://sourceforge.net/apps/mediawiki/fuse/index.php?title=FileSystems#Mountlo
> [2] http://sourceforge.net/projects/fuse/files/mountlo/
> [3] http://www.saulsbury.org/software/virtualization.html
>
> Signed-off-by: Jan Kiszka <jan.kiszka@web.de>

This has been proposed quite a few times.

In fact, I wrote something like this prior to implementing qemu-nbd.

The problem with fuse is that, as configured by default, you can't actually
enter a fuse filesystem as root, and since you need to be root to
loopback mount it, it's pretty nasty from a usability perspective.

So why did you go the fuse route instead of using qemu-nbd?

Regards,

Anthony Liguori

> ---
>  Makefile | 6 +-
>  Makefile.objs | 6 +
>  configure | 55 +++++++
>  qemu-img-cmds.hx | 11 ++
>  qemu-img-map.c | 438 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  qemu-img.c | 13 +-
>  qemu-img.h | 13 ++
>  qemu-img.texi | 10 ++
>  8 files changed, 545 insertions(+), 7 deletions(-)
>  create mode 100644 qemu-img-map.c
>  create mode 100644 qemu-img.h
>
> diff --git a/Makefile b/Makefile
> index 57c354d..d5a1dae 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -126,10 +126,12 @@ bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS)
>
>  ######################################################################
>
> -qemu-img.o: qemu-img-cmds.h
> +qemu-img.o: qemu-img.h qemu-img-cmds.h
>  qemu-img.o qemu-tool.o qemu-nbd.o qemu-io.o: $(GENERATED_HEADERS)
>
> -qemu-img$(EXESUF): qemu-img.o qemu-tool.o $(block-obj-y) $(qobject-obj-y)
> +qemu-img-map.o: QEMU_CFLAGS += $(FUSE_CFLAGS) $(BLKID_CFLAGS)
> +
> +qemu-img$(EXESUF): $(qemu-img-y) $(block-obj-y) $(qobject-obj-y)
>
>  qemu-nbd$(EXESUF): qemu-nbd.o qemu-tool.o $(block-obj-y) $(qobject-obj-y)
>
> diff --git a/Makefile.objs b/Makefile.objs
> index 281f7a6..8a651d2 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -207,3 +207,9 @@ libdis-$(CONFIG_PPC_DIS) += ppc-dis.o
>  libdis-$(CONFIG_S390_DIS) += s390-dis.o
>  libdis-$(CONFIG_SH4_DIS) += sh4-dis.o
>  libdis-$(CONFIG_SPARC_DIS) += sparc-dis.o
> +
> +######################################################################
> +# qemu-img
> +
> +qemu-img-y = qemu-img.o qemu-tool.o
> +qemu-img-$(CONFIG_FUSE) += qemu-img-map.o
> diff --git a/configure b/configure
> index 6bc40a3..c84aaa9 100755
> --- a/configure
> +++ b/configure
> @@ -263,6 +263,7 @@ vnc_tls=""
>  vnc_sasl=""
>  xen=""
>  linux_aio=""
> +fuse=""
>
>  gprof="no"
>  debug_tcg="no"
> @@ -639,6 +640,10 @@ for opt do
>    ;;
>    --enable-linux-aio) linux_aio="yes"
>    ;;
> +  --disable-fuse) fuse="no"
> +  ;;
> +  --enable-fuse) fuse="yes"
> +  ;;
>    --enable-io-thread) io_thread="yes"
>    ;;
>    --disable-blobs) blobs="no"
> @@ -801,6 +806,8 @@ echo " --disable-vde disable support for vde network"
>  echo " --enable-vde enable support for vde network"
>  echo " --disable-linux-aio disable Linux AIO support"
>  echo " --enable-linux-aio enable Linux AIO support"
> +echo " --disable-fuse disable support for FUSE in qemu-img"
> +echo " --enable-fuse enable support for FUSE in qemu-img"
>  echo " --enable-io-thread enable IO thread"
>  echo " --disable-blobs disable installing provided firmware blobs"
>  echo " --kerneldir=PATH look for kernel includes in PATH"
> @@ -1586,6 +1593,44 @@ EOF
>    fi
>  fi
>
> +##########################################
> +# FUSE libraries probe
> +if test "$fuse" != "no" ; then
> +  fuse_cflags=`pkg-config --cflags fuse 2> /dev/null`
> +  fuse_libs=`pkg-config --libs fuse 2> /dev/null`
> +  cat > $TMPC << EOF
> +#include <fuse.h>
> +int main(int argc, const char *argv[])
> +{
> +    return fuse_main(argc, argv, NULL);
> +}
> +EOF
> +  if compile_prog "$fuse_cflags" "$fuse_libs" ; then
> +    fuse=yes
> +    libs_tools="$fuse_libs $libs_tools"
> +  else
> +    if test "$fuse" = "yes" ; then
> +      feature_not_found "FUSE"
> +    fi
> +    fuse=no
> +  fi
> +fi
> +
> +##########################################
> +# blkid_partlist probe
> +blkid_cflags=`pkg-config --cflags blkid 2> /dev/null`
> +blkid_libs=`pkg-config --libs blkid 2> /dev/null`
> +cat > $TMPC <<EOF
> +#include <blkid.h>
> +int main(void) { blkid_partlist ls; return 0; }
> +EOF
> +blkid_partlist=no
> +if compile_prog "$blkid_cflags" "$blkid_libs" ; then
> +  blkid_partlist=yes
> +  libs_tools="$blkid_libs $libs_tools"
> +fi
> +
> +
>  #
>  # Check for xxxat() functions when we are building linux-user
>  # emulator. This is done because older glibc versions don't
> @@ -1962,6 +2007,8 @@ echo "PIE user targets $user_pie"
>  echo "vde support $vde"
>  echo "IO thread $io_thread"
>  echo "Linux AIO support $linux_aio"
> +echo "FUSE support $fuse"
> +echo "partlist support $blkid_partlist"
>  echo "Install blobs $blobs"
>  echo "KVM support $kvm"
>  echo "fdt support $fdt"
> @@ -2183,6 +2230,14 @@ fi
>  if test "$fdatasync" = "yes" ; then
>    echo "CONFIG_FDATASYNC=y" >> $config_host_mak
>  fi
> +if test "$fuse" = "yes" ; then
> +  echo "CONFIG_FUSE=y" >> $config_host_mak
> +  echo "FUSE_CFLAGS=$fuse_cflags" >> $config_host_mak
> +fi
> +if test "$blkid_partlist" = "yes" ; then
> +  echo "CONFIG_BLKID_PARTLIST=y" >> $config_host_mak
> +  echo "BLKID_CFLAGS=$blkid_cflags" >> $config_host_mak
> +fi
>
>  # XXX: suppress that
>  if [ "$bsd" = "yes" ] ; then
> diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
> index f96876a..94c6e66 100644
> --- a/qemu-img-cmds.hx
> +++ b/qemu-img-cmds.hx
> @@ -49,5 +49,16 @@ DEF("rebase", img_rebase,
>      "rebase [-f fmt] [-u] -b backing_file [-F backing_fmt] filename")
>  STEXI
>  @item rebase [-f @var{fmt}] [-u] -b @var{backing_file} [-F @var{backing_fmt}] @var{filename}
> +ETEXI
> +
> +#ifdef CONFIG_FUSE
> +DEF("map", img_map,
> +    "map [-f fmt] [<FUSE options>] filename mountpoint")
> +#endif
> +STEXI
> +@item map [@var{FUSE options}] @var{filename} @var{mountpoint}
> +ETEXI
> +
> +STEXI
>  @end table
>  ETEXI
> diff --git a/qemu-img-map.c b/qemu-img-map.c
> new file mode 100644
> index 0000000..cd6bbf4
> --- /dev/null
> +++ b/qemu-img-map.c
> @@ -0,0 +1,438 @@
> +/*
> + * QEMU disk image utility
> + *
> + * Copyright (c) 2010 Jan Kiszka
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#include "qemu-img.h"
> +#include "qemu-option.h"
> +#include "osdep.h"
> +#include "block_int.h"
> +#include <stdio.h>
> +#include <getopt.h>
> +#include <pthread.h>
> +#include <signal.h>
> +
> +#define FUSE_USE_VERSION 28
> +#include <fuse.h>
> +
> +#ifdef CONFIG_LINUX
> +#include <linux/fs.h>
> +#endif
> +
> +#define ENTRY_INVALID 1
> +#define ENTRY_DIRTY 2
> +
> +#define ENTRY_PATH_MAX 16
> +
> +struct map_entry {
> +    struct map_entry *next;
> +    const char *path;
> +    size_t size;
> +    off_t offset;
> +    unsigned int use_counter;
> +    unsigned int flags;
> +};
> +
> +static struct stat img_stat;
> +static BlockDriverState *img_bs;
> +static struct map_entry disk_entry = { .path = "/disk" };
> +static char *disk_path;
> +
> +#ifdef CONFIG_BLKID_PARTLIST
> +
> +#include <blkid.h>
> +
> +static pthread_t reader_thread;
> +static sigset_t wakeup_sigset;
> +static pthread_mutex_t entry_lock = PTHREAD_MUTEX_INITIALIZER;
> +static struct map_entry *last_entry = &disk_entry;
> +
> +static void *partition_reader(void *unused)
> +{
> +    struct map_entry *entry;
> +    blkid_partition par;
> +    blkid_partlist ls;
> +    blkid_probe pr;
> +    int nparts, i;
> +    char *path;
> +
> +    while (sigwaitinfo(&wakeup_sigset, NULL) >= 0) {
> +        pr = blkid_new_probe_from_filename(disk_path);
> +        if (!pr) {
> +            continue;
> +        }
> +
> +        ls = blkid_probe_get_partitions(pr);
> +        if (!ls) {
> +            blkid_free_probe(pr);
> +            continue;
> +        }
> +
> +        nparts = blkid_partlist_numof_partitions(ls);
> +
> +        for (i = 0; i < nparts; i++) {
> +            entry = calloc(1, sizeof(*entry));
> +            if (!entry) {
> +                continue;
> +            }
> +            path = malloc(ENTRY_PATH_MAX);
> +            if (!path) {
> +                free(entry);
> +                continue;
> +            }
> +
> +            par = blkid_partlist_get_partition(ls, i);
> +
> +            snprintf(path, ENTRY_PATH_MAX, "/partition%d",
> +                     blkid_partition_get_partno(par));
> +            entry->path = path;
> +            entry->size = blkid_partition_get_size(par) * BDRV_SECTOR_SIZE;
> +            entry->offset = blkid_partition_get_start(par) * BDRV_SECTOR_SIZE;
> +
> +            pthread_mutex_lock(&entry_lock);
> +
> +            last_entry->next = entry;
> +            last_entry = entry;
> +
> +            pthread_mutex_unlock(&entry_lock);
> +        }
> +
> +        blkid_free_probe(pr);
> +    }
> +
> +    return NULL;
> +}
> +
> +static void update_partitions(void)
> +{
> +    struct map_entry *entry = disk_entry.next;
> +    struct map_entry *old;
> +
> +    /* release old partions */
> +    pthread_mutex_lock(&entry_lock);
> +
> +    while (entry) {
> +        old = entry;
> +        entry = entry->next;
> +        if (old->use_counter == 0) {
> +            free((void *)old->path);
> +            free(old);
> +        } else {
> +            old->flags = ENTRY_INVALID;
> +        }
> +    }
> +
> +    disk_entry.next = NULL;
> +    last_entry = &disk_entry;
> +
> +    disk_entry.flags &= ~ENTRY_DIRTY;
> +
> +    pthread_mutex_unlock(&entry_lock);
> +
> +    /* kick off partition table scan */
> +    pthread_kill(reader_thread, SIGUSR1);
> +}
> +
> +static void init_reader_thread(void)
> +{
> +    sigemptyset(&wakeup_sigset);
> +    sigaddset(&wakeup_sigset, SIGUSR1);
> +    sigprocmask(SIG_BLOCK, &wakeup_sigset, NULL);
> +
> +    if (pthread_create(&reader_thread, NULL, partition_reader, NULL)) {
> +        error("Could not spawn partition reader thread");
> +    }
> +}
> +
> +#else /* !CONFIG_BLKID_PARTLIST */
> +
> +static inline void update_partitions(void) { }
> +static inline void init_reader_thread(void) { }
> +
> +#endif /* !CONFIG_BLKID_PARTLIST */
> +
> +static struct map_entry *find_map_entry(const char *path)
> +{
> +    struct map_entry *entry = &disk_entry;
> +
> +    do {
> +        if (strcmp(entry->path, path) == 0) {
> +            break;
> +        }
> +        entry = entry->next;
> +    } while (entry);
> +
> +    return entry;
> +}
> +
> +static void *map_init(struct fuse_conn_info *conn)
> +{
> +    init_reader_thread();
> +    update_partitions();
> +    return NULL;
> +}
> +
> +static int map_getattr(const char *path, struct stat *stbuf)
> +{
> +    struct map_entry *entry;
> +    int res = 0;
> +
> +    memset(stbuf, 0, sizeof(struct stat));
> +    stbuf->st_uid = img_stat.st_uid;
> +    stbuf->st_gid = img_stat.st_gid;
> +    stbuf->st_atime = img_stat.st_atime;
> +    stbuf->st_mtime = img_stat.st_mtime;
> +    stbuf->st_ctime = img_stat.st_ctime;
> +
> +    if (strcmp(path, "/") == 0) {
> +        stbuf->st_mode = S_IFDIR | 0111 | img_stat.st_mode;
> +        stbuf->st_nlink = 2;
> +    } else {
> +        entry = find_map_entry(path);
> +        if (entry) {
> +            stbuf->st_mode = S_IFREG | img_stat.st_mode;
> +            stbuf->st_nlink = 1;
> +            stbuf->st_size = entry->size;
> +        } else {
> +            res = -ENOENT;
> +        }
> +    }
> +
> +    return res;
> +}
> +
> +static int map_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
> +                       off_t offset, struct fuse_file_info *fi)
> +{
> +    struct map_entry *entry;
> +
> +    if (strcmp(path, "/") != 0) {
> +        return -ENOENT;
> +    }
> +    filler(buf, ".", NULL, 0);
> +    filler(buf, "..", NULL, 0);
> +    for (entry = &disk_entry; entry; entry = entry->next) {
> +        filler(buf, entry->path+1, NULL, 0);
> +    }
> +
> +    return 0;
> +}
> +
> +static int map_open(const char *path, struct fuse_file_info *fi)
> +{
> +    struct map_entry *entry = find_map_entry(path);
> +
> +    if (!entry) {
> +        return -ENOENT;
> +    }
> +
> +    entry->use_counter++;
> +    fi->fh = (uint64_t)entry;
> +
> +    return 0;
> +}
> +
> +static int map_release(const char *path, struct fuse_file_info *fi)
> +{
> +    struct map_entry *entry = (struct map_entry *)fi->fh;
> +
> +    entry->use_counter--;
> +
> +    if (entry == &disk_entry && entry->flags & ENTRY_DIRTY) {
> +        update_partitions();
> +    }
> +    if (entry->flags & ENTRY_INVALID && entry->use_counter == 0) {
> +        free((void *)entry->path);
> +        free(entry);
> +    }
> +
> +    return 0;
> +}
> +
> +static int map_read(const char *path, char *buf, size_t size, off_t offset,
> +                    struct fuse_file_info *fi)
> +{
> +    struct map_entry *entry = (struct map_entry *)fi->fh;
> +    int err;
> +
> +    if (entry->flags & ENTRY_INVALID) {
> +        return -ENOENT;
> +    }
> +
> +    if (offset + size > entry->size) {
> +        size = entry->size - offset;
> +    }
> +
> +    err = bdrv_read(img_bs, (entry->offset + offset) / BDRV_SECTOR_SIZE,
> +                    (uint8_t*)buf, size / BDRV_SECTOR_SIZE);
> +    if (err) {
> +        return err;
> +    }
> +
> +    return size;
> +}
> +
> +static int map_write(const char *path, const char *buf, size_t size,
> +                     off_t offset, struct fuse_file_info *fi)
> +{
> +    struct map_entry *entry = (struct map_entry *)fi->fh;
> +    int err;
> +
> +    if (entry->flags & ENTRY_INVALID) {
> +        return -ENOENT;
> +    }
> +
> +    err = bdrv_write(img_bs, (entry->offset + offset) / BDRV_SECTOR_SIZE,
> +                     (uint8_t*)buf, size / BDRV_SECTOR_SIZE);
> +    if (err) {
> +        return err;
> +    }
> +
> +    entry->flags |= ENTRY_DIRTY;
> +
> +    return size;
> +}
> +
> +#if FUSE_VERSION >= 28
> +static int map_ioctl(const char *path, int cmd, void *arg,
> +                     struct fuse_file_info *fi, unsigned int flags, void *data)
> +{
> +    struct map_entry *entry = (struct map_entry *)fi->fh;
> +
> +    if (entry->flags & ENTRY_INVALID) {
> +        return -ENOENT;
> +    }
> +
> +    switch (cmd) {
> +#ifdef CONFIG_LINUX
> +    case BLKGETSIZE64:
> +        *(uint64_t *)data = entry->size;
> +        return 0;
> +#endif /* CONFIG_LINUX */
> +    default:
> +        return -ENOTTY;
> +    }
> +}
> +#endif /* FUSE_VERSION >= 28 */
> +
> +static struct fuse_operations map_ops = {
> +    .init = map_init,
> +    .getattr = map_getattr,
> +    .readdir = map_readdir,
> +    .open = map_open,
> +    .release = map_release,
> +    .read = map_read,
> +    .write = map_write,
> +#if FUSE_VERSION >= 28
> +    .ioctl = map_ioctl,
> +#endif
> +};
> +
> +static void QEMU_NORETURN map_help(struct fuse_args *args)
> +{
> +    printf("usage: qemu-img map [-F fmt] [FUSE options] filename mountpoint\n"
> +           "\ngeneral options:\n"
> +           " -o opt,[opt...] mount options\n"
> +           " -h --help print help\n"
> +           " -V --version print version\n"
> +           "\nqemu-img options:\n"
> +           " -F fmt image format\n\n");
> +    fuse_opt_add_arg(args, "-ho");
> +    fuse_main(args->argc, args->argv, &map_ops, NULL);
> +    exit(1);
> +}
> +
> +int img_map(int argc, char **argv)
> +{
> +    struct fuse_args args = FUSE_ARGS_INIT(0, NULL);
> +    const char *filename = NULL;
> +    const char *fmt = NULL;
> +    const char *mountpoint;
> +    char *fs_name;
> +    uint64_t size;
> +
> +    fuse_opt_add_arg(&args, argv[0]);
> +    fuse_opt_add_arg(&args, "-o");
> +    fuse_opt_add_arg(&args, "subtype=qemu-img-map");
> +
> +    /* block layer is not thread-safe */
> +    fuse_opt_add_arg(&args, "-s");
> +
> +    for (;;) {
> +        static const struct option long_opts[] = {
> +            { "--help", 0, NULL, 'h' },
> +            { "--version", 0, NULL, 'v' },
> +            { NULL, 0, NULL, 0 }
> +        };
> +        int c;
> +
> +        c = getopt_long(argc, argv, "F:dfsho:", long_opts, NULL);
> +        if (c < 0) {
> +            break;
> +        }
> +        switch (c) {
> +        case 'h':
> +            map_help(&args);
> +            break;
> +        case 'F':
> +            fmt = optarg;
> +            break;
> +        case 'o':
> +            fuse_opt_add_arg(&args, "-o");
> +            fuse_opt_add_arg(&args, optarg);
> +            break;
> +        case 'd':
> +            fuse_opt_add_arg(&args, "-d");
> +            break;
> +        case 'f':
> +            fuse_opt_add_arg(&args, "-f");
> +            break;
> +        default:
> +            /* ignore -s, we enforce it anyway */
> +            break;
> +        }
> +    }
> +    if (optind + 1 >= argc) {
> +        map_help(&args);
> +    }
> +
> +    filename = argv[optind++];
> +
> +    size = strlen(filename) + 8;
> +    fs_name = malloc(size);
> +    if (!fs_name) {
> +        error("Not enough memory");
> +    }
> +    snprintf(fs_name, size, "fsname=%s", filename);
> +    fuse_opt_insert_arg(&args, 1, "-o");
> +    fuse_opt_insert_arg(&args, 2, fs_name);
> +    free(fs_name);
> +
> +    mountpoint = argv[optind];
> +    fuse_opt_add_arg(&args, mountpoint);
> +
> +    size = strlen(mountpoint) + strlen(disk_entry.path) + 1;
> +    disk_path = malloc(size);
> +    if (!disk_path) {
> +        error("Not enough memory");
> +    }
> +    snprintf(disk_path, size, "%s%s", mountpoint, disk_entry.path);
> +
> +    if (stat(filename, &img_stat) < 0) {
> +        perror("Unable to process image file");
> +        exit(1);
> +    }
> +    img_stat.st_mode &= S_IRWXU | S_IRWXG | S_IRWXO;
> +
> +    img_bs = bdrv_new_open(filename, fmt, 0);
> +    if (!img_bs) {
> +        error("Could not open '%s'", filename);
> +    }
> +    bdrv_get_geometry(img_bs, &size);
> +    disk_entry.size = size * BDRV_SECTOR_SIZE;
> +
> +    return fuse_main(args.argc, args.argv, &map_ops, NULL);
> +}
> diff --git a/qemu-img.c b/qemu-img.c
> index 9b28664..28b8427 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -21,7 +21,7 @@
>   * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>   * THE SOFTWARE.
>   */
> -#include "qemu-common.h"
> +#include "qemu-img.h"
>  #include "qemu-option.h"
>  #include "osdep.h"
>  #include "block_int.h"
> @@ -39,7 +39,7 @@ typedef struct img_cmd_t {
>  /* Default to cache=writeback as data integrity is not important for qemu-tcg. */
>  #define BRDV_O_FLAGS BDRV_O_CACHE_WB
>
> -static void QEMU_NORETURN error(const char *fmt, ...)
> +void QEMU_NORETURN error(const char *fmt, ...)
>  {
>      va_list ap;
>      va_start(ap, fmt);
> @@ -97,6 +97,9 @@ static void help(void)
>      printf("%s\nSupported formats:", help_msg);
>      bdrv_iterate_format(format_print, NULL);
>      printf("\n");
> +#ifdef CONFIG_FUSE
> +    printf("\nInvoke 'qemu-img map --help' to list FUSE options.\n");
> +#endif
>      exit(1);
>  }
>
> @@ -188,9 +191,9 @@ static int read_password(char *buf, int buf_size)
>  }
>  #endif
>
> -static BlockDriverState *bdrv_new_open(const char *filename,
> -                                       const char *fmt,
> -                                       int readonly)
> +BlockDriverState *bdrv_new_open(const char *filename,
> +                                const char *fmt,
> +                                int readonly)
>  {
>      BlockDriverState *bs;
>      BlockDriver *drv;
> diff --git a/qemu-img.h b/qemu-img.h
> new file mode 100644
> index 0000000..1bf0f27
> --- /dev/null
> +++ b/qemu-img.h
> @@ -0,0 +1,13 @@
> +#ifndef QEMU_IMG_H
> +#define QEMU_IMG_H
> +
> +#include "qemu-common.h"
> +
> +void QEMU_NORETURN error(const char *fmt, ...);
> +BlockDriverState *bdrv_new_open(const char *filename,
> +                                const char *fmt,
> +                                int readonly);
> +
> +int img_map(int argc, char **argv);
> +
> +#endif
> diff --git a/qemu-img.texi b/qemu-img.texi
> index ac97854..a85f454 100644
> --- a/qemu-img.texi
> +++ b/qemu-img.texi
> @@ -106,6 +106,16 @@ they are displayed too.
>  @item snapshot [-l | -a @var{snapshot} | -c @var{snapshot} | -d @var{snapshot} ] @var{filename}
>
>  List, apply, create or delete snapshots in image @var{filename}.
> +
> +@item map [-F @var{fmt}] [@var{FUSE options}] @var{filename} @var{mountpoint}
> +
> +Make a disk image accessible via pseudo devices under @var{mountpoint}. This
> +command will expose the whole raw image as well as individual partitions, the
> +latter depending on the parsing capabilies of libblkid. The exposed disk
> +device file can be passed to partitioning tools, and any device file containing
> +a valid filesystem can be loop-back mounted to access its content (e.g. via
> +mountlo without any root privileges). For the full list of FUSE-related
> +options, invoke @code{qemu-img map --help}.
>  @end table
>
>  Supported image file formats:
>

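A note on the read/write path in the patch above: `map_read` and `map_write` turn a FUSE byte request into a sector request by integer division with `BDRV_SECTOR_SIZE`, which appears to truncate requests that are not sector-aligned. The following is a minimal sketch of the arithmetic involved, not code from the patch. `SECTOR_SIZE` (assumed to be 512), `struct sector_span`, and `byte_range_to_sectors` are hypothetical names introduced for illustration only.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u /* assumed; stands in for BDRV_SECTOR_SIZE */

/* Hypothetical result type: the sectors covering a byte request, plus
 * where the requested bytes start inside the first sector. */
struct sector_span {
    uint64_t first_sector; /* first sector covering the request */
    uint64_t nb_sectors;   /* number of sectors covering the request */
    uint32_t skip;         /* bytes to skip in the first sector */
    uint64_t length;       /* bytes available after clamping */
};

/* Map the byte range [offset, offset + size) of an entry that is
 * entry_size bytes long and starts entry_offset bytes into the image
 * onto whole sectors. Requests reaching past the entry are clamped. */
static struct sector_span byte_range_to_sectors(uint64_t entry_offset,
                                                uint64_t entry_size,
                                                uint64_t offset,
                                                uint64_t size)
{
    struct sector_span span = { 0, 0, 0, 0 };
    uint64_t start, end;

    if (offset >= entry_size || size == 0) {
        return span; /* nothing to transfer */
    }
    if (size > entry_size - offset) {
        size = entry_size - offset; /* clamp to the end of the entry */
    }

    start = entry_offset + offset;
    end = start + size; /* exclusive end, in bytes */

    span.first_sector = start / SECTOR_SIZE;
    span.skip = (uint32_t)(start % SECTOR_SIZE);
    /* Round the end up so that a partial tail sector is included. */
    span.nb_sectors = (end + SECTOR_SIZE - 1) / SECTOR_SIZE - span.first_sector;
    span.length = size;
    return span;
}
```

A caller would read `nb_sectors` sectors into a bounce buffer and copy `length` bytes starting at `skip`. Whether the patch should handle unaligned requests this way is a design question for its author; the sketch only shows the rounding.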
Anthony Liguori wrote:
> On 03/25/2010 12:52 PM, Jan Kiszka wrote:
>> This adds the "map" subcommand to qemu-img. [...]
>
> This has been proposed quite a few times.
>
> In fact, I wrote something like this prior to implementing qemu-nbd.
>
> The problem with fuse is that, as configured by default, you can't actually
> enter a fuse filesystem as root, and since you need to be root to
> loopback mount it, it's pretty nasty from a usability perspective.

You don't, see mountlo.

> So why did you go the fuse route instead of using qemu-nbd?

Mostly usability. It's really straightforward to stack mountlo on top of
the mapped image. And you can run (almost) all the filesystem and
partitioning tools.

Moreover, blkid_partlist provides a more complete partition parser than
rolling your own version - which I briefly considered and then quickly
dropped after looking at some implementations.

Jan

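For context on what "rolling your own version" involves: the classic MBR layout keeps four 16-byte primary partition entries at byte offset 446 of sector 0, followed by the signature bytes 0x55 0xAA. The sketch below parses only that much. It is an illustration, not code from the patch or from qemu-nbd; the names `mbr_part`, `le32`, and `parse_mbr` are hypothetical, and it deliberately ignores extended partitions, GPT, and other label types that a library such as libblkid handles.

```c
#include <stdint.h>

#define MBR_TABLE_OFFSET 446 /* start of the primary partition table */
#define MBR_ENTRY_SIZE   16
#define MBR_NUM_ENTRIES  4

struct mbr_part {
    uint8_t  type;         /* partition type byte, 0 = unused slot */
    uint32_t start_sector; /* first sector (LBA) */
    uint32_t nb_sectors;   /* length in sectors */
};

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse the four primary entries of a 512-byte MBR sector.
 * Returns the number of used entries stored in parts[],
 * or -1 if the boot signature is missing. */
static int parse_mbr(const uint8_t sector[512],
                     struct mbr_part parts[MBR_NUM_ENTRIES])
{
    int i, count = 0;

    if (sector[510] != 0x55 || sector[511] != 0xAA) {
        return -1;
    }
    for (i = 0; i < MBR_NUM_ENTRIES; i++) {
        const uint8_t *e = sector + MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE;

        if (e[4] == 0) {
            continue; /* empty slot */
        }
        parts[count].type = e[4];
        parts[count].start_sector = le32(e + 8);
        parts[count].nb_sectors = le32(e + 12);
        count++;
    }
    return count;
}
```

Everything beyond these four slots (logical partitions chained through extended entries, protective MBRs in front of a GPT, and so on) is where hand-written parsers tend to grow, which is presumably the gap Jan refers to.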
On 03/25/2010 04:46 PM, Jan Kiszka wrote:
> Anthony Liguori wrote:
>> On 03/25/2010 12:52 PM, Jan Kiszka wrote:
>>> This adds the "map" subcommand to qemu-img. [...]
>>
>> This has been proposed quite a few times.
>>
>> In fact, I wrote something like this prior to implementing qemu-nbd.
>>
>> The problem with fuse is that, as configured by default, you can't actually
>> enter a fuse filesystem as root, and since you need to be root to
>> loopback mount it, it's pretty nasty from a usability perspective.
>
> You don't, see mountlo.

That definitely changes things. I assume it just uses libe2fs et al to
display filesystem contents?

Does it preserve ownership?

You still can't do things as root, I take it, which is problematic.

>> So why did you go the fuse route instead of using qemu-nbd?
>
> Mostly usability. It's really straightforward to stack mountlo on top of
> the mapped image. And you can run (almost) all the filesystem and
> partitioning tools.
>
> Moreover, blkid_partlist provides a more complete partition parser than
> rolling your own version

We do in qemu-nbd and in qemu (to guess disk geometry).

Regards,

Anthony Liguori

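On the geometry point: the patch description notes that without BLKGETSIZE64 the cylinder count has to be passed to fdisk explicitly ("-C n"). As a rough sketch of that arithmetic, the helper below assumes the conventional translated geometry of 255 heads and 63 sectors per track with 512-byte sectors. That geometry is an assumption made here for illustration, not something the thread specifies, and `cylinders_for_size` is a hypothetical name.

```c
#include <stdint.h>

/* Assumed legacy CHS translation; the thread does not name a geometry. */
#define GEO_HEADS       255u
#define GEO_SECTORS     63u
#define GEO_SECTOR_SIZE 512u

/* Number of whole cylinders that fit into an image of size_bytes bytes. */
static uint64_t cylinders_for_size(uint64_t size_bytes)
{
    uint64_t bytes_per_cylinder =
        (uint64_t)GEO_HEADS * GEO_SECTORS * GEO_SECTOR_SIZE;

    return size_bytes / bytes_per_cylinder;
}
```

With these assumptions one cylinder is 8225280 bytes, so a 1 GiB image yields 130 whole cylinders.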
Anthony Liguori wrote:
> On 03/25/2010 04:46 PM, Jan Kiszka wrote:
>> Anthony Liguori wrote:
>>> On 03/25/2010 12:52 PM, Jan Kiszka wrote:
>>>> This adds the "map" subcommand to qemu-img. [...]
>>>
>>> This has been proposed quite a few times.
>>>
>>> In fact, I wrote something like this prior to implementing qemu-nbd.
>>>
>>> The problem with fuse is that, as configured by default, you can't actually
>>> enter a fuse filesystem as root, and since you need to be root to
>>> loopback mount it, it's pretty nasty from a usability perspective.
>>
>> You don't, see mountlo.
>
> That definitely changes things. I assume it just uses libe2fs et al to
> display filesystem contents?

Nope. It's a bit like libguestfs as it uses Linux to access the
filesystems, but that Linux runs in UML mode, thus does not require any
qemu/kvm underneath. It simply maps the FUSE requests onto the corresponding
VFS services in the UML kernel.

> Does it preserve ownership?

Yep.

> You still can't do things as root, I take it, which is problematic.

At least my default config does not prevent running qemu-img map as root
and then performing a classic "mount -o loop" on the partitions it
provides. Or what do you mean?

What mountlo is lacking (at least so far) are things like LVM or
soft-RAID. There were some posts on the fuse lists announcing work on it,
but that dates back 2 years without any code traces. But if this path
turns out to be useful for us (or libguestfs), I guess that should be easy
to add.

Jan

On 03/25/2010 05:27 PM, Jan Kiszka wrote:
> Anthony Liguori wrote:
>> On 03/25/2010 04:46 PM, Jan Kiszka wrote:
>>> Anthony Liguori wrote:
>>>> On 03/25/2010 12:52 PM, Jan Kiszka wrote:
>>>>> This adds the "map" subcommand to qemu-img. [...]
>>>>
>>>> This has been proposed quite a few times.
>>>>
>>>> In fact, I wrote something like this prior to implementing qemu-nbd.
>>>>
>>>> The problem with fuse is that, as configured by default, you can't actually
>>>> enter a fuse filesystem as root, and since you need to be root to
>>>> loopback mount it, it's pretty nasty from a usability perspective.
>>>
>>> You don't, see mountlo.
>>
>> That definitely changes things. I assume it just uses libe2fs et al to
>> display filesystem contents?
>
> Nope. It's a bit like libguestfs as it uses Linux to access the
> filesystems, but that Linux runs in UML mode, thus does not require any
> qemu/kvm underneath. It simply maps the FUSE requests onto the corresponding
> VFS services in the UML kernel.
>
>> Does it preserve ownership?
>
> Yep.
>
>> You still can't do things as root, I take it, which is problematic.
>
> At least my default config does not prevent running qemu-img map as root
> and then performing a classic "mount -o loop" on the partitions it
> provides. Or what do you mean?

You need user_allow_other set in /etc/fuse.conf, which isn't set by default.

Regards,

Anthony Liguori

Anthony Liguori wrote:
> On 03/25/2010 05:27 PM, Jan Kiszka wrote:
>> Anthony Liguori wrote:
>>> On 03/25/2010 04:46 PM, Jan Kiszka wrote:
>>>> Anthony Liguori wrote:
>>>>> On 03/25/2010 12:52 PM, Jan Kiszka wrote:
>>>>>> This adds the "map" subcommand to qemu-img. [...]
>>>>>
>>>>> This has been proposed quite a few times.
>>>>>
>>>>> In fact, I wrote something like this prior to implementing qemu-nbd.
>>>>>
>>>>> The problem with fuse is that, as configured by default, you can't
>>>>> actually enter a fuse filesystem as root, and since you need to be
>>>>> root to loopback mount it, it's pretty nasty from a usability
>>>>> perspective.
>>>>
>>>> You don't, see mountlo.
>>>
>>> That definitely changes things. I assume it just uses libe2fs et al to
>>> display filesystem contents?
>>
>> Nope. It's a bit like libguestfs as it uses Linux to access the
>> filesystems, but that Linux runs in UML mode, thus does not require any
>> qemu/kvm underneath. It simply maps the FUSE requests onto the corresponding
>> VFS services in the UML kernel.
>>
>>> Does it preserve ownership?
>>
>> Yep.
>>
>>> You still can't do things as root, I take it, which is problematic.
>>
>> At least my default config does not prevent running qemu-img map as root
>> and then performing a classic "mount -o loop" on the partitions it
>> provides. Or what do you mean?
>
> You need user_allow_other set in /etc/fuse.conf, which isn't set by default.

I don't see the need for sharing the mount. Either you are root, then you
can do this anyway. Or you are a normal user, and then the vision is that
you can do everything you need for setting up and maintaining guest images
without ever becoming root.

We aren't completely there yet. E.g., the Linux kernel blocks mknod of
devices although FUSE filesystems are automatically mounted with nodev.
But that should be fixable as well.

I think this approach already covers the majority of use cases of
manipulating guest images as a normal user, and that without requiring more
than 500 lines of code here plus the external mountlo tool.

Jan

On Thu, Mar 25, 2010 at 06:52:59PM +0100, Jan Kiszka wrote:
> This adds the "map" subcommand to qemu-img. It is able to expose the raw
> content of a disk image via a FUSE filesystem. Both the whole disk can
> be accessed, e.g. to run partitioning tools against it, as well as
> individual partitions. This allows to create new filesystems in the
> image or loop-back mount exiting ones. Using the great mountlo tool
> from the FUSE collection [1][2], the latter can even be done by non-root
> users (the former anyway).

Is there a good reason to throw this into qemu-img instead of making
a separate qemu-fuse or similar tool? It's doing something quite
different than the rest of qemu-img.
Christoph Hellwig wrote:
> On Thu, Mar 25, 2010 at 06:52:59PM +0100, Jan Kiszka wrote:
>> This adds the "map" subcommand to qemu-img. It is able to expose the raw
>> content of a disk image via a FUSE filesystem. Both the whole disk can
>> be accessed, e.g. to run partitioning tools against it, as well as
>> individual partitions. This allows to create new filesystems in the
>> image or loop-back mount exiting ones. Using the great mountlo tool
>> from the FUSE collection [1][2], the latter can even be done by non-root
>> users (the former anyway).
>
> Is there a good reason to throw this into qemu-img instead of making
> a separate qemu-fuse or similar tool? It's doing something quite
> different than the rest of qemu-img.
>

qemu-img is the Swiss Army knife for QEMU disk image manipulation (like git
is for everything around a git repository). So, IMHO, mapping the image
content into the host filesystem for further manipulation with standard
tools belongs to this.

If the "map" thing works out for most users, I could even imagine some
helper sub-command "mount" that encapsulates map and mountlo (or some
other unprivileged mounting mechanism). This should make it easier for
users to explore all possibilities they have when working with disk images.

Jan
On 29.03.2010, at 09:46, Jan Kiszka wrote:

> Christoph Hellwig wrote:
>> On Thu, Mar 25, 2010 at 06:52:59PM +0100, Jan Kiszka wrote:
>>> This adds the "map" subcommand to qemu-img. It is able to expose the raw
>>> content of a disk image via a FUSE filesystem. Both the whole disk can
>>> be accessed, e.g. to run partitioning tools against it, as well as
>>> individual partitions. This allows to create new filesystems in the
>>> image or loop-back mount exiting ones. Using the great mountlo tool
>>> from the FUSE collection [1][2], the latter can even be done by non-root
>>> users (the former anyway).
>>
>> Is there a good reason to throw this into qemu-img instead of making
>> a separate qemu-fuse or similar tool? It's doing something quite
>> different than the rest of qemu-img.
>>
>
> qemu-img is the swiss knife for QEMU disk image manipulation (like git
> is for everything around a git repository). So, IHMO, mapping the image
> content into the host filesystem for further manipulation with standard
> tools belongs to this.
>
> If the "map" thing works out for most users, I could even imagine some
> helper sub-command "mount" that encapsulates map and mountlo (or some
> other unprivileged mounting mechanism). This should make it easier for
> users to explore all possibilities they have when working with disk images.

We also have a tool called "qemu-ext2" lying around that allows you to
explore ext2 based file system contents in any qemu block layer supported
backend.

IMHO the best move to do here (Anthony's idea) is to somehow get the full
block layer into a library, move it out of qemu into a separate project and
allow other tools in there too.

That move would vastly improve the situation of distributions too. I don't
want to have a qemu-img each coming from the Xen, KVM and Qemu packages. One
is enough :-). And it could enable block layer experienced people to be the
project maintainers, making that more valuable.

Alex
Alexander Graf wrote:
> On 29.03.2010, at 09:46, Jan Kiszka wrote:
>
>> Christoph Hellwig wrote:
>>> On Thu, Mar 25, 2010 at 06:52:59PM +0100, Jan Kiszka wrote:
>>>> This adds the "map" subcommand to qemu-img. It is able to expose the raw
>>>> content of a disk image via a FUSE filesystem. Both the whole disk can
>>>> be accessed, e.g. to run partitioning tools against it, as well as
>>>> individual partitions. This allows to create new filesystems in the
>>>> image or loop-back mount exiting ones. Using the great mountlo tool
>>>> from the FUSE collection [1][2], the latter can even be done by non-root
>>>> users (the former anyway).
>>> Is there a good reason to throw this into qemu-img instead of making
>>> a separate qemu-fuse or similar tool? It's doing something quite
>>> different than the rest of qemu-img.
>>>
>> qemu-img is the swiss knife for QEMU disk image manipulation (like git
>> is for everything around a git repository). So, IHMO, mapping the image
>> content into the host filesystem for further manipulation with standard
>> tools belongs to this.
>>
>> If the "map" thing works out for most users, I could even imagine some
>> helper sub-command "mount" that encapsulates map and mountlo (or some
>> other unprivileged mounting mechanism). This should make it easier for
>> users to explore all possibilities they have when working with disk images.
>
> We also have a tool called "qemu-ext2" lying around that allows you to explore ext2 based file system contents in any qemu block layer supported backend.

"we" == SUSE?

[ Wow - just typed "qemu-ext2" into Big Brother's search bar and found
the very same mail I'm just replying to. That's fast. ]

>
> IMHO the best move to do here (Anthony's idea) is to somehow get the full block layer into a library, move it out of qemu into a separate project and allow other tools in there too.
>
> That move would vastly improve the situation of distributions too. I don't want to have a qemu-img each coming from the Xen, KVM and Qemu packages. One is enough :-). And it could enable block layer experienced people to be the project maintainers, making that more valuable.
>

Full ack.

Jan
On 29.03.2010, at 11:37, Jan Kiszka wrote:

> Alexander Graf wrote:
>> On 29.03.2010, at 09:46, Jan Kiszka wrote:
>>
>>> Christoph Hellwig wrote:
>>>> On Thu, Mar 25, 2010 at 06:52:59PM +0100, Jan Kiszka wrote:
>>>>> This adds the "map" subcommand to qemu-img. It is able to expose the raw
>>>>> content of a disk image via a FUSE filesystem. Both the whole disk can
>>>>> be accessed, e.g. to run partitioning tools against it, as well as
>>>>> individual partitions. This allows to create new filesystems in the
>>>>> image or loop-back mount exiting ones. Using the great mountlo tool
>>>>> from the FUSE collection [1][2], the latter can even be done by non-root
>>>>> users (the former anyway).
>>>> Is there a good reason to throw this into qemu-img instead of making
>>>> a separate qemu-fuse or similar tool? It's doing something quite
>>>> different than the rest of qemu-img.
>>>>
>>> qemu-img is the swiss knife for QEMU disk image manipulation (like git
>>> is for everything around a git repository). So, IHMO, mapping the image
>>> content into the host filesystem for further manipulation with standard
>>> tools belongs to this.
>>>
>>> If the "map" thing works out for most users, I could even imagine some
>>> helper sub-command "mount" that encapsulates map and mountlo (or some
>>> other unprivileged mounting mechanism). This should make it easier for
>>> users to explore all possibilities they have when working with disk images.
>>
>> We also have a tool called "qemu-ext2" lying around that allows you to explore ext2 based file system contents in any qemu block layer supported backend.
>
> "we" == SUSE?

"we" == "SUSE Studio" (in fact, Nat wrote it). It is GPL'ed, just not
released yet. As soon as there is a separate project with a broader scope
than just qemu for the block layer, I'll happily invest the time to clean it
up for upstream submission.

Alex
diff --git a/Makefile b/Makefile
index 57c354d..d5a1dae 100644
--- a/Makefile
+++ b/Makefile
@@ -126,10 +126,12 @@ bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS)
 
 ######################################################################
 
-qemu-img.o: qemu-img-cmds.h
+qemu-img.o: qemu-img.h qemu-img-cmds.h
 qemu-img.o qemu-tool.o qemu-nbd.o qemu-io.o: $(GENERATED_HEADERS)
 
-qemu-img$(EXESUF): qemu-img.o qemu-tool.o $(block-obj-y) $(qobject-obj-y)
+qemu-img-map.o: QEMU_CFLAGS += $(FUSE_CFLAGS) $(BLKID_CFLAGS)
+
+qemu-img$(EXESUF): $(qemu-img-y) $(block-obj-y) $(qobject-obj-y)
 
 qemu-nbd$(EXESUF): qemu-nbd.o qemu-tool.o $(block-obj-y) $(qobject-obj-y)
 
diff --git a/Makefile.objs b/Makefile.objs
index 281f7a6..8a651d2 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -207,3 +207,9 @@ libdis-$(CONFIG_PPC_DIS) += ppc-dis.o
 libdis-$(CONFIG_S390_DIS) += s390-dis.o
 libdis-$(CONFIG_SH4_DIS) += sh4-dis.o
 libdis-$(CONFIG_SPARC_DIS) += sparc-dis.o
+
+######################################################################
+# qemu-img
+
+qemu-img-y = qemu-img.o qemu-tool.o
+qemu-img-$(CONFIG_FUSE) += qemu-img-map.o
diff --git a/configure b/configure
index 6bc40a3..c84aaa9 100755
--- a/configure
+++ b/configure
@@ -263,6 +263,7 @@ vnc_tls=""
 vnc_sasl=""
 xen=""
 linux_aio=""
+fuse=""
 
 gprof="no"
 debug_tcg="no"
@@ -639,6 +640,10 @@ for opt do
   ;;
   --enable-linux-aio) linux_aio="yes"
   ;;
+  --disable-fuse) fuse="no"
+  ;;
+  --enable-fuse) fuse="yes"
+  ;;
   --enable-io-thread) io_thread="yes"
   ;;
   --disable-blobs) blobs="no"
@@ -801,6 +806,8 @@ echo " --disable-vde disable support for vde network"
 echo " --enable-vde enable support for vde network"
 echo " --disable-linux-aio disable Linux AIO support"
 echo " --enable-linux-aio enable Linux AIO support"
+echo " --disable-fuse disable support for FUSE in qemu-img"
+echo " --enable-fuse enable support for FUSE in qemu-img"
 echo " --enable-io-thread enable IO thread"
 echo " --disable-blobs disable installing provided firmware blobs"
 echo " --kerneldir=PATH look for kernel includes in PATH"
@@ -1586,6 +1593,44 @@ EOF
   fi
 fi
 
+##########################################
+# FUSE libraries probe
+if test "$fuse" != "no" ; then
+  fuse_cflags=`pkg-config --cflags fuse 2> /dev/null`
+  fuse_libs=`pkg-config --libs fuse 2> /dev/null`
+  cat > $TMPC << EOF
+#include <fuse.h>
+int main(int argc, const char *argv[])
+{
+    return fuse_main(argc, argv, NULL);
+}
+EOF
+  if compile_prog "$fuse_cflags" "$fuse_libs" ; then
+    fuse=yes
+    libs_tools="$fuse_libs $libs_tools"
+  else
+    if test "$fuse" = "yes" ; then
+      feature_not_found "FUSE"
+    fi
+    fuse=no
+  fi
+fi
+
+##########################################
+# blkid_partlist probe
+blkid_cflags=`pkg-config --cflags blkid 2> /dev/null`
+blkid_libs=`pkg-config --libs blkid 2> /dev/null`
+cat > $TMPC <<EOF
+#include <blkid.h>
+int main(void) { blkid_partlist ls; return 0; }
+EOF
+blkid_partlist=no
+if compile_prog "$blkid_cflags" "$blkid_libs" ; then
+  blkid_partlist=yes
+  libs_tools="$blkid_libs $libs_tools"
+fi
+
+
 #
 # Check for xxxat() functions when we are building linux-user
 # emulator. This is done because older glibc versions don't
@@ -1962,6 +2007,8 @@ echo "PIE user targets $user_pie"
 echo "vde support $vde"
 echo "IO thread $io_thread"
 echo "Linux AIO support $linux_aio"
+echo "FUSE support $fuse"
+echo "partlist support $blkid_partlist"
 echo "Install blobs $blobs"
 echo "KVM support $kvm"
 echo "fdt support $fdt"
@@ -2183,6 +2230,14 @@ fi
 if test "$fdatasync" = "yes" ; then
   echo "CONFIG_FDATASYNC=y" >> $config_host_mak
 fi
+if test "$fuse" = "yes" ; then
+  echo "CONFIG_FUSE=y" >> $config_host_mak
+  echo "FUSE_CFLAGS=$fuse_cflags" >> $config_host_mak
+fi
+if test "$blkid_partlist" = "yes" ; then
+  echo "CONFIG_BLKID_PARTLIST=y" >> $config_host_mak
+  echo "BLKID_CFLAGS=$blkid_cflags" >> $config_host_mak
+fi
 
 # XXX: suppress that
 if [ "$bsd" = "yes" ] ; then
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index f96876a..94c6e66 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -49,5 +49,16 @@ DEF("rebase", img_rebase,
     "rebase [-f fmt] [-u] -b backing_file [-F backing_fmt] filename")
 STEXI
 @item rebase [-f @var{fmt}] [-u] -b @var{backing_file} [-F @var{backing_fmt}] @var{filename}
+ETEXI
+
+#ifdef CONFIG_FUSE
+DEF("map", img_map,
+    "map [-f fmt] [<FUSE options>] filename mountpoint")
+#endif
+STEXI
+@item map [@var{FUSE options}] @var{filename} @var{mountpoint}
+ETEXI
+
+STEXI
 @end table
 ETEXI
diff --git a/qemu-img-map.c b/qemu-img-map.c
new file mode 100644
index 0000000..cd6bbf4
--- /dev/null
+++ b/qemu-img-map.c
@@ -0,0 +1,438 @@
+/*
+ * QEMU disk image utility
+ *
+ * Copyright (c) 2010 Jan Kiszka
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include "qemu-img.h"
+#include "qemu-option.h"
+#include "osdep.h"
+#include "block_int.h"
+#include <stdio.h>
+#include <getopt.h>
+#include <pthread.h>
+#include <signal.h>
+
+#define FUSE_USE_VERSION 28
+#include <fuse.h>
+
+#ifdef CONFIG_LINUX
+#include <linux/fs.h>
+#endif
+
+#define ENTRY_INVALID 1
+#define ENTRY_DIRTY 2
+
+#define ENTRY_PATH_MAX 16
+
+struct map_entry {
+    struct map_entry *next;
+    const char *path;
+    size_t size;
+    off_t offset;
+    unsigned int use_counter;
+    unsigned int flags;
+};
+
+static struct stat img_stat;
+static BlockDriverState *img_bs;
+static struct map_entry disk_entry = { .path = "/disk" };
+static char *disk_path;
+
+#ifdef CONFIG_BLKID_PARTLIST
+
+#include <blkid.h>
+
+static pthread_t reader_thread;
+static sigset_t wakeup_sigset;
+static pthread_mutex_t entry_lock = PTHREAD_MUTEX_INITIALIZER;
+static struct map_entry *last_entry = &disk_entry;
+
+static void *partition_reader(void *unused)
+{
+    struct map_entry *entry;
+    blkid_partition par;
+    blkid_partlist ls;
+    blkid_probe pr;
+    int nparts, i;
+    char *path;
+
+    while (sigwaitinfo(&wakeup_sigset, NULL) >= 0) {
+        pr = blkid_new_probe_from_filename(disk_path);
+        if (!pr) {
+            continue;
+        }
+
+        ls = blkid_probe_get_partitions(pr);
+        if (!ls) {
+            blkid_free_probe(pr);
+            continue;
+        }
+
+        nparts = blkid_partlist_numof_partitions(ls);
+
+        for (i = 0; i < nparts; i++) {
+            entry = calloc(1, sizeof(*entry));
+            if (!entry) {
+                continue;
+            }
+            path = malloc(ENTRY_PATH_MAX);
+            if (!path) {
+                free(entry);
+                continue;
+            }
+
+            par = blkid_partlist_get_partition(ls, i);
+
+            snprintf(path, ENTRY_PATH_MAX, "/partition%d",
+                     blkid_partition_get_partno(par));
+            entry->path = path;
+            entry->size = blkid_partition_get_size(par) * BDRV_SECTOR_SIZE;
+            entry->offset = blkid_partition_get_start(par) * BDRV_SECTOR_SIZE;
+
+            pthread_mutex_lock(&entry_lock);
+
+            last_entry->next = entry;
+            last_entry = entry;
+
+            pthread_mutex_unlock(&entry_lock);
+        }
+
+        blkid_free_probe(pr);
+    }
+
+    return NULL;
+}
+
+static void update_partitions(void)
+{
+    struct map_entry *entry = disk_entry.next;
+    struct map_entry *old;
+
+    /* release old partions */
+    pthread_mutex_lock(&entry_lock);
+
+    while (entry) {
+        old = entry;
+        entry = entry->next;
+        if (old->use_counter == 0) {
+            free((void *)old->path);
+            free(old);
+        } else {
+            old->flags = ENTRY_INVALID;
+        }
+    }
+
+    disk_entry.next = NULL;
+    last_entry = &disk_entry;
+
+    disk_entry.flags &= ~ENTRY_DIRTY;
+
+    pthread_mutex_unlock(&entry_lock);
+
+    /* kick off partition table scan */
+    pthread_kill(reader_thread, SIGUSR1);
+}
+
+static void init_reader_thread(void)
+{
+    sigemptyset(&wakeup_sigset);
+    sigaddset(&wakeup_sigset, SIGUSR1);
+    sigprocmask(SIG_BLOCK, &wakeup_sigset, NULL);
+
+    if (pthread_create(&reader_thread, NULL, partition_reader, NULL)) {
+        error("Could not spawn partition reader thread");
+    }
+}
+
+#else /* !CONFIG_BLKID_PARTLIST */
+
+static inline void update_partitions(void) { }
+static inline void init_reader_thread(void) { }
+
+#endif /* !CONFIG_BLKID_PARTLIST */
+
+static struct map_entry *find_map_entry(const char *path)
+{
+    struct map_entry *entry = &disk_entry;
+
+    do {
+        if (strcmp(entry->path, path) == 0) {
+            break;
+        }
+        entry = entry->next;
+    } while (entry);
+
+    return entry;
+}
+
+static void *map_init(struct fuse_conn_info *conn)
+{
+    init_reader_thread();
+    update_partitions();
+    return NULL;
+}
+
+static int map_getattr(const char *path, struct stat *stbuf)
+{
+    struct map_entry *entry;
+    int res = 0;
+
+    memset(stbuf, 0, sizeof(struct stat));
+    stbuf->st_uid = img_stat.st_uid;
+    stbuf->st_gid = img_stat.st_gid;
+    stbuf->st_atime = img_stat.st_atime;
+    stbuf->st_mtime = img_stat.st_mtime;
+    stbuf->st_ctime = img_stat.st_ctime;
+
+    if (strcmp(path, "/") == 0) {
+        stbuf->st_mode = S_IFDIR | 0111 | img_stat.st_mode;
+        stbuf->st_nlink = 2;
+    } else {
+        entry = find_map_entry(path);
+        if (entry) {
+            stbuf->st_mode = S_IFREG | img_stat.st_mode;
+            stbuf->st_nlink = 1;
+            stbuf->st_size = entry->size;
+        } else {
+            res = -ENOENT;
+        }
+    }
+
+    return res;
+}
+
+static int map_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
+                       off_t offset, struct fuse_file_info *fi)
+{
+    struct map_entry *entry;
+
+    if (strcmp(path, "/") != 0) {
+        return -ENOENT;
+    }
+    filler(buf, ".", NULL, 0);
+    filler(buf, "..", NULL, 0);
+    for (entry = &disk_entry; entry; entry = entry->next) {
+        filler(buf, entry->path+1, NULL, 0);
+    }
+
+    return 0;
+}
+
+static int map_open(const char *path, struct fuse_file_info *fi)
+{
+    struct map_entry *entry = find_map_entry(path);
+
+    if (!entry) {
+        return -ENOENT;
+    }
+
+    entry->use_counter++;
+    fi->fh = (uint64_t)entry;
+
+    return 0;
+}
+
+static int map_release(const char *path, struct fuse_file_info *fi)
+{
+    struct map_entry *entry = (struct map_entry *)fi->fh;
+
+    entry->use_counter--;
+
+    if (entry == &disk_entry && entry->flags & ENTRY_DIRTY) {
+        update_partitions();
+    }
+    if (entry->flags & ENTRY_INVALID && entry->use_counter == 0) {
+        free((void *)entry->path);
+        free(entry);
+    }
+
+    return 0;
+}
+
+static int map_read(const char *path, char *buf, size_t size, off_t offset,
+                    struct fuse_file_info *fi)
+{
+    struct map_entry *entry = (struct map_entry *)fi->fh;
+    int err;
+
+    if (entry->flags & ENTRY_INVALID) {
+        return -ENOENT;
+    }
+
+    if (offset + size > entry->size) {
+        size = entry->size - offset;
+    }
+
+    err = bdrv_read(img_bs, (entry->offset + offset) / BDRV_SECTOR_SIZE,
+                    (uint8_t*)buf, size / BDRV_SECTOR_SIZE);
+    if (err) {
+        return err;
+    }
+
+    return size;
+}
+
+static int map_write(const char *path, const char *buf, size_t size,
+                     off_t offset, struct fuse_file_info *fi)
+{
+    struct map_entry *entry = (struct map_entry *)fi->fh;
+    int err;
+
+    if (entry->flags & ENTRY_INVALID) {
+        return -ENOENT;
+    }
+
+    err = bdrv_write(img_bs, (entry->offset + offset) / BDRV_SECTOR_SIZE,
+                     (uint8_t*)buf, size / BDRV_SECTOR_SIZE);
+    if (err) {
+        return err;
+    }
+
+    entry->flags |= ENTRY_DIRTY;
+
+    return size;
+}
+
+#if FUSE_VERSION >= 28
+static int map_ioctl(const char *path, int cmd, void *arg,
+                     struct fuse_file_info *fi, unsigned int flags, void *data)
+{
+    struct map_entry *entry = (struct map_entry *)fi->fh;
+
+    if (entry->flags & ENTRY_INVALID) {
+        return -ENOENT;
+    }
+
+    switch (cmd) {
+#ifdef CONFIG_LINUX
+    case BLKGETSIZE64:
+        *(uint64_t *)data = entry->size;
+        return 0;
+#endif /* CONFIG_LINUX */
+    default:
+        return -ENOTTY;
+    }
+}
+#endif /* FUSE_VERSION >= 28 */
+
+static struct fuse_operations map_ops = {
+    .init = map_init,
+    .getattr = map_getattr,
+    .readdir = map_readdir,
+    .open = map_open,
+    .release = map_release,
+    .read = map_read,
+    .write = map_write,
+#if FUSE_VERSION >= 28
+    .ioctl = map_ioctl,
+#endif
+};
+
+static void QEMU_NORETURN map_help(struct fuse_args *args)
+{
+    printf("usage: qemu-img map [-F fmt] [FUSE options] filename mountpoint\n"
+           "\ngeneral options:\n"
+           " -o opt,[opt...] mount options\n"
+           " -h --help print help\n"
+           " -V --version print version\n"
+           "\nqemu-img options:\n"
+           " -F fmt image format\n\n");
+    fuse_opt_add_arg(args, "-ho");
+    fuse_main(args->argc, args->argv, &map_ops, NULL);
+    exit(1);
+}
+
+int img_map(int argc, char **argv)
+{
+    struct fuse_args args = FUSE_ARGS_INIT(0, NULL);
+    const char *filename = NULL;
+    const char *fmt = NULL;
+    const char *mountpoint;
+    char *fs_name;
+    uint64_t size;
+
+    fuse_opt_add_arg(&args, argv[0]);
+    fuse_opt_add_arg(&args, "-o");
+    fuse_opt_add_arg(&args, "subtype=qemu-img-map");
+
+    /* block layer is not thread-safe */
+    fuse_opt_add_arg(&args, "-s");
+
+    for (;;) {
+        static const struct option long_opts[] = {
+            { "--help", 0, NULL, 'h' },
+            { "--version", 0, NULL, 'v' },
+            { NULL, 0, NULL, 0 }
+        };
+        int c;
+
+        c = getopt_long(argc, argv, "F:dfsho:", long_opts, NULL);
+        if (c < 0) {
+            break;
+        }
+        switch (c) {
+        case 'h':
+            map_help(&args);
+            break;
+        case 'F':
+            fmt = optarg;
+            break;
+        case 'o':
+            fuse_opt_add_arg(&args, "-o");
+            fuse_opt_add_arg(&args, optarg);
+            break;
+        case 'd':
+            fuse_opt_add_arg(&args, "-d");
+            break;
+        case 'f':
+            fuse_opt_add_arg(&args, "-f");
+            break;
+        default:
+            /* ignore -s, we enforce it anyway */
+            break;
+        }
+    }
+    if (optind + 1 >= argc) {
+        map_help(&args);
+    }
+
+    filename = argv[optind++];
+
+    size = strlen(filename) + 8;
+    fs_name = malloc(size);
+    if (!fs_name) {
+        error("Not enough memory");
+    }
+    snprintf(fs_name, size, "fsname=%s", filename);
+    fuse_opt_insert_arg(&args, 1, "-o");
+    fuse_opt_insert_arg(&args, 2, fs_name);
+    free(fs_name);
+
+    mountpoint = argv[optind];
+    fuse_opt_add_arg(&args, mountpoint);
+
+    size = strlen(mountpoint) + strlen(disk_entry.path) + 1;
+    disk_path = malloc(size);
+    if (!disk_path) {
+        error("Not enough memory");
+    }
+    snprintf(disk_path, size, "%s%s", mountpoint, disk_entry.path);
+
+    if (stat(filename, &img_stat) < 0) {
+        perror("Unable to process image file");
+        exit(1);
+    }
+    img_stat.st_mode &= S_IRWXU | S_IRWXG | S_IRWXO;
+
+    img_bs = bdrv_new_open(filename, fmt, 0);
+    if (!img_bs) {
+        error("Could not open '%s'", filename);
+    }
+    bdrv_get_geometry(img_bs, &size);
+    disk_entry.size = size * BDRV_SECTOR_SIZE;
+
+    return fuse_main(args.argc, args.argv, &map_ops, NULL);
+}
diff --git a/qemu-img.c b/qemu-img.c
index 9b28664..28b8427 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -21,7 +21,7 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu-common.h"
+#include "qemu-img.h"
 #include "qemu-option.h"
 #include "osdep.h"
 #include "block_int.h"
@@ -39,7 +39,7 @@ typedef struct img_cmd_t {
 /* Default to cache=writeback as data integrity is not important for qemu-tcg. */
 #define BRDV_O_FLAGS BDRV_O_CACHE_WB
 
-static void QEMU_NORETURN error(const char *fmt, ...)
+void QEMU_NORETURN error(const char *fmt, ...)
 {
     va_list ap;
     va_start(ap, fmt);
@@ -97,6 +97,9 @@ static void help(void)
     printf("%s\nSupported formats:", help_msg);
     bdrv_iterate_format(format_print, NULL);
     printf("\n");
+#ifdef CONFIG_FUSE
+    printf("\nInvoke 'qemu-img map --help' to list FUSE options.\n");
+#endif
     exit(1);
 }
 
@@ -188,9 +191,9 @@ static int read_password(char *buf, int buf_size)
 }
 #endif
 
-static BlockDriverState *bdrv_new_open(const char *filename,
-                                       const char *fmt,
-                                       int readonly)
+BlockDriverState *bdrv_new_open(const char *filename,
+                                const char *fmt,
+                                int readonly)
 {
     BlockDriverState *bs;
     BlockDriver *drv;
diff --git a/qemu-img.h b/qemu-img.h
new file mode 100644
index 0000000..1bf0f27
--- /dev/null
+++ b/qemu-img.h
@@ -0,0 +1,13 @@
+#ifndef QEMU_IMG_H
+#define QEMU_IMG_H
+
+#include "qemu-common.h"
+
+void QEMU_NORETURN error(const char *fmt, ...);
+BlockDriverState *bdrv_new_open(const char *filename,
+                                const char *fmt,
+                                int readonly);
+
+int img_map(int argc, char **argv);
+
+#endif
diff --git a/qemu-img.texi b/qemu-img.texi
index ac97854..a85f454 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -106,6 +106,16 @@ they are displayed too.
 @item snapshot [-l | -a @var{snapshot} | -c @var{snapshot} | -d @var{snapshot} ] @var{filename}
 
 List, apply, create or delete snapshots in image @var{filename}.
+
+@item map [-F @var{fmt}] [@var{FUSE options}] @var{filename} @var{mountpoint}
+
+Make a disk image accessible via pseudo devices under @var{mountpoint}. This
+command will expose the whole raw image as well as individual partitions, the
+latter depending on the parsing capabilies of libblkid. The exposed disk
+device file can be passed to partitioning tools, and any device file containing
+a valid filesystem can be loop-back mounted to access its content (e.g. via
+mountlo without any root privileges). For the full list of FUSE-related
+options, invoke @code{qemu-img map --help}.
 @end table
 
 Supported image file formats:
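A note on the read path in the patch: map_read clamps size when offset + size runs past the entry, then converts byte positions to sectors by dividing by BDRV_SECTOR_SIZE. It does not check for an offset beyond the end of the entry or for requests that are not sector-aligned. The sketch below shows the same arithmetic with those two cases made explicit. Everything in it is an illustration rather than code from the patch: struct window, clamp_request and the SECTOR_SIZE value of 512 (mirroring QEMU's BDRV_SECTOR_SIZE) are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512 /* assumed; stands in for BDRV_SECTOR_SIZE */

/* Hypothetical stand-in for the offset/size pair kept in struct map_entry. */
struct window {
    uint64_t offset; /* byte offset of the partition within the disk */
    uint64_t size;   /* byte length of the partition */
};

/*
 * Clamp a request to the window and translate it to disk sectors.
 * Returns the number of bytes that may be transferred (0 at or past the
 * end of the window), or -1 if the clamped request is not sector-aligned.
 */
static int64_t clamp_request(const struct window *w, uint64_t offset,
                             size_t size, uint64_t *first_sector,
                             uint64_t *nb_sectors)
{
    if (offset >= w->size) {
        return 0; /* nothing left to read or write */
    }
    if (size > w->size - offset) {
        size = w->size - offset; /* cannot underflow: offset < w->size */
    }
    if ((w->offset + offset) % SECTOR_SIZE != 0 || size % SECTOR_SIZE != 0) {
        return -1; /* a sector-based backend cannot serve this directly */
    }
    *first_sector = (w->offset + offset) / SECTOR_SIZE;
    *nb_sectors = size / SECTOR_SIZE;
    return (int64_t)size;
}
```

For a partition starting at byte 1048576 (sector 2048), a 512-byte request at partition offset 0 maps to sector 2048 with a count of one sector. How unaligned requests should be handled (rejecting them, or doing a read-modify-write) is a design decision this sketch leaves open.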
This adds the "map" subcommand to qemu-img. It exposes the raw content of a
disk image via a FUSE filesystem. The whole disk can be accessed, e.g. to
run partitioning tools against it, as well as individual partitions. This
allows creating new filesystems in the image or loop-back mounting existing
ones. Using the great mountlo tool from the FUSE collection [1][2], the
latter can even be done by non-root users (the former anyway).

There are some dependencies to fulfill to gain all features: Partition
scanning is done via a recent libblkid (I used version 2.17.1). If this
library is not available, only the disk file is provided. Fortunately,
mountlo can do partition scanning as well ("-p n") to work around this.

Moreover, libfuse >= 2.8 and a host kernel >= 2.6.29 are required for
seamless disk access via fdisk. Otherwise, the BLKGETSIZE64 IOCTL cannot
be provided, and the number of cylinders has to be set explicitly (e.g. via
"-C n").

This work was inspired by Ashley Saulsbury's qemu-diskp [3].

[1] http://sourceforge.net/apps/mediawiki/fuse/index.php?title=FileSystems#Mountlo
[2] http://sourceforge.net/projects/fuse/files/mountlo/
[3] http://www.saulsbury.org/software/virtualization.html

Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
---
 Makefile         |    6 +-
 Makefile.objs    |    6 +
 configure        |   55 +++++++
 qemu-img-cmds.hx |   11 ++
 qemu-img-map.c   |  438 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 qemu-img.c       |   13 +-
 qemu-img.h       |   13 ++
 qemu-img.texi    |   10 ++
 8 files changed, 545 insertions(+), 7 deletions(-)
 create mode 100644 qemu-img-map.c
 create mode 100644 qemu-img.h
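To illustrate why the BLKGETSIZE64 ioctl matters here: a regular file reports its length through stat, but a partitioning tool that treats its target as a block device asks for the size with this ioctl instead. The following Linux-only C sketch shows the caller's side of that query. The helper name query_size and the fallback to fstat are assumptions made for illustration; this is not how fdisk or the patch is implemented.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/fs.h> /* BLKGETSIZE64 */

/*
 * Hypothetical helper: return the size in bytes of a block device or a
 * regular file, or -1 on error. BLKGETSIZE64 is tried first; if the
 * ioctl is not supported for this file, fall back to the size from fstat.
 */
static int64_t query_size(const char *path)
{
    uint64_t size = 0;
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        return -1;
    }
    if (ioctl(fd, BLKGETSIZE64, &size) == 0) {
        close(fd);
        return (int64_t)size;
    }
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return (int64_t)st.st_size;
}
```

With the ioctl handler from the patch in place, a call such as query_size on the exposed disk file under the mountpoint would be answered by map_ioctl; without it, only the fstat fallback is available to tools that bother to try it.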