Message ID | 20210823164157.751807-1-philmd@redhat.com |
---|---|
Headers | show |
Series | physmem: Have flaview API check bus permission from MemTxAttrs argument | expand |
On Mon, 23 Aug 2021 at 17:42, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > > This series aim to kill a recent class of bug, the infamous > "DMA reentrancy" issues found by Alexander while fuzzing. > > Introduce the 'bus_perm' field in MemTxAttrs, defining 3 bits: > > - MEMTXPERM_UNSPECIFIED (current default, unchanged behavior) > - MEMTXPERM_UNRESTRICTED (allow list approach) > - MEMTXPERM_RAM_DEVICE (example of deny list approach) > > If a transaction permission is not allowed (for example access > to non-RAM device), we return the specific MEMTX_BUS_ERROR. > > Permissions are checked in after the flatview is resolved, and > before the access is done, in a new function: flatview_access_allowed(). So I'm not going to say 'no' to this, because we have a real recursive-device-handling problem and I don't have a better idea to hand, but the thing about this is that we end up with behaviour which is not what the real hardware does. I'm not aware of any DMA device which has this kind of "can only DMA to/from RAM, and aborts on access to a device" behaviour... -- PMM
On Mon, Aug 23, 2021 at 08:10:50PM +0100, Peter Maydell wrote: > On Mon, 23 Aug 2021 at 17:42, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > > > > This series aim to kill a recent class of bug, the infamous > > "DMA reentrancy" issues found by Alexander while fuzzing. > > > > Introduce the 'bus_perm' field in MemTxAttrs, defining 3 bits: > > > > - MEMTXPERM_UNSPECIFIED (current default, unchanged behavior) > > - MEMTXPERM_UNRESTRICTED (allow list approach) > > - MEMTXPERM_RAM_DEVICE (example of deny list approach) > > > > If a transaction permission is not allowed (for example access > > to non-RAM device), we return the specific MEMTX_BUS_ERROR. > > > > Permissions are checked in after the flatview is resolved, and > > before the access is done, in a new function: flatview_access_allowed(). > > So I'm not going to say 'no' to this, because we have a real > recursive-device-handling problem and I don't have a better > idea to hand, but the thing about this is that we end up with > behaviour which is not what the real hardware does. I'm not > aware of any DMA device which has this kind of "can only DMA > to/from RAM, and aborts on access to a device" behaviour... Sorry for not being familiar with the context - is there more info regarding the problem to fix? I'm looking at the links mentioned in the old series: https://lore.kernel.org/qemu-devel/20200903110831.353476-12-philmd@redhat.com/ https://bugs.launchpad.net/qemu/+bug/1886362 https://bugs.launchpad.net/qemu/+bug/1888606 They seem all marked as fixed now. Thanks,
On 210823 1650, Peter Xu wrote: > On Mon, Aug 23, 2021 at 08:10:50PM +0100, Peter Maydell wrote: > > On Mon, 23 Aug 2021 at 17:42, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > > > > > > This series aim to kill a recent class of bug, the infamous > > > "DMA reentrancy" issues found by Alexander while fuzzing. > > > > > > Introduce the 'bus_perm' field in MemTxAttrs, defining 3 bits: > > > > > > - MEMTXPERM_UNSPECIFIED (current default, unchanged behavior) > > > - MEMTXPERM_UNRESTRICTED (allow list approach) > > > - MEMTXPERM_RAM_DEVICE (example of deny list approach) > > > > > > If a transaction permission is not allowed (for example access > > > to non-RAM device), we return the specific MEMTX_BUS_ERROR. > > > > > > Permissions are checked in after the flatview is resolved, and > > > before the access is done, in a new function: flatview_access_allowed(). > > > > So I'm not going to say 'no' to this, because we have a real > > recursive-device-handling problem and I don't have a better > > idea to hand, but the thing about this is that we end up with > > behaviour which is not what the real hardware does. I'm not > > aware of any DMA device which has this kind of "can only DMA > > to/from RAM, and aborts on access to a device" behaviour... > > Sorry for not being familiar with the context - is there more info regarding > the problem to fix? I'm looking at the links mentioned in the old series: > > https://lore.kernel.org/qemu-devel/20200903110831.353476-12-philmd@redhat.com/ > https://bugs.launchpad.net/qemu/+bug/1886362 > https://bugs.launchpad.net/qemu/+bug/1888606 > > They seem all marked as fixed now. Here are some that should still reproduce: https://gitlab.com/qemu-project/qemu/-/issues/542 https://gitlab.com/qemu-project/qemu/-/issues/540 https://gitlab.com/qemu-project/qemu/-/issues/541 https://gitlab.com/qemu-project/qemu/-/issues/62 https://lore.kernel.org/qemu-devel/20210218140629.373646-1-ppandit@redhat.com/ (CVE-2021-20255) There's also this one, that I don't think I ever created a bug report for (working on it now): https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=33247 -Alex > > Thanks, > > -- > Peter Xu >
On 8/24/21 12:26 AM, Alexander Bulekov wrote: > On 210823 1650, Peter Xu wrote: >> On Mon, Aug 23, 2021 at 08:10:50PM +0100, Peter Maydell wrote: >>> On Mon, 23 Aug 2021 at 17:42, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: >>>> >>>> This series aim to kill a recent class of bug, the infamous >>>> "DMA reentrancy" issues found by Alexander while fuzzing. >>>> >>>> Introduce the 'bus_perm' field in MemTxAttrs, defining 3 bits: >>>> >>>> - MEMTXPERM_UNSPECIFIED (current default, unchanged behavior) >>>> - MEMTXPERM_UNRESTRICTED (allow list approach) >>>> - MEMTXPERM_RAM_DEVICE (example of deny list approach) >>>> >>>> If a transaction permission is not allowed (for example access >>>> to non-RAM device), we return the specific MEMTX_BUS_ERROR. >>>> >>>> Permissions are checked in after the flatview is resolved, and >>>> before the access is done, in a new function: flatview_access_allowed(). >>> >>> So I'm not going to say 'no' to this, because we have a real >>> recursive-device-handling problem and I don't have a better >>> idea to hand, but the thing about this is that we end up with >>> behaviour which is not what the real hardware does. I'm not >>> aware of any DMA device which has this kind of "can only DMA >>> to/from RAM, and aborts on access to a device" behaviour... >> >> Sorry for not being familiar with the context - is there more info regarding >> the problem to fix? I'm looking at the links mentioned in the old series: >> >> https://lore.kernel.org/qemu-devel/20200903110831.353476-12-philmd@redhat.com/ >> https://bugs.launchpad.net/qemu/+bug/1886362 >> https://bugs.launchpad.net/qemu/+bug/1888606 >> >> They seem all marked as fixed now. > > Here are some that should still reproduce: > https://gitlab.com/qemu-project/qemu/-/issues/542 > https://gitlab.com/qemu-project/qemu/-/issues/540 > https://gitlab.com/qemu-project/qemu/-/issues/541 > https://gitlab.com/qemu-project/qemu/-/issues/62 > https://lore.kernel.org/qemu-devel/20210218140629.373646-1-ppandit@redhat.com/ (CVE-2021-20255) Also 305, 451, 557. Issues list tracked here: https://gitlab.com/qemu-project/qemu/-/issues/556 (Thanks Alex for updating it!) > > There's also this one, that I don't think I ever created a bug report > for (working on it now): > https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=33247 > -Alex > >> >> Thanks, >> >> -- >> Peter Xu >> >
On Mon, Aug 23, 2021 at 06:41:52PM +0200, Philippe Mathieu-Daudé wrote: > This series aim to kill a recent class of bug, the infamous > "DMA reentrancy" issues found by Alexander while fuzzing. > > Introduce the 'bus_perm' field in MemTxAttrs, defining 3 bits: > > - MEMTXPERM_UNSPECIFIED (current default, unchanged behavior) > - MEMTXPERM_UNRESTRICTED (allow list approach) > - MEMTXPERM_RAM_DEVICE (example of deny list approach) These names don't hint at their descriptions. I wouldn't be able to tell what they do based on the name. I'll think more about naming after reviewing the patches.
On Mon, Aug 23, 2021 at 08:10:50PM +0100, Peter Maydell wrote: > On Mon, 23 Aug 2021 at 17:42, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > > > > This series aim to kill a recent class of bug, the infamous > > "DMA reentrancy" issues found by Alexander while fuzzing. > > > > Introduce the 'bus_perm' field in MemTxAttrs, defining 3 bits: > > > > - MEMTXPERM_UNSPECIFIED (current default, unchanged behavior) > > - MEMTXPERM_UNRESTRICTED (allow list approach) > > - MEMTXPERM_RAM_DEVICE (example of deny list approach) > > > > If a transaction permission is not allowed (for example access > > to non-RAM device), we return the specific MEMTX_BUS_ERROR. > > > > Permissions are checked in after the flatview is resolved, and > > before the access is done, in a new function: flatview_access_allowed(). > > So I'm not going to say 'no' to this, because we have a real > recursive-device-handling problem and I don't have a better > idea to hand, but the thing about this is that we end up with > behaviour which is not what the real hardware does. I'm not > aware of any DMA device which has this kind of "can only DMA > to/from RAM, and aborts on access to a device" behaviour... > Yes, I agree. Having said that, There are DMA devices that do indicate to the interconnect and peripherals if they are targeting "normal" memory or device (together with cacheability, buffering and ordering attributes). Accessing registers of a device with "normal" memory and cache attributes is not a good idea and may lead to all kinds of weirdness on real HW. IMO, it would be better to model something like those attributes rather than permissions. And it's probably a good idea to not call it MEMTXPERM_RAM_DEVICE since in the AMBA documentation it's Normal Memory vs Device (calling it RAM_DEVICE is confusing to me at least). Adding a "memory" attribute to MemTxAttrs may be enough? If it's set, the access would only target memories. If targeting a device it could perhaps be logged and dropped rather than aborted? If not set (the default), accesses would target anything... Cheers, Edgar
On Mon, 23 Aug 2021 at 21:50, Peter Xu <peterx@redhat.com> wrote: > > On Mon, Aug 23, 2021 at 08:10:50PM +0100, Peter Maydell wrote: > > On Mon, 23 Aug 2021 at 17:42, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > > > > > > This series aim to kill a recent class of bug, the infamous > > > "DMA reentrancy" issues found by Alexander while fuzzing. > > > > > > Introduce the 'bus_perm' field in MemTxAttrs, defining 3 bits: > > > > > > - MEMTXPERM_UNSPECIFIED (current default, unchanged behavior) > > > - MEMTXPERM_UNRESTRICTED (allow list approach) > > > - MEMTXPERM_RAM_DEVICE (example of deny list approach) > > > > > > If a transaction permission is not allowed (for example access > > > to non-RAM device), we return the specific MEMTX_BUS_ERROR. > > > > > > Permissions are checked in after the flatview is resolved, and > > > before the access is done, in a new function: flatview_access_allowed(). > > > > So I'm not going to say 'no' to this, because we have a real > > recursive-device-handling problem and I don't have a better > > idea to hand, but the thing about this is that we end up with > > behaviour which is not what the real hardware does. I'm not > > aware of any DMA device which has this kind of "can only DMA > > to/from RAM, and aborts on access to a device" behaviour... > > Sorry for not being familiar with the context - is there more info regarding > the problem to fix? So, the general problem is that we have a whole class of bugs that look like this: * Device A is DMA-capable. It also has a set of memory mapped registers which can be used to control it. * Malicious guest code (or the fuzzer) programs A's DMA engine to do a DMA read or write to the address where A's own registers are mapped. * Typically, the MemoryRegionOps write function for the register block will handle the "write to start-dma register" by doing the DMA, ie calling address_space_write(), pci_dma_write(), or equivalent. Because of the target address the guest code has set, that will result in the memory subsystem calling back into the same MemoryRegionOps write function, recursively. * Our code implementing the model of device A is not at all expecting this re-entrancy, and might crash, access freed memory, or otherwise misbehave. You can elaborate on that basic scenario, for instance with a loop of multiple devices where you program device A to do a DMA write to device B's registers which starts device B doing a DMA write to A's registers. Nor is it inherently limited to DMA -- device A could be made to assert a qemu_irq that is connected to device B in a way that causes device B to do something that results in code calling back into device A; or maybe device A DMAs to a register in device B that implements a device-reset on device A. DMA is just the easiest for guest code to set up and has the least restrictions on how it's connected up. In some specific cases we have "fixed" individual instances of this bug by putting in checks or changes to whatever device model the fuzzer happened to find a problem with, or put in slightly wider-scale attempts to catch this (eg commit 22dc8663d9f which prevents re-entering a NIC device's packet-rx callback function if the guest has set it up so that a received packet results in DMA that triggers another received packet). But we don't have a coherent model of how we ought to structure device models that can avoid this problem in a general way, and I think that until we do we're liable to keep running into specific bugs, some of which will be (or at least be labelled as) "security issues". Philippe's series here tries to fix this for at least any variety of this bug where there is a DMA access in the loop, by forbidding DMA accesses to MMIO regions not backed by RAM. That works, in that it breaks the loop; as I mentioned in my other email, it does it in a way that's not the way real h/w behaves. Unfortunately "what does real h/w do?" is not in this situation necessarily a helpful guide for QEMU design: I think that real hardware: (a) often doesn't see the same kind of problem because the design will usually decouple the DMA engine from the register-access logic in a way that means it naturally doesn't literally lock up (b) often won't have been designed to deal with "software programs a DMA-to-self" either, but the threat model for real hw is different, in that software has many ways of making the overall system crash, hang or misbehave; it often doesn't have the "need to allow untrusted software to touch this device" situation. One could have QEMU work somewhat like (a) by mandating that all DMA of any kind was done in separate bottom-half routines and not directly from the register-write code. That would probably reduce performance and be a lot of code to restructure. It would deal also with another class of "guest code can make QEMU hang by programming it to do an enormous DMA all at once or by setting up an infinitely looping chain of DMA commands" bugs, though. I was vaguely tossing an idea around in the back of my mind about whether you could have a flag on devices that marked them as "this device is currently involved in IO", such that you could then just fail the last DMA (or qemu_irq_set, or whatever) that would complete the loop back to a device that was already doing IO. But that would need a lot of thinking through to figure out if it's feasible, and it's probably a lot of code change. thanks -- PMM
Hi, > I was vaguely tossing an idea around in the back of my mind > about whether you could have a flag on devices that marked > them as "this device is currently involved in IO", such that > you could then just fail the last DMA (or qemu_irq_set, or > whatever) that would complete the loop back to a device that > was already doing IO. But that would need a lot of thinking > through to figure out if it's feasible, and it's probably > a lot of code change. Quick & dirty hack trying the above. Not much code, it is opt-in per MemoryRegion (so less overhead for devices which already handle all DMA in a BH), tracks state in DeviceState. Adds a check to a rather hot code path though. Not tested yet (stopped investigating when I noticed Philippe tries to fix the same thing with another approach). Not benchmarked. Maybe it helps ... take care, Gerd From 80e58a2cd2c630f0bddd9d0eaee71abb7eeb9440 Mon Sep 17 00:00:00 2001 From: Gerd Hoffmann <kraxel@redhat.com> Date: Tue, 17 Aug 2021 07:35:37 +0200 Subject: [PATCH] allow track active mmio handlers --- include/exec/memory.h | 1 + include/hw/qdev-core.h | 1 + softmmu/memory.c | 24 ++++++++++++++++++++++-- 3 files changed, 24 insertions(+), 2 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index c3d417d317f0..b1883d45e817 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -265,6 +265,7 @@ struct MemoryRegionOps { */ bool unaligned; } impl; + bool block_reenter; }; typedef struct MemoryRegionClass { diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h index bafc311bfa1b..4cf281a81fa9 100644 --- a/include/hw/qdev-core.h +++ b/include/hw/qdev-core.h @@ -191,6 +191,7 @@ struct DeviceState { int instance_id_alias; int alias_required_for_version; ResettableState reset; + bool io_handler_active; }; struct DeviceListener { diff --git a/softmmu/memory.c b/softmmu/memory.c index bfedaf9c4dfc..5eb5dd465dd2 100644 --- a/softmmu/memory.c +++ b/softmmu/memory.c @@ -437,7 +437,18 @@ static MemTxResult memory_region_read_accessor(MemoryRegion *mr, { uint64_t tmp; - tmp = mr->ops->read(mr->opaque, addr, size); + if (mr->ops->block_reenter) { + DeviceState *dev = DEVICE(mr->owner); + if (!dev->io_handler_active) { + dev->io_handler_active = true; + tmp = mr->ops->read(mr->opaque, addr, size); + dev->io_handler_active = false; + } else { + tmp = MEMTX_OK; + } + } else { + tmp = mr->ops->read(mr->opaque, addr, size); + } if (mr->subpage) { trace_memory_region_subpage_read(get_cpu_index(), mr, addr, tmp, size); } else if (trace_event_get_state_backends(TRACE_MEMORY_REGION_OPS_READ)) { @@ -489,7 +500,16 @@ static MemTxResult memory_region_write_accessor(MemoryRegion *mr, trace_memory_region_ops_write(get_cpu_index(), mr, abs_addr, tmp, size, memory_region_name(mr)); } - mr->ops->write(mr->opaque, addr, tmp, size); + if (mr->ops->block_reenter) { + DeviceState *dev = DEVICE(mr->owner); + if (!dev->io_handler_active) { + dev->io_handler_active = true; + mr->ops->write(mr->opaque, addr, tmp, size); + dev->io_handler_active = false; + } + } else { + mr->ops->write(mr->opaque, addr, tmp, size); + } return MEMTX_OK; }
Gerd Hoffmann <kraxel@redhat.com> 于2021年8月24日周二 下午8:02写道: > > Hi, > > > I was vaguely tossing an idea around in the back of my mind > > about whether you could have a flag on devices that marked > > them as "this device is currently involved in IO", such that > > you could then just fail the last DMA (or qemu_irq_set, or > > whatever) that would complete the loop back to a device that > > was already doing IO. But that would need a lot of thinking > > through to figure out if it's feasible, and it's probably > > a lot of code change. > > Quick & dirty hack trying the above. Not much code, it is opt-in per > MemoryRegion (so less overhead for devices which already handle all DMA > in a BH), tracks state in DeviceState. Adds a check to a rather hot > code path though. Not tested yet (stopped investigating when I noticed > Philippe tries to fix the same thing with another approach). Not > benchmarked. > > Maybe it helps ... > Gerd's patch just remind my approach here, Just add here: https://mail.gnu.org/archive/html/qemu-devel/2020-09/msg00906.html But I check and record it in the device MR handler. Thanks, Li Qiang > take care, > Gerd > > From 80e58a2cd2c630f0bddd9d0eaee71abb7eeb9440 Mon Sep 17 00:00:00 2001 > From: Gerd Hoffmann <kraxel@redhat.com> > Date: Tue, 17 Aug 2021 07:35:37 +0200 > Subject: [PATCH] allow track active mmio handlers > > --- > include/exec/memory.h | 1 + > include/hw/qdev-core.h | 1 + > softmmu/memory.c | 24 ++++++++++++++++++++++-- > 3 files changed, 24 insertions(+), 2 deletions(-) > > diff --git a/include/exec/memory.h b/include/exec/memory.h > index c3d417d317f0..b1883d45e817 100644 > --- a/include/exec/memory.h > +++ b/include/exec/memory.h > @@ -265,6 +265,7 @@ struct MemoryRegionOps { > */ > bool unaligned; > } impl; > + bool block_reenter; > }; > > typedef struct MemoryRegionClass { > diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h > index bafc311bfa1b..4cf281a81fa9 100644 > --- a/include/hw/qdev-core.h > +++ b/include/hw/qdev-core.h > @@ -191,6 +191,7 @@ struct DeviceState { > int instance_id_alias; > int alias_required_for_version; > ResettableState reset; > + bool io_handler_active; > }; > > struct DeviceListener { > diff --git a/softmmu/memory.c b/softmmu/memory.c > index bfedaf9c4dfc..5eb5dd465dd2 100644 > --- a/softmmu/memory.c > +++ b/softmmu/memory.c > @@ -437,7 +437,18 @@ static MemTxResult memory_region_read_accessor(MemoryRegion *mr, > { > uint64_t tmp; > > - tmp = mr->ops->read(mr->opaque, addr, size); > + if (mr->ops->block_reenter) { > + DeviceState *dev = DEVICE(mr->owner); > + if (!dev->io_handler_active) { > + dev->io_handler_active = true; > + tmp = mr->ops->read(mr->opaque, addr, size); > + dev->io_handler_active = false; > + } else { > + tmp = MEMTX_OK; > + } > + } else { > + tmp = mr->ops->read(mr->opaque, addr, size); > + } > if (mr->subpage) { > trace_memory_region_subpage_read(get_cpu_index(), mr, addr, tmp, size); > } else if (trace_event_get_state_backends(TRACE_MEMORY_REGION_OPS_READ)) { > @@ -489,7 +500,16 @@ static MemTxResult memory_region_write_accessor(MemoryRegion *mr, > trace_memory_region_ops_write(get_cpu_index(), mr, abs_addr, tmp, size, > memory_region_name(mr)); > } > - mr->ops->write(mr->opaque, addr, tmp, size); > + if (mr->ops->block_reenter) { > + DeviceState *dev = DEVICE(mr->owner); > + if (!dev->io_handler_active) { > + dev->io_handler_active = true; > + mr->ops->write(mr->opaque, addr, tmp, size); > + dev->io_handler_active = false; > + } > + } else { > + mr->ops->write(mr->opaque, addr, tmp, size); > + } > return MEMTX_OK; > } > > -- > 2.31.1 >
On Mon, Aug 23, 2021 at 08:10:50PM +0100, Peter Maydell wrote: > On Mon, 23 Aug 2021 at 17:42, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > > > > This series aim to kill a recent class of bug, the infamous > > "DMA reentrancy" issues found by Alexander while fuzzing. > > > > Introduce the 'bus_perm' field in MemTxAttrs, defining 3 bits: > > > > - MEMTXPERM_UNSPECIFIED (current default, unchanged behavior) > > - MEMTXPERM_UNRESTRICTED (allow list approach) > > - MEMTXPERM_RAM_DEVICE (example of deny list approach) > > > > If a transaction permission is not allowed (for example access > > to non-RAM device), we return the specific MEMTX_BUS_ERROR. > > > > Permissions are checked in after the flatview is resolved, and > > before the access is done, in a new function: flatview_access_allowed(). > > So I'm not going to say 'no' to this, because we have a real > recursive-device-handling problem and I don't have a better > idea to hand, but the thing about this is that we end up with > behaviour which is not what the real hardware does. I'm not > aware of any DMA device which has this kind of "can only DMA > to/from RAM, and aborts on access to a device" behaviour... Points that have come up in previous discussions on this topic: - We probably won't be able to find out the actual hardware behavior for all device models in QEMU. Strict RAM-only DMA restrictions can be merged early in the QEMU 6.2 development cycle so there's plenty of time to identify regressions. The benefit of a strict policy is that we eliminate this class of bugs for most devices now and in the future. - If the risk of regressions is too high, then this API can be used on a case-by-case basis to fix bugs such as those identified by Alexander's fuzzer. We'll be plagued with this class of bugs in the future though, so I prefer a strict policy. - DMA capabilities depend on the host bus adapter/controller. In order to faithfully emulate real hardware we need to know how it behaves. That needs to be done for each host bus adapter (e.g. PCI controllers). - SysBus devices each have their own behavior wrt device-to-device DMA. Stefan
Hi, Peter, Gerd, On Tue, Aug 24, 2021 at 02:01:53PM +0200, Gerd Hoffmann wrote: > Hi, > > > I was vaguely tossing an idea around in the back of my mind > > about whether you could have a flag on devices that marked > > them as "this device is currently involved in IO", such that > > you could then just fail the last DMA (or qemu_irq_set, or > > whatever) that would complete the loop back to a device that > > was already doing IO. But that would need a lot of thinking > > through to figure out if it's feasible, and it's probably > > a lot of code change. (Thanks for the write-up, Peter; it helps a lot) > > Quick & dirty hack trying the above. Not much code, it is opt-in per > MemoryRegion (so less overhead for devices which already handle all DMA > in a BH), tracks state in DeviceState. Adds a check to a rather hot > code path though. Not tested yet (stopped investigating when I noticed > Philippe tries to fix the same thing with another approach). Not > benchmarked. > > Maybe it helps ... > > take care, > Gerd > > From 80e58a2cd2c630f0bddd9d0eaee71abb7eeb9440 Mon Sep 17 00:00:00 2001 > From: Gerd Hoffmann <kraxel@redhat.com> > Date: Tue, 17 Aug 2021 07:35:37 +0200 > Subject: [PATCH] allow track active mmio handlers > > --- > include/exec/memory.h | 1 + > include/hw/qdev-core.h | 1 + > softmmu/memory.c | 24 ++++++++++++++++++++++-- > 3 files changed, 24 insertions(+), 2 deletions(-) > > diff --git a/include/exec/memory.h b/include/exec/memory.h > index c3d417d317f0..b1883d45e817 100644 > --- a/include/exec/memory.h > +++ b/include/exec/memory.h > @@ -265,6 +265,7 @@ struct MemoryRegionOps { > */ > bool unaligned; > } impl; > + bool block_reenter; > }; > > typedef struct MemoryRegionClass { > diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h > index bafc311bfa1b..4cf281a81fa9 100644 > --- a/include/hw/qdev-core.h > +++ b/include/hw/qdev-core.h > @@ -191,6 +191,7 @@ struct DeviceState { > int instance_id_alias; > int alias_required_for_version; > ResettableState reset; > + bool io_handler_active; > }; > > struct DeviceListener { > diff --git a/softmmu/memory.c b/softmmu/memory.c > index bfedaf9c4dfc..5eb5dd465dd2 100644 > --- a/softmmu/memory.c > +++ b/softmmu/memory.c > @@ -437,7 +437,18 @@ static MemTxResult memory_region_read_accessor(MemoryRegion *mr, > { > uint64_t tmp; > > - tmp = mr->ops->read(mr->opaque, addr, size); > + if (mr->ops->block_reenter) { > + DeviceState *dev = DEVICE(mr->owner); > + if (!dev->io_handler_active) { > + dev->io_handler_active = true; > + tmp = mr->ops->read(mr->opaque, addr, size); > + dev->io_handler_active = false; > + } else { > + tmp = MEMTX_OK; > + } > + } else { > + tmp = mr->ops->read(mr->opaque, addr, size); > + } > if (mr->subpage) { > trace_memory_region_subpage_read(get_cpu_index(), mr, addr, tmp, size); > } else if (trace_event_get_state_backends(TRACE_MEMORY_REGION_OPS_READ)) { > @@ -489,7 +500,16 @@ static MemTxResult memory_region_write_accessor(MemoryRegion *mr, > trace_memory_region_ops_write(get_cpu_index(), mr, abs_addr, tmp, size, > memory_region_name(mr)); > } > - mr->ops->write(mr->opaque, addr, tmp, size); > + if (mr->ops->block_reenter) { > + DeviceState *dev = DEVICE(mr->owner); > + if (!dev->io_handler_active) { > + dev->io_handler_active = true; > + mr->ops->write(mr->opaque, addr, tmp, size); > + dev->io_handler_active = false; > + } > + } else { > + mr->ops->write(mr->opaque, addr, tmp, size); > + } > return MEMTX_OK; > } Can I read this as a better approach if it still allows P2P so things that Paolo and Qiang used to mention will still work? https://lore.kernel.org/qemu-devel/7e4fd726-07e9-dc09-d66b-5692dd51820f@redhat.com/ https://lore.kernel.org/qemu-devel/CAKXe6S+v4z_PYbZ6MMzEZk7Q0Qc+q9tzL+a8918U_-XR=aj7RA@mail.gmail.com/ Can we do that similarly for qemu_set_irq() and friends but based on IRQState? Thanks,