Message ID | 20211215182421.418374-1-philmd@redhat.com |
---|---|
Series | physmem: Have flatview API check bus permission from MemTxAttrs argument |
On Wed, Dec 15, 2021 at 07:24:18PM +0100, Philippe Mathieu-Daudé wrote:
> This series aim to kill a recent class of bug, the infamous
> "DMA reentrancy" issues found by Alexander while fuzzing.

I took a look at how to protect DMA transactions in VIRTIO devices. It
will require setting the MemTxAttrs for address_space_ld/st_le/be_cached
calls. Errors on write (store) can be ignored. Errors on read (load) are
a bit more questionable since the device performs some operation based
on the loaded value, but at this point the driver has already caused the
device to do something no correct driver does (as of today, it could
change in the future...) so undefined device behavior might be okay.

It would be easier to be confident if there was a single place to
disable DMA re-entrancy for a device. The currently proposed API
requires per-device code audits and fixes. It leaves decisions to the
developer of each device. This will be a lot of work to fix and we
cannot be confident that everything has been covered since this is an
opt-in mechanism.

For these reasons it seems likely that DMA re-entrancy issues will
continue to creep in. I think the only way to rule out this class of
bugs is to implement a centralized change that doesn't involve fixing
every DMA access in QEMU.

Thoughts?

Stefan
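To make the load/store asymmetry above concrete, here is a minimal,
self-contained toy model of an opt-in, per-access check. It is only a
sketch: every name in it (ToyTxAttrs, ram_only, toy_lduw_le, toy_stw_le,
toy_device_step) is hypothetical and is not QEMU's real MemTxAttrs or
address_space_* API. It assumes that any address outside a small RAM
array counts as MMIO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: NOT QEMU's real types, names, or signatures. */
typedef enum { TOY_TX_OK, TOY_TX_ERROR } ToyTxResult;
typedef struct { bool ram_only; } ToyTxAttrs;

#define TOY_RAM_SIZE 0x1000u            /* RAM covers [0, TOY_RAM_SIZE) */

static uint8_t toy_ram[TOY_RAM_SIZE];
static unsigned toy_mmio_hits;          /* accesses that reached MMIO */

/* True when a 2-byte access at addr lies entirely inside RAM. */
static bool toy_in_ram(uint32_t addr)
{
    return addr <= TOY_RAM_SIZE - 2;
}

/* 16-bit little-endian load; anything outside RAM is treated as MMIO. */
static ToyTxResult toy_lduw_le(uint32_t addr, ToyTxAttrs attrs, uint16_t *val)
{
    assert(val != NULL);
    if (!toy_in_ram(addr)) {
        if (attrs.ram_only) {
            return TOY_TX_ERROR;        /* refused: would re-enter a device */
        }
        toy_mmio_hits++;
        *val = 0;
        return TOY_TX_OK;
    }
    *val = (uint16_t)(toy_ram[addr] | (toy_ram[addr + 1] << 8));
    return TOY_TX_OK;
}

/* 16-bit little-endian store with the same policy. */
static ToyTxResult toy_stw_le(uint32_t addr, ToyTxAttrs attrs, uint16_t val)
{
    if (!toy_in_ram(addr)) {
        if (attrs.ram_only) {
            return TOY_TX_ERROR;
        }
        toy_mmio_hits++;
        return TOY_TX_OK;
    }
    toy_ram[addr] = (uint8_t)(val & 0xff);
    toy_ram[addr + 1] = (uint8_t)(val >> 8);
    return TOY_TX_OK;
}

/*
 * Device step: read an index from guest memory, write back index + 1.
 * Returns false if the load failed, so the device never acts on garbage.
 */
static bool toy_device_step(uint32_t idx_addr, uint32_t used_addr)
{
    ToyTxAttrs attrs = { .ram_only = true };
    uint16_t idx;

    if (toy_lduw_le(idx_addr, attrs, &idx) != TOY_TX_OK) {
        return false;
    }
    /* A failed store is simply dropped, matching "can be ignored". */
    (void)toy_stw_le(used_addr, attrs, (uint16_t)(idx + 1));
    return true;
}
```

In this model toy_device_step(0x10, 0x20) succeeds, while
toy_device_step(0x2000, 0x20) bails out before using the loaded value.
Note that the protection only exists because this one call site passes
ram_only = true, which is the opt-in weakness raised above.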
On 220124 1630, Stefan Hajnoczi wrote:
> On Wed, Dec 15, 2021 at 07:24:18PM +0100, Philippe Mathieu-Daudé wrote:
> > This series aim to kill a recent class of bug, the infamous
> > "DMA reentrancy" issues found by Alexander while fuzzing.
>
> I took a look at how to protect DMA transactions in VIRTIO devices. It
> will require setting the MemTxAttrs for address_space_ld/st_le/be_cached
> calls. Errors on write (store) can be ignored. Errors on read (load) are
> a bit more questionable since the device performs some operation based
> on the loaded value, but at this point the driver has already caused the
> device to do something no correct driver does (as of today, it could
> change in the future...) so undefined device behavior might be okay.
>
> It would be easier to be confident if there was a single place to
> disable DMA re-entrancy for a device. The currently proposed API
> requires per-device code audits and fixes. It leaves decisions to the
> developer of each device. This will be a lot of work to fix and we
> cannot be confident that everything has been covered since this is an
> opt-in mechanism.
>
> For these reasons it seems likely that DMA re-entrancy issues will
> continue to creep in. I think the only way to rule out this class of
> bugs is to implement a centralized change that doesn't involve fixing
> every DMA access in QEMU.
>
> Thoughts?

Hi Stefan,
Do you have some ideas about how to do this centrally?
There were at least two attempts to do this in a centralized way, but it
seems there is some worry that edge cases will break. However, I'm
not sure there were any concrete examples of such breakages.

[1] https://lore.kernel.org/all/20210824120153.altqys6jjiuxh35p@sirius.home.kraxel.org/
[2] https://lore.kernel.org/all/20211217030858.834822-1-alxndr@bu.edu/
(AFAIK Neither handles the BH->DMA->MMIO case, at the moment)

-Alex

>
> Stefan
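As a rough illustration of what a centralized guard could look like, and
of one plausible reading of the BH->DMA->MMIO remark, here is a toy
model in C. It is not taken from either linked patch, and none of the
names (ToyDevice, in_mmio, toy_mmio_dispatch, toy_run_bh) exist in QEMU.
The assumption is that the guard flag is set only while the device's
MMIO handler runs, and that a bottom half runs later from the main loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: none of these names exist in QEMU. */
typedef struct ToyDevice ToyDevice;
typedef void (*ToyHandler)(ToyDevice *dev);

struct ToyDevice {
    bool in_mmio;          /* set while the device's MMIO handler runs */
    unsigned handled;      /* MMIO accesses that reached the handler */
    unsigned blocked;      /* MMIO accesses refused as re-entrant */
    ToyHandler on_write;   /* device model's register-write handler */
    ToyHandler pending_bh; /* deferred work, run later by the main loop */
};

/* Single central entry point for every MMIO access to a device. */
static bool toy_mmio_dispatch(ToyDevice *dev)
{
    assert(dev != NULL && dev->on_write != NULL);
    if (dev->in_mmio) {
        dev->blocked++;    /* re-entrant access: refuse it centrally */
        return false;
    }
    dev->in_mmio = true;
    dev->handled++;
    dev->on_write(dev);
    dev->in_mmio = false;
    return true;
}

/* DMA whose guest-chosen address happens to be the device's own MMIO. */
static void toy_dma_to_self(ToyDevice *dev)
{
    (void)toy_mmio_dispatch(dev);
}

/* Main loop: run deferred work outside of any MMIO handler. */
static void toy_run_bh(ToyDevice *dev)
{
    ToyHandler bh = dev->pending_bh;

    dev->pending_bh = NULL;
    if (bh != NULL) {
        bh(dev);
    }
}

/* Handler A: performs the DMA directly from the MMIO handler. */
static void toy_write_direct_dma(ToyDevice *dev)
{
    toy_dma_to_self(dev);
}

/* Handler B: only schedules a bottom half; the BH does the DMA later. */
static void toy_write_sched_bh(ToyDevice *dev)
{
    dev->pending_bh = toy_dma_to_self;
}
```

With handler A, the nested access is refused (handled == 1, blocked == 1)
without the device model doing anything itself. With handler B, the flag
is already clear by the time toy_run_bh() runs, so the DMA issued from
the bottom half reaches the handler again (handled == 2, blocked == 0).
Covering that path in this model would need the guard to be held while
the bottom half runs as well.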
On Mon, Jan 24, 2022 at 11:50:10AM -0500, Alexander Bulekov wrote:
> On 220124 1630, Stefan Hajnoczi wrote:
> > On Wed, Dec 15, 2021 at 07:24:18PM +0100, Philippe Mathieu-Daudé wrote:
> > > This series aim to kill a recent class of bug, the infamous
> > > "DMA reentrancy" issues found by Alexander while fuzzing.
> >
> > I took a look at how to protect DMA transactions in VIRTIO devices. It
> > will require setting the MemTxAttrs for address_space_ld/st_le/be_cached
> > calls. Errors on write (store) can be ignored. Errors on read (load) are
> > a bit more questionable since the device performs some operation based
> > on the loaded value, but at this point the driver has already caused the
> > device to do something no correct driver does (as of today, it could
> > change in the future...) so undefined device behavior might be okay.
> >
> > It would be easier to be confident if there was a single place to
> > disable DMA re-entrancy for a device. The currently proposed API
> > requires per-device code audits and fixes. It leaves decisions to the
> > developer of each device. This will be a lot of work to fix and we
> > cannot be confident that everything has been covered since this is an
> > opt-in mechanism.
> >
> > For these reasons it seems likely that DMA re-entrancy issues will
> > continue to creep in. I think the only way to rule out this class of
> > bugs is to implement a centralized change that doesn't involve fixing
> > every DMA access in QEMU.
> >
> > Thoughts?
>
> Hi Stefan,
> Do you have some ideas about how to do this centrally?
> There were at least two attempts to do this in a centralized way, but it
> seems there is some worry that edge cases will break. However, I'm
> not sure there were any concrete examples of such breakages.
>
> [1] https://lore.kernel.org/all/20210824120153.altqys6jjiuxh35p@sirius.home.kraxel.org/
> [2] https://lore.kernel.org/all/20211217030858.834822-1-alxndr@bu.edu/
> (AFAIK Neither handles the BH->DMA->MMIO case, at the moment)

Regressions are the problem with defaulting to RAM-only DMA. There's no
way to avoid the risk if we change the default. On the other hand, it's
the only way to squash this class of bugs - most existing devices just
aren't written to cope with DMA re-entrancy.

The approach in your patch sounds good to me, but I haven't followed the
discussions so maybe there were valid reasons to look for alternatives.

Stefan
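The trade-off of a RAM-only default can be sketched with a toy DMA
helper. This is an illustration under stated assumptions, not a
description of any posted patch: ToyDmaDevice, allow_mmio_dma and
toy_dma_write are hypothetical names, and every address outside the RAM
array is assumed to hit a single register of some other device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: none of these names exist in QEMU. */
#define TOY_DMA_RAM_SIZE 0x1000u   /* RAM covers [0, TOY_DMA_RAM_SIZE) */

typedef struct {
    bool allow_mmio_dma;   /* opt-out for devices that must DMA to MMIO */
} ToyDmaDevice;

static uint8_t toy_dma_ram[TOY_DMA_RAM_SIZE];
static uint8_t toy_peer_reg;       /* stand-in for another device's register */

/*
 * Central DMA write helper. RAM is always writable. Non-RAM targets are
 * refused unless the issuing device has explicitly opted out.
 */
static bool toy_dma_write(const ToyDmaDevice *dev, uint32_t addr, uint8_t val)
{
    assert(dev != NULL);
    if (addr < TOY_DMA_RAM_SIZE) {
        toy_dma_ram[addr] = val;
        return true;
    }
    if (!dev->allow_mmio_dma) {
        return false;              /* default policy: RAM only */
    }
    toy_peer_reg = val;            /* device-to-device DMA still works */
    return true;
}
```

A device left at the default can no longer reach MMIO through this
helper, which removes the re-entrancy path in one place. A device that
legitimately relied on DMA to MMIO would break until someone sets
allow_mmio_dma for it, which is the kind of regression risk described
above.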