Message ID: 20101101151415.3927.87944.stgit@s20.home
State: New
On 11/01/2010 10:14 AM, Alex Williamson wrote: > Register the actual VM RAM using the new API > > Signed-off-by: Alex Williamson<alex.williamson@redhat.com> > --- > > hw/pc.c | 12 ++++++------ > 1 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/hw/pc.c b/hw/pc.c > index 69b13bf..0ea6d10 100644 > --- a/hw/pc.c > +++ b/hw/pc.c > @@ -912,14 +912,14 @@ void pc_memory_init(ram_addr_t ram_size, > /* allocate RAM */ > ram_addr = qemu_ram_alloc(NULL, "pc.ram", > below_4g_mem_size + above_4g_mem_size); > - cpu_register_physical_memory(0, 0xa0000, ram_addr); > - cpu_register_physical_memory(0x100000, > - below_4g_mem_size - 0x100000, > - ram_addr + 0x100000); > + > + qemu_ram_register(0, 0xa0000, ram_addr); > + qemu_ram_register(0x100000, below_4g_mem_size - 0x100000, > + ram_addr + 0x100000); > #if TARGET_PHYS_ADDR_BITS> 32 > if (above_4g_mem_size> 0) { > - cpu_register_physical_memory(0x100000000ULL, above_4g_mem_size, > - ram_addr + below_4g_mem_size); > + qemu_ram_register(0x100000000ULL, above_4g_mem_size, > + ram_addr + below_4g_mem_size); > } > Take a look at the memory shadowing in the i440fx. The regions of memory in the BIOS area can temporarily become RAM. That's because there is normally RAM backing this space but the memory controller redirects writes to the ROM space. Not sure the best way to handle this, but the basic concept is, RAM always exists but if a device tries to access it, it may or may not be accessible as RAM at any given point in time. Regards, Anthony Liguori > #endif > > > > >
On Tue, 2010-11-16 at 08:58 -0600, Anthony Liguori wrote: > On 11/01/2010 10:14 AM, Alex Williamson wrote: > > Register the actual VM RAM using the new API > > > > Signed-off-by: Alex Williamson<alex.williamson@redhat.com> > > --- > > > > hw/pc.c | 12 ++++++------ > > 1 files changed, 6 insertions(+), 6 deletions(-) > > > > diff --git a/hw/pc.c b/hw/pc.c > > index 69b13bf..0ea6d10 100644 > > --- a/hw/pc.c > > +++ b/hw/pc.c > > @@ -912,14 +912,14 @@ void pc_memory_init(ram_addr_t ram_size, > > /* allocate RAM */ > > ram_addr = qemu_ram_alloc(NULL, "pc.ram", > > below_4g_mem_size + above_4g_mem_size); > > - cpu_register_physical_memory(0, 0xa0000, ram_addr); > > - cpu_register_physical_memory(0x100000, > > - below_4g_mem_size - 0x100000, > > - ram_addr + 0x100000); > > + > > + qemu_ram_register(0, 0xa0000, ram_addr); > > + qemu_ram_register(0x100000, below_4g_mem_size - 0x100000, > > + ram_addr + 0x100000); > > #if TARGET_PHYS_ADDR_BITS> 32 > > if (above_4g_mem_size> 0) { > > - cpu_register_physical_memory(0x100000000ULL, above_4g_mem_size, > > - ram_addr + below_4g_mem_size); > > + qemu_ram_register(0x100000000ULL, above_4g_mem_size, > > + ram_addr + below_4g_mem_size); > > } > > > > Take a look at the memory shadowing in the i440fx. The regions of > memory in the BIOS area can temporarily become RAM. > > That's because there is normally RAM backing this space but the memory > controller redirects writes to the ROM space. > > Not sure the best way to handle this, but the basic concept is, RAM > always exists but if a device tries to access it, it may or may not be > accessible as RAM at any given point in time. Gack. For the benefit of those that want to join the fun without digging up the spec, these magic flippable segments the i440fx can toggle are 12 fixed 16k segments from 0xc0000 to 0xeffff and a single 64k segment from 0xf0000 to 0xfffff. 
There are read-enable and write-enable bits for each, so the chipset can be configured to read from the bios and write to memory (to setup BIOS-RAM caching), and read from memory and write to the bios (to enable BIOS-RAM caching). The other bit combinations are also available. For my purpose in using this to program the IOMMU with guest physical to host virtual addresses for device assignment, it doesn't really matter since there should never be a DMA in this range of memory. But for a general RAM API, I'm not sure either. I'm tempted to say that while this is in fact a use of RAM, the RAM is never presented to the guest as usable system memory (E820_RAM for x86), and should therefore be excluded from the RAM API if we're using it only to track regions that are actual guest usable physical memory. We had talked on irc that pc.c should be registering 0x0 to below_4g_mem_size as ram, but now I tend to disagree with that. The memory backing 0xa0000-0x100000 is present, but it's not presented to the guest as usable RAM. What's your strict definition of what the RAM API includes? Is it only what the guest could consider usable RAM or does it also include quirky chipset accelerator features like this (everything with a guest physical address)? Thanks, Alex
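For readers following along, the read-enable/write-enable scheme Alex describes can be sketched in C. The segment layout is from the thread (twelve 16k segments at 0xc0000-0xeffff plus one 64k segment at 0xf0000-0xfffff); the PAM0-PAM6 register offsets and nibble layout are taken from the 82441FX datasheet, and the function names are invented for illustration, so treat this as a sketch rather than QEMU code:

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

/* Sketch of i440fx PAM attribute decoding (PAM0-PAM6 live at config
 * offsets 0x59-0x5f).  Each nibble holds RE (bit 0) and WE (bit 1) for
 * one segment: PAM0's high nibble covers the 64k 0xf0000-0xfffff range,
 * PAM1-PAM6 cover the twelve 16k segments 0xc0000-0xeffff. */

#define PAM_RE 1   /* reads go to RAM  */
#define PAM_WE 2   /* writes go to RAM */

/* Return the 2-bit PAM attribute controlling a BIOS-area address, or -1
 * if the address is outside 0xc0000-0xfffff.  pam[] holds PAM0..PAM6. */
static int pam_attr(const uint8_t pam[7], uint32_t addr)
{
    if (addr >= 0xf0000 && addr <= 0xfffff) {
        return (pam[0] >> 4) & 3;               /* single 64k segment */
    }
    if (addr >= 0xc0000 && addr <= 0xeffff) {
        int seg = (addr - 0xc0000) / 0x4000;    /* 16k segment 0..11 */
        uint8_t reg = pam[1 + seg / 2];         /* two segments per reg */
        return (seg & 1) ? (reg >> 4) & 3 : reg & 3;
    }
    return -1;
}

static bool pam_read_is_ram(const uint8_t pam[7], uint32_t addr)
{
    int a = pam_attr(pam, addr);
    return a >= 0 && (a & PAM_RE);
}

static bool pam_write_is_ram(const uint8_t pam[7], uint32_t addr)
{
    int a = pam_attr(pam, addr);
    return a >= 0 && (a & PAM_WE);
}
```

The "read from the bios, write to memory" configuration Alex mentions is simply the WE-only attribute; RE-only is the opposite, and RE|WE makes the segment plain RAM.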
On Tue, Nov 16, 2010 at 02:24:06PM -0700, Alex Williamson wrote: > On Tue, 2010-11-16 at 08:58 -0600, Anthony Liguori wrote: > > On 11/01/2010 10:14 AM, Alex Williamson wrote: > > > Register the actual VM RAM using the new API > > > > > > Signed-off-by: Alex Williamson<alex.williamson@redhat.com> > > > --- > > > > > > hw/pc.c | 12 ++++++------ > > > 1 files changed, 6 insertions(+), 6 deletions(-) > > > > > > diff --git a/hw/pc.c b/hw/pc.c > > > index 69b13bf..0ea6d10 100644 > > > --- a/hw/pc.c > > > +++ b/hw/pc.c > > > @@ -912,14 +912,14 @@ void pc_memory_init(ram_addr_t ram_size, > > > /* allocate RAM */ > > > ram_addr = qemu_ram_alloc(NULL, "pc.ram", > > > below_4g_mem_size + above_4g_mem_size); > > > - cpu_register_physical_memory(0, 0xa0000, ram_addr); > > > - cpu_register_physical_memory(0x100000, > > > - below_4g_mem_size - 0x100000, > > > - ram_addr + 0x100000); > > > + > > > + qemu_ram_register(0, 0xa0000, ram_addr); > > > + qemu_ram_register(0x100000, below_4g_mem_size - 0x100000, > > > + ram_addr + 0x100000); > > > #if TARGET_PHYS_ADDR_BITS> 32 > > > if (above_4g_mem_size> 0) { > > > - cpu_register_physical_memory(0x100000000ULL, above_4g_mem_size, > > > - ram_addr + below_4g_mem_size); > > > + qemu_ram_register(0x100000000ULL, above_4g_mem_size, > > > + ram_addr + below_4g_mem_size); > > > } > > > > > > > Take a look at the memory shadowing in the i440fx. The regions of > > memory in the BIOS area can temporarily become RAM. > > > > That's because there is normally RAM backing this space but the memory > > controller redirects writes to the ROM space. > > > > Not sure the best way to handle this, but the basic concept is, RAM > > always exists but if a device tries to access it, it may or may not be > > accessible as RAM at any given point in time. > > Gack. 
> For the benefit of those that want to join the fun without digging up
> the spec, these magic flippable segments the i440fx can toggle are 12
> fixed 16k segments from 0xc0000 to 0xeffff and a single 64k segment
> from 0xf0000 to 0xfffff. There are read-enable and write-enable bits
> for each, so the chipset can be configured to read from the bios and
> write to memory (to setup BIOS-RAM caching), and read from memory and
> write to the bios (to enable BIOS-RAM caching). The other bit
> combinations are also available.

There is also 0xa0000-0xbffff, which is usually part of the framebuffer, but the chipset can be configured to access this memory as RAM when the CPU is in SMM mode.

> For my purpose in using this to program the IOMMU with guest physical
> to host virtual addresses for device assignment, it doesn't really
> matter since there should never be a DMA in this range of memory.

IIRC the spec defines for each range of memory whether it is accessible from the PCI bus.

> But for a general RAM API, I'm not sure either. I'm tempted to say
> that while this is in fact a use of RAM, the RAM is never presented to
> the guest as usable system memory (E820_RAM for x86), and should
> therefore be excluded from the RAM API if we're using it only to track
> regions that are actual guest usable physical memory.

A guest is not only the OS (like Windows or Linux); the BIOS code is also part of the guest, and it can access all of this memory.

> We had talked on irc that pc.c should be registering 0x0 to
> below_4g_mem_size as ram, but now I tend to disagree with that. The
> memory backing 0xa0000-0x100000 is present, but it's not presented to
> the guest as usable RAM.

It is, during SMM, if the BIOS configured the chipset to do so.

> What's your strict definition of what the RAM API includes? Is it only
> what the guest could consider usable RAM or does it also include
> quirky chipset accelerator features like this (everything with a guest
> physical address)? Thanks,

--
			Gleb.
On 11/16/2010 03:24 PM, Alex Williamson wrote: > On Tue, 2010-11-16 at 08:58 -0600, Anthony Liguori wrote: > >> On 11/01/2010 10:14 AM, Alex Williamson wrote: >> >>> Register the actual VM RAM using the new API >>> >>> Signed-off-by: Alex Williamson<alex.williamson@redhat.com> >>> --- >>> >>> hw/pc.c | 12 ++++++------ >>> 1 files changed, 6 insertions(+), 6 deletions(-) >>> >>> diff --git a/hw/pc.c b/hw/pc.c >>> index 69b13bf..0ea6d10 100644 >>> --- a/hw/pc.c >>> +++ b/hw/pc.c >>> @@ -912,14 +912,14 @@ void pc_memory_init(ram_addr_t ram_size, >>> /* allocate RAM */ >>> ram_addr = qemu_ram_alloc(NULL, "pc.ram", >>> below_4g_mem_size + above_4g_mem_size); >>> - cpu_register_physical_memory(0, 0xa0000, ram_addr); >>> - cpu_register_physical_memory(0x100000, >>> - below_4g_mem_size - 0x100000, >>> - ram_addr + 0x100000); >>> + >>> + qemu_ram_register(0, 0xa0000, ram_addr); >>> + qemu_ram_register(0x100000, below_4g_mem_size - 0x100000, >>> + ram_addr + 0x100000); >>> #if TARGET_PHYS_ADDR_BITS> 32 >>> if (above_4g_mem_size> 0) { >>> - cpu_register_physical_memory(0x100000000ULL, above_4g_mem_size, >>> - ram_addr + below_4g_mem_size); >>> + qemu_ram_register(0x100000000ULL, above_4g_mem_size, >>> + ram_addr + below_4g_mem_size); >>> } >>> >>> >> Take a look at the memory shadowing in the i440fx. The regions of >> memory in the BIOS area can temporarily become RAM. >> >> That's because there is normally RAM backing this space but the memory >> controller redirects writes to the ROM space. >> >> Not sure the best way to handle this, but the basic concept is, RAM >> always exists but if a device tries to access it, it may or may not be >> accessible as RAM at any given point in time. >> > Gack. For the benefit of those that want to join the fun without > digging up the spec, these magic flippable segments the i440fx can > toggle are 12 fixed 16k segments from 0xc0000 to 0xeffff and a single > 64k segment from 0xf0000 to 0xfffff. 
> There are read-enable and write-enable bits for each, so the chipset
> can be configured to read from the bios and write to memory (to setup
> BIOS-RAM caching), and read from memory and write to the bios (to
> enable BIOS-RAM caching). The other bit combinations are also
> available.

Yup. As Gleb mentions, there's the SMRAM register which controls whether 0xa0000 is mapped to PCI or whether it's mapped to RAM (but KVM explicitly disabled SMM support).

> For my purpose in using this to program the IOMMU with guest physical
> to host virtual addresses for device assignment, it doesn't really
> matter since there should never be a DMA in this range of memory. But
> for a general RAM API, I'm not sure either. I'm tempted to say that
> while this is in fact a use of RAM, the RAM is never presented to the
> guest as usable system memory (E820_RAM for x86), and should therefore
> be excluded from the RAM API if we're using it only to track regions
> that are actual guest usable physical memory.
>
> We had talked on irc that pc.c should be registering 0x0 to
> below_4g_mem_size as ram, but now I tend to disagree with that. The
> memory backing 0xa0000-0x100000 is present, but it's not presented to
> the guest as usable RAM. What's your strict definition of what the RAM
> API includes? Is it only what the guest could consider usable RAM or
> does it also include quirky chipset accelerator features like this
> (everything with a guest physical address)? Thanks,

Today we model one flat space that's a mix of device memory, RAM, and ROM. This is not how machines work, and the limitations of this model are holding us back.

IRL, there's a block of RAM that's connected to a memory controller. The CPU is also connected to the memory controller. Devices are connected to another controller which is in turn connected to the memory controller. There may, in fact, be more than one controller between a device and the memory controller.
A controller may change the way a device sees memory in arbitrary ways. In fact, two controllers accessing the same page might see something totally different.

The idea behind the RAM API is to begin to establish this hierarchy. RAM is not what any particular device sees--it's actual RAM. IOW, the RAM API should represent what address mapping I would get if I talked directly to DIMMs.

This is not what RamBlock is, even though the name would suggest otherwise. RamBlocks are anything that qemu represents as cache-consistent, directly accessible memory. Device ROMs and areas of device RAM are all allocated from the RamBlock space.

So the very first task of a RAM API is to simply differentiate these two things. Once we have the base RAM API, we can start adding the proper APIs that sit on top of it (like a PCI memory API).

Regards,

Anthony Liguori

> Alex
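A minimal sketch of the controller hierarchy Anthony describes: bridges (controllers) translate an address and forward it to their children, while leaves are actual RAM or device MMIO. All names here are invented for illustration; this is not a proposed QEMU interface.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

typedef enum { NODE_RAM, NODE_MMIO, NODE_BRIDGE } node_kind;

typedef struct mem_node {
    node_kind kind;
    uint64_t base, size;          /* window in the parent's address space */
    int64_t offset;               /* translation a bridge applies */
    struct mem_node *children;    /* first child (bridges only) */
    struct mem_node *next;        /* next sibling */
} mem_node;

/* Walk the tree from a root bridge, translating the address at each
 * level, and return the leaf that finally claims it (or NULL).  Two
 * different root bridges over the same leaves can legitimately resolve
 * the same address to different things, which is Anthony's point. */
static mem_node *mem_resolve(mem_node *n, uint64_t addr)
{
    if (addr < n->base || addr >= n->base + n->size) {
        return NULL;
    }
    if (n->kind != NODE_BRIDGE) {
        return n;
    }
    uint64_t xlat = addr + n->offset;   /* bridge may shift the address */
    for (mem_node *c = n->children; c; c = c->next) {
        mem_node *hit = mem_resolve(c, xlat);
        if (hit) {
            return hit;
        }
    }
    return NULL;
}
```

In this model "the RAM API" is just the NODE_RAM leaves; what any particular device sees is the result of resolving through whatever chain of bridges sits above it.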
On 11/18/2010 01:42 AM, Anthony Liguori wrote:
>> Gack. For the benefit of those that want to join the fun without
>> digging up the spec, these magic flippable segments the i440fx can
>> toggle are 12 fixed 16k segments from 0xc0000 to 0xeffff and a single
>> 64k segment from 0xf0000 to 0xfffff. There are read-enable and
>> write-enable bits for each, so the chipset can be configured to read
>> from the bios and write to memory (to setup BIOS-RAM caching), and read
>> from memory and write to the bios (to enable BIOS-RAM caching). The
>> other bit combinations are also available.
>
> Yup. As Gleb mentions, there's the SDRAM register which controls
> whether 0xa0000 is mapped to PCI or whether it's mapped to RAM (but
> KVM explicitly disabled SMM support).

KVM not supporting SMM is a bug (albeit one that is likely to remain unresolved for a while). Let's pretend that kvm smm support is not an issue.

IIUC, SMM means that there are two memory maps when the cpu accesses memory, one for SMM, one for non-SMM.

>> For my purpose in using this to program the IOMMU with guest physical to
>> host virtual addresses for device assignment, it doesn't really matter
>> since there should never be a DMA in this range of memory. But for a
>> general RAM API, I'm not sure either. I'm tempted to say that while
>> this is in fact a use of RAM, the RAM is never presented to the guest as
>> usable system memory (E820_RAM for x86), and should therefore be
>> excluded from the RAM API if we're using it only to track regions that
>> are actual guest usable physical memory.
>>
>> We had talked on irc that pc.c should be registering 0x0 to
>> below_4g_mem_size as ram, but now I tend to disagree with that. The
>> memory backing 0xa0000-0x100000 is present, but it's not presented to
>> the guest as usable RAM. What's your strict definition of what the RAM
>> API includes?
Is it only what the guest could consider usable RAM or >> does it also include quirky chipset accelerator features like this >> (everything with a guest physical address)? Thanks, > > Today we model on flat space that's a mixed of device memory, RAM, or > ROM. This is not how machines work and the limitations of this model > is holding us back. > > IRL, there's a block of RAM that's connected to a memory controller. > The CPU is also connected to the memory controller. Devices are > connected to another controller which is in turn connected to the > memory controller. There may, in fact, be more than one controller > between a device and the memory controller. > > A controller may change the way a device sees memory in arbitrary > ways. In fact, two controllers accessing the same page might see > something totally different. > > The idea behind the RAM API is to begin to establish this hierarchy. > RAM is not what any particular device sees--it's actual RAM. IOW, the > RAM API should represent what address mapping I would get if I talked > directly to DIMMs. > > This is not what RamBlock is even though the name would suggest > otherwise. RamBlocks are anything that qemu represents as cache > consistency directly accessable memory. Device ROMs and areas of > device RAM are all allocated from the RamBlock space. > > So the very first task of a RAM API is to simplify differentiate these > two things. Once we have the base RAM API, we can start adding the > proper APIs that sit on top of it (like a PCI memory API). Things aren't that bad - a ram_addr_t and a physical address are already different things, so we already have one level of translation.
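Avi's "one level of translation" can be illustrated with a toy lookup table mirroring the qemu_ram_register(start, size, ram_addr) calls in the patch under discussion. The struct and function names are invented; this only shows the mapping, not QEMU's actual implementation.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

typedef struct {
    uint64_t phys;      /* guest physical base */
    uint64_t size;
    uint64_t ram_addr;  /* offset into the allocated RAM block */
} phys_map;

/* Translate a guest physical address to a ram_addr_t-style offset.
 * Returns (uint64_t)-1 for unmapped addresses, e.g. the VGA/BIOS hole
 * at 0xa0000-0xfffff that pc.c deliberately skips. */
static uint64_t phys_to_ram(const phys_map *map, size_t n, uint64_t phys)
{
    for (size_t i = 0; i < n; i++) {
        if (phys >= map[i].phys && phys - map[i].phys < map[i].size) {
            return map[i].ram_addr + (phys - map[i].phys);
        }
    }
    return (uint64_t)-1;
}
```

With the three pc.c registrations (low RAM, RAM above 1MB, RAM above 4G) in the table, the hole below 1MB simply has no entry, which is exactly the property the thread is debating: the backing RAM exists, but this map does not expose it.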
On 11/18/2010 09:22 AM, Avi Kivity wrote: > On 11/18/2010 01:42 AM, Anthony Liguori wrote: >>> Gack. For the benefit of those that want to join the fun without >>> digging up the spec, these magic flippable segments the i440fx can >>> toggle are 12 fixed 16k segments from 0xc0000 to 0xeffff and a single >>> 64k segment from 0xf0000 to 0xfffff. There are read-enable and >>> write-enable bits for each, so the chipset can be configured to read >>> from the bios and write to memory (to setup BIOS-RAM caching), and read >>> from memory and write to the bios (to enable BIOS-RAM caching). The >>> other bit combinations are also available. >> >> Yup. As Gleb mentions, there's the SDRAM register which controls >> whether 0xa0000 is mapped to PCI or whether it's mapped to RAM (but >> KVM explicitly disabled SMM support). > > KVM not supporting SMM is a bug (albeit one that is likely to remain > unresolved for a while). Let's pretend that kvm smm support is not an > issue. > > IIUC, SMM means that there two memory maps when the cpu accesses > memory, one for SMM, one for non-SMM. No. That's not what it means. With the i440fx, when the CPU accesses 0xa0000, it gets forwarded to the PCI bus no different than an access to 0xe0000. If the CPU asserts the EXF4#/Ab7# signal, then the i440fx directs CPU accesses to 0xa0000 to RAM instead of the PCI bus. Alternatively, if the SMRAM register is activated, then the i440fx will redirect 0xa0000 to RAM regardless of whether the CPU asserts that signal. That means that even without KVM supporting SMM, this mode can happen. In general, the memory controller can redirect IO accesses to RAM or to the PCI bus. The PCI bus may redirect the access to the ISA bus. >>> For my purpose in using this to program the IOMMU with guest >>> physical to >>> host virtual addresses for device assignment, it doesn't really matter >>> since there should never be a DMA in this range of memory. But for a >>> general RAM API, I'm not sure either. 
I'm tempted to say that while >>> this is in fact a use of RAM, the RAM is never presented to the >>> guest as >>> usable system memory (E820_RAM for x86), and should therefore be >>> excluded from the RAM API if we're using it only to track regions that >>> are actual guest usable physical memory. >>> >>> We had talked on irc that pc.c should be registering 0x0 to >>> below_4g_mem_size as ram, but now I tend to disagree with that. The >>> memory backing 0xa0000-0x100000 is present, but it's not presented to >>> the guest as usable RAM. What's your strict definition of what the RAM >>> API includes? Is it only what the guest could consider usable RAM or >>> does it also include quirky chipset accelerator features like this >>> (everything with a guest physical address)? Thanks, >> >> Today we model on flat space that's a mixed of device memory, RAM, or >> ROM. This is not how machines work and the limitations of this model >> is holding us back. >> >> IRL, there's a block of RAM that's connected to a memory controller. >> The CPU is also connected to the memory controller. Devices are >> connected to another controller which is in turn connected to the >> memory controller. There may, in fact, be more than one controller >> between a device and the memory controller. >> >> A controller may change the way a device sees memory in arbitrary >> ways. In fact, two controllers accessing the same page might see >> something totally different. >> >> The idea behind the RAM API is to begin to establish this hierarchy. >> RAM is not what any particular device sees--it's actual RAM. IOW, >> the RAM API should represent what address mapping I would get if I >> talked directly to DIMMs. >> >> This is not what RamBlock is even though the name would suggest >> otherwise. RamBlocks are anything that qemu represents as cache >> consistency directly accessable memory. Device ROMs and areas of >> device RAM are all allocated from the RamBlock space. 
>> >> So the very first task of a RAM API is to simplify differentiate >> these two things. Once we have the base RAM API, we can start adding >> the proper APIs that sit on top of it (like a PCI memory API). > > Things aren't that bad - a ram_addr_t and a physical address are > already different things, so we already have one level of translation. Yeah, but ram_addr_t doesn't model anything meaningful IRL. It's an internal implementation detail. Regards, Anthony Liguori
On Wed, Nov 17, 2010 at 05:42:28PM -0600, Anthony Liguori wrote: > >For my purpose in using this to program the IOMMU with guest physical to > >host virtual addresses for device assignment, it doesn't really matter > >since there should never be a DMA in this range of memory. But for a > >general RAM API, I'm not sure either. I'm tempted to say that while > >this is in fact a use of RAM, the RAM is never presented to the guest as > >usable system memory (E820_RAM for x86), and should therefore be > >excluded from the RAM API if we're using it only to track regions that > >are actual guest usable physical memory. > > > >We had talked on irc that pc.c should be registering 0x0 to > >below_4g_mem_size as ram, but now I tend to disagree with that. The > >memory backing 0xa0000-0x100000 is present, but it's not presented to > >the guest as usable RAM. What's your strict definition of what the RAM > >API includes? Is it only what the guest could consider usable RAM or > >does it also include quirky chipset accelerator features like this > >(everything with a guest physical address)? Thanks, > > Today we model on flat space that's a mixed of device memory, RAM, > or ROM. This is not how machines work and the limitations of this > model is holding us back. > > IRL, there's a block of RAM that's connected to a memory controller. > The CPU is also connected to the memory controller. Devices are > connected to another controller which is in turn connected to the > memory controller. There may, in fact, be more than one controller > between a device and the memory controller. > > A controller may change the way a device sees memory in arbitrary > ways. In fact, two controllers accessing the same page might see > something totally different. > > The idea behind the RAM API is to begin to establish this hierarchy. > RAM is not what any particular device sees--it's actual RAM. IOW, > the RAM API should represent what address mapping I would get if I > talked directly to DIMMs. 
> This is not what RamBlock is even though the name would suggest
> otherwise. RamBlocks are anything that qemu represents as cache
> consistency directly accessable memory. Device ROMs and areas of
> device RAM are all allocated from the RamBlock space.
>
> So the very first task of a RAM API is to simplify differentiate
> these two things. Once we have the base RAM API, we can start
> adding the proper APIs that sit on top of it (like a PCI memory
> API).

+1 for all above. What happens when a device accesses some address is completely different from what happens when the CPU accesses the same address (or even another device on another bus). For instance, the way MSI is implemented now, the CPU can send an MSI by writing to the 0xfee00000 memory range. I do not think you can do that on real HW.

--
			Gleb.
On 11/18/2010 05:46 PM, Anthony Liguori wrote: > On 11/18/2010 09:22 AM, Avi Kivity wrote: >> On 11/18/2010 01:42 AM, Anthony Liguori wrote: >>>> Gack. For the benefit of those that want to join the fun without >>>> digging up the spec, these magic flippable segments the i440fx can >>>> toggle are 12 fixed 16k segments from 0xc0000 to 0xeffff and a single >>>> 64k segment from 0xf0000 to 0xfffff. There are read-enable and >>>> write-enable bits for each, so the chipset can be configured to read >>>> from the bios and write to memory (to setup BIOS-RAM caching), and >>>> read >>>> from memory and write to the bios (to enable BIOS-RAM caching). The >>>> other bit combinations are also available. >>> >>> Yup. As Gleb mentions, there's the SDRAM register which controls >>> whether 0xa0000 is mapped to PCI or whether it's mapped to RAM (but >>> KVM explicitly disabled SMM support). >> >> KVM not supporting SMM is a bug (albeit one that is likely to remain >> unresolved for a while). Let's pretend that kvm smm support is not >> an issue. >> >> IIUC, SMM means that there two memory maps when the cpu accesses >> memory, one for SMM, one for non-SMM. > > No. That's not what it means. With the i440fx, when the CPU accesses > 0xa0000, it gets forwarded to the PCI bus no different than an access > to 0xe0000. > > If the CPU asserts the EXF4#/Ab7# signal, then the i440fx directs CPU > accesses to 0xa0000 to RAM instead of the PCI bus. That's what "two memory maps" mean. If you have one cpu in SMM and another outside SMM, then those two maps are active simultaneously. > > Alternatively, if the SMRAM register is activated, then the i440fx > will redirect 0xa0000 to RAM regardless of whether the CPU asserts > that signal. That means that even without KVM supporting SMM, this > mode can happen. That's a single memory map that is modified under hardware control, it's no different than BARs and such. 
>> Things aren't that bad - a ram_addr_t and a physical address are
>> already different things, so we already have one level of translation.
>
> Yeah, but ram_addr_t doesn't model anything meaningful IRL. It's an
> internal implementation detail.

Does it matter? We can say those are addresses on the memory bus. Since they are not observable anyway, who cares if they correspond with reality or not?
On 11/18/2010 09:57 AM, Avi Kivity wrote: > On 11/18/2010 05:46 PM, Anthony Liguori wrote: >> On 11/18/2010 09:22 AM, Avi Kivity wrote: >>> On 11/18/2010 01:42 AM, Anthony Liguori wrote: >>>>> Gack. For the benefit of those that want to join the fun without >>>>> digging up the spec, these magic flippable segments the i440fx can >>>>> toggle are 12 fixed 16k segments from 0xc0000 to 0xeffff and a single >>>>> 64k segment from 0xf0000 to 0xfffff. There are read-enable and >>>>> write-enable bits for each, so the chipset can be configured to read >>>>> from the bios and write to memory (to setup BIOS-RAM caching), and >>>>> read >>>>> from memory and write to the bios (to enable BIOS-RAM caching). The >>>>> other bit combinations are also available. >>>> >>>> Yup. As Gleb mentions, there's the SDRAM register which controls >>>> whether 0xa0000 is mapped to PCI or whether it's mapped to RAM (but >>>> KVM explicitly disabled SMM support). >>> >>> KVM not supporting SMM is a bug (albeit one that is likely to remain >>> unresolved for a while). Let's pretend that kvm smm support is not >>> an issue. >>> >>> IIUC, SMM means that there two memory maps when the cpu accesses >>> memory, one for SMM, one for non-SMM. >> >> No. That's not what it means. With the i440fx, when the CPU >> accesses 0xa0000, it gets forwarded to the PCI bus no different than >> an access to 0xe0000. >> >> If the CPU asserts the EXF4#/Ab7# signal, then the i440fx directs CPU >> accesses to 0xa0000 to RAM instead of the PCI bus. > > That's what "two memory maps" mean. If you have one cpu in SMM and > another outside SMM, then those two maps are active simultaneously. I'm not sure if more modern memory controllers do special things here, but for the i440fx, if any CPU asserts SMM mode, then any memory access to that space is going to access SMRAM. 
>> Alternatively, if the SMRAM register is activated, then the i440fx
>> will redirect 0xa0000 to RAM regardless of whether the CPU asserts
>> that signal. That means that even without KVM supporting SMM, this
>> mode can happen.
>
> That's a single memory map that is modified under hardware control,
> it's no different than BARs and such.

There is a single block of RAM. The memory controller may either forward an address unmodified to the RAM block or it may forward the address to the PCI bus[1]. A non-CPU access goes through a controller hierarchy and may be modified while it traverses the hierarchy.

So really, we should have a big chunk of RAM that we associate with a guest, with a list of intercepts that changes as the devices are modified. Instead of having that list dispatch directly to a device, we should send all intercepted accesses to the memory controller and let the memory controller propagate out the access to the appropriate device.

[1] The exception is access to the local APIC. That's handled directly by the CPU (or immediately outside of the CPU before the access gets to the memory controller, if the local APIC is external to the CPU).

>>> Things aren't that bad - a ram_addr_t and a physical address are
>>> already different things, so we already have one level of translation.
>>
>> Yeah, but ram_addr_t doesn't model anything meaningful IRL. It's an
>> internal implementation detail.
>
> Does it matter? We can say those are addresses on the memory bus.
> Since they are not observable anyway, who cares if they correspond with
> reality or not?

It matters a lot because the life cycle of RAM is different from the life cycle of ROM.

For instance, the original goal was to madvise(MADV_DONTNEED) RAM on reboot. You can't do that to ROM because the contents matter.

But for PV devices, we can be loose in how we define the way the devices interact with the rest of the system.
For instance, we can say that virtio-pci devices are directly connected to RAM and do not go through the memory controllers. That means we could get stable mappings of the virtio ring. Regards, Anthony Liguori
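Anthony's "big chunk of RAM plus a list of intercepts" idea can be sketched as follows. The names are invented and the dispatch is deliberately simplified (sorted, non-overlapping ranges; a string stands in for the device callback the memory controller would invoke):

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

typedef struct {
    uint64_t base, len;
    const char *handler;      /* stand-in for a device/controller callback */
} intercept;

/* Return the handler covering addr, or NULL meaning "plain RAM".
 * Entries must be sorted by base and non-overlapping, so a binary
 * search suffices; everything not intercepted falls through to the
 * single backing RAM block. */
static const char *dispatch(const intercept *tbl, size_t n, uint64_t addr)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (addr < tbl[mid].base) {
            hi = mid;
        } else if (addr >= tbl[mid].base + tbl[mid].len) {
            lo = mid + 1;
        } else {
            return tbl[mid].handler;
        }
    }
    return NULL;              /* not intercepted: direct RAM access */
}
```

The interesting design point is what the intercept routes to: per Anthony, not the device itself but the memory controller, which then propagates the access down its hierarchy (so chipset tricks like PAM and SMRAM stay in one place).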
On 11/18/2010 06:09 PM, Anthony Liguori wrote: >> That's what "two memory maps" mean. If you have one cpu in SMM and >> another outside SMM, then those two maps are active simultaneously. > > > I'm not sure if more modern memory controllers do special things here, > but for the i440fx, if any CPU asserts SMM mode, then any memory > access to that space is going to access SMRAM. How does SMP work then? > SMM Space Open (DOPEN). When DOPEN=1 and DLCK=0, SMM space DRAM is > made visible even > when CPU cycle does not indicate SMM mode access via EXF4#/Ab7# > signal. This is intended to help > BIOS initialize SMM space. Software should ensure that DOPEN=1 is > mutually exclusive with DCLS=1. > When DLCK is set to a 1, DOPEN is set to 0 and becomes read only. The words "cpu cycle does not indicate SMM mode" seem to say that SMM accesses are made on a per-transaction basis, or so my lawyers tell me. > >>> >>> Alternatively, if the SMRAM register is activated, then the i440fx >>> will redirect 0xa0000 to RAM regardless of whether the CPU asserts >>> that signal. That means that even without KVM supporting SMM, this >>> mode can happen. >> >> That's a single memory map that is modified under hardware control, >> it's no different than BARs and such. > > There is a single block of RAM. > > The memory controller may either forward an address unmodified to the > RAM block or it may forward the address to the PCI bus[1]. A non CPU > access goes through a controller hierarchy and may be modified while > it transverses the hierarchy. > > So really, we should have a big chunk of RAM that we associate with a > guest, with a list of intercepts that changes as the devices are > modified. Instead of having that list dispatch directly to a device, > we should send all intercepted accesses to the memory controller and > let the memory controller propagate out the access to the appropriate > device. > > [1] The except is access to the local APIC. 
That's handled directly > by the CPU (or immediately outside of the CPU before the access gets > to the memory controller if the local APIC is external to the CPU). > Agree. However the point with SMM is that the dispatch is made not only based on the address, but also based on SMM mode (and, unfortunately, can also be different based on read vs write). >>>> Things aren't that bad - a ram_addr_t and a physical address are >>>> already different things, so we already have one level of translation. >>> >>> Yeah, but ram_addr_t doesn't model anything meaningful IRL. It's an >>> internal implementation detail. >>> >> >> Does it matter? We can say those are addresses on the memory bus. >> Since they are not observable anyway, who cares if the correspond >> with reality or not? > > It matters a lot because the life cycle of RAM is different from the > life cycle of ROM. > > For instance, the original goal was to madvise(MADV_DONTNEED) RAM on > reboot. You can't do that to ROM because the contents matter. I don't think you can do that to RAM either. > > But for PV devices, we can be loose in how we define the way the > devices interact with the rest of the system. For instance, we can > say that virtio-pci devices are directly connected to RAM and do not > go through the memory controllers. That means we could get stable > mappings of the virtio ring. That wouldn't work once we have an iommu and start to assign them to nested guests.
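The SMRAM behavior argued over in this subthread can be modeled as a small predicate. Bit positions (G_SMRAME, D_LCK, D_CLS, D_OPEN in the SMRAM register at config offset 0x72) are taken from the 82441FX datasheet, so treat this as a sketch of the spec excerpt rather than emulator code:

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

#define G_SMRAME (1 << 3)   /* SMRAM decode enabled            */
#define D_LCK    (1 << 4)   /* lock: forces D_OPEN to 0        */
#define D_CLS    (1 << 5)   /* hide SMRAM from data refs in SMM */
#define D_OPEN   (1 << 6)   /* open SMRAM to non-SMM cycles    */

/* Does a CPU access to 0xa0000-0xbffff reach SMRAM (DRAM) rather than
 * the PCI bus?  'smm_cycle' models the EXF4#/Ab7# SMM indication on
 * the bus; 'is_data' distinguishes data references from code fetches. */
static bool smram_access_is_ram(uint8_t smram, bool smm_cycle, bool is_data)
{
    if (!(smram & G_SMRAME)) {
        return false;                       /* SMRAM decode disabled */
    }
    if ((smram & D_OPEN) && !(smram & D_LCK)) {
        return true;                        /* open for BIOS init, any cycle */
    }
    if (smm_cycle) {
        /* D_CLS hides SMRAM from data references even inside SMM */
        return !(is_data && (smram & D_CLS));
    }
    return false;
}
```

Note how this bears out both sides of the exchange: the decision is per-transaction (the smm_cycle input), yet D_OPEN lets the mapping flip under pure software control with no SMM involved.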
On Thu, Nov 18, 2010 at 06:18:06PM +0200, Avi Kivity wrote: > >But for PV devices, we can be loose in how we define the way the > >devices interact with the rest of the system. For instance, we > >can say that virtio-pci devices are directly connected to RAM and > >do not go through the memory controllers. That means we could get > >stable mappings of the virtio ring. > > That wouldn't work once we have an iommu and start to assign them to > nested guests. Yea. Not sure whether I'm worried about that though. Mixing in all the problems inherent in nested virt, PV and assigned devices seems especially masochistic. > -- > error compiling committee.c: too many arguments to function
diff --git a/hw/pc.c b/hw/pc.c
index 69b13bf..0ea6d10 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -912,14 +912,14 @@ void pc_memory_init(ram_addr_t ram_size,
     /* allocate RAM */
     ram_addr = qemu_ram_alloc(NULL, "pc.ram",
                               below_4g_mem_size + above_4g_mem_size);
-    cpu_register_physical_memory(0, 0xa0000, ram_addr);
-    cpu_register_physical_memory(0x100000,
-                                 below_4g_mem_size - 0x100000,
-                                 ram_addr + 0x100000);
+
+    qemu_ram_register(0, 0xa0000, ram_addr);
+    qemu_ram_register(0x100000, below_4g_mem_size - 0x100000,
+                      ram_addr + 0x100000);
 #if TARGET_PHYS_ADDR_BITS > 32
     if (above_4g_mem_size > 0) {
-        cpu_register_physical_memory(0x100000000ULL, above_4g_mem_size,
-                                     ram_addr + below_4g_mem_size);
+        qemu_ram_register(0x100000000ULL, above_4g_mem_size,
+                          ram_addr + below_4g_mem_size);
     }
 #endif
Register the actual VM RAM using the new API

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 hw/pc.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)