Message ID | 1331225951-31306-1-git-send-email-mark.langsdorf@calxeda.com |
---|---|
State | New |
Headers | show |
On 03/08/2012 09:59 AM, Mark Langsdorf wrote: > Allow load_image_targphys to load files on systems with more than 2G of > emulated memory by changing the max_sz parameter from an int to an > unsigned long. unsigned long is still 32-bits on a 32-bit host. You probably want to be using off_t.
On 03/08/2012 11:56 AM, Eric Blake wrote: > On 03/08/2012 09:59 AM, Mark Langsdorf wrote: >> Allow load_image_targphys to load files on systems with more than 2G of >> emulated memory by changing the max_sz parameter from an int to an >> unsigned long. > > unsigned long is still 32-bits on a 32-bit host. You probably want to > be using off_t. I know that unsigned long is 32-bits. The issue is more that comparing 0xf000_0000 > 0x1000_0000 returns FALSE if both values are compared as signed ints, the way the current code does. Strict correctness would be for max_sz to be of type size_t, and I can change it to that if people would prefer, but unsigned long is clear enough in this instance. --Mark Langsdorf Calxeda, Inc.
Mark Langsdorf <mark.langsdorf@calxeda.com> writes: > Allow load_image_targphys to load files on systems with more than 2G of > emulated memory by changing the max_sz parameter from an int to an > unsigned long. > > Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com> > --- > hw/loader.c | 4 ++-- > hw/loader.h | 3 ++- > 2 files changed, 4 insertions(+), 3 deletions(-) > > diff --git a/hw/loader.c b/hw/loader.c > index 415cdce..a5333d0 100644 > --- a/hw/loader.c > +++ b/hw/loader.c > @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name, > > /* return the size or -1 if error */ > int load_image_targphys(const char *filename, > - target_phys_addr_t addr, int max_sz) > + target_phys_addr_t addr, unsigned long max_sz) > { > - int size; > + unsigned long size; > > size = get_image_size(filename); > if (size > max_sz) { get_image_size() returns int. How does widening size and max_sz here improve things? > diff --git a/hw/loader.h b/hw/loader.h > index fbcaba9..35c1652 100644 > --- a/hw/loader.h > +++ b/hw/loader.h > @@ -4,7 +4,8 @@ > /* loader.c */ > int get_image_size(const char *filename); > int load_image(const char *filename, uint8_t *addr); /* deprecated */ > -int load_image_targphys(const char *filename, target_phys_addr_t, int max_sz); > +int load_image_targphys(const char *filename, target_phys_addr_t, > + unsigned long max_sz); > int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t), > void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr, > uint64_t *highaddr, int big_endian, int elf_machine,
On 03/09/2012 03:25 AM, Markus Armbruster wrote: > Mark Langsdorf <mark.langsdorf@calxeda.com> writes: > >> Allow load_image_targphys to load files on systems with more than 2G of >> emulated memory by changing the max_sz parameter from an int to an >> unsigned long. >> >> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com> >> --- >> hw/loader.c | 4 ++-- >> hw/loader.h | 3 ++- >> 2 files changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/hw/loader.c b/hw/loader.c >> index 415cdce..a5333d0 100644 >> --- a/hw/loader.c >> +++ b/hw/loader.c >> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name, >> >> /* return the size or -1 if error */ >> int load_image_targphys(const char *filename, >> - target_phys_addr_t addr, int max_sz) >> + target_phys_addr_t addr, unsigned long max_sz) >> { >> - int size; >> + unsigned long size; >> >> size = get_image_size(filename); >> if (size > max_sz) { > > get_image_size() returns int. How does widening size and max_sz here > improve things? If max_sz is greater than 2GB, then: int max_sz = 0xc0000000; int size = 0x300; if (size > max_sz) return -1; returns -1, even though size is much less than max_sz. doing it my way: unsigned long max_sz = 0xc0000000; unsigned long size = 0x300; if (size > max_sz) return -1; does not return -1. --Mark Langsdorf Calxeda, Inc.
On 09.03.2012, at 14:15, Mark Langsdorf wrote: > On 03/09/2012 03:25 AM, Markus Armbruster wrote: >> Mark Langsdorf <mark.langsdorf@calxeda.com> writes: >> >>> Allow load_image_targphys to load files on systems with more than 2G of >>> emulated memory by changing the max_sz parameter from an int to an >>> unsigned long. >>> >>> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com> >>> --- >>> hw/loader.c | 4 ++-- >>> hw/loader.h | 3 ++- >>> 2 files changed, 4 insertions(+), 3 deletions(-) >>> >>> diff --git a/hw/loader.c b/hw/loader.c >>> index 415cdce..a5333d0 100644 >>> --- a/hw/loader.c >>> +++ b/hw/loader.c >>> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name, >>> >>> /* return the size or -1 if error */ >>> int load_image_targphys(const char *filename, >>> - target_phys_addr_t addr, int max_sz) >>> + target_phys_addr_t addr, unsigned long max_sz) >>> { >>> - int size; >>> + unsigned long size; >>> >>> size = get_image_size(filename); >>> if (size > max_sz) { >> >> get_image_size() returns int. How does widening size and max_sz here >> improve things? > > If max_sz is greater than 2GB, then: > int max_sz = 0xc0000000; > int size = 0x300; > if (size > max_sz) > return -1; > > returns -1, even though size is much less than max_sz. > > doing it my way: > unsigned long max_sz = 0xc0000000; > unsigned long size = 0x300; > if (size > max_sz) > return -1; > > does not return -1. So why change it to long then? Unsigned int would have the same effect. But really what this should be changed to is target_phys_addr_t. That way it's aligned with the address. I guess we can leave int for return values for now though, since we won't get images that big. Also, why are you hitting this in the first place? How are you calling read_targphys that you end up with such a big max_sz? Using INT_MAX as max_sz should work just as well, no? It's probably cleaner to change the size type, but I'm curious why nobody else hit this before :). Alex
On 03/09/2012 07:21 AM, Alexander Graf wrote: > > On 09.03.2012, at 14:15, Mark Langsdorf wrote: > >> On 03/09/2012 03:25 AM, Markus Armbruster wrote: >>> Mark Langsdorf <mark.langsdorf@calxeda.com> writes: >>> >>>> Allow load_image_targphys to load files on systems with more than 2G of >>>> emulated memory by changing the max_sz parameter from an int to an >>>> unsigned long. >>>> >>>> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com> >>>> --- >>>> hw/loader.c | 4 ++-- >>>> hw/loader.h | 3 ++- >>>> 2 files changed, 4 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/hw/loader.c b/hw/loader.c >>>> index 415cdce..a5333d0 100644 >>>> --- a/hw/loader.c >>>> +++ b/hw/loader.c >>>> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name, >>>> >>>> /* return the size or -1 if error */ >>>> int load_image_targphys(const char *filename, >>>> - target_phys_addr_t addr, int max_sz) >>>> + target_phys_addr_t addr, unsigned long max_sz) >>>> { >>>> - int size; >>>> + unsigned long size; >>>> >>>> size = get_image_size(filename); >>>> if (size > max_sz) { >>> >>> get_image_size() returns int. How does widening size and max_sz here >>> improve things? >> >> If max_sz is greater than 2GB, then: >> int max_sz = 0xc0000000; >> int size = 0x300; >> if (size > max_sz) >> return -1; >> >> returns -1, even though size is much less than max_sz. >> >> doing it my way: >> unsigned long max_sz = 0xc0000000; >> unsigned long size = 0x300; >> if (size > max_sz) >> return -1; >> >> does not return -1. > > So why change it to long then? Unsigned int would have the same effect. Well, I hit this bug because the arm_loader code passes the system's memory size as the max_sz argument. Currently, we have a 32-bit memory bus, but I know we'll move to 64-bits in the future, and I wanted to be type safe. > But really what this should be changed to is target_phys_addr_t. That > way it's aligned with the address. I guess we can leave int for return > values for now though, since we won't get images that big. Or convert all this stuff to size_t, since that's also appropriate. > Also, why are you hitting this in the first place? How are you calling > read_targphys that you end up with such a big max_sz? Using INT_MAX > as max_sz should work just as well, no? It's probably cleaner to > change the size type, but I'm curious why nobody else hit this before :). Well, arm_load_kernel calls read_targphys and passes (ram_size - KERNEL_LOAD_ADDR) as the max_sz argument. As far as I know, the highbank model is the only ARM model that uses more than 2 GB as ram_size. The change that actually tested the max_sz argument didn't go in until January 2012, and our internal tree was lagging until quite recently, so our testing didn't catch it either. I'll resubmit the patch with target_phys_addr_t. --Mark Langsdorf Calxeda, Inc.
On 09.03.2012, at 14:34, Mark Langsdorf wrote: > On 03/09/2012 07:21 AM, Alexander Graf wrote: >> >> On 09.03.2012, at 14:15, Mark Langsdorf wrote: >> >>> On 03/09/2012 03:25 AM, Markus Armbruster wrote: >>>> Mark Langsdorf <mark.langsdorf@calxeda.com> writes: >>>> >>>>> Allow load_image_targphys to load files on systems with more than 2G of >>>>> emulated memory by changing the max_sz parameter from an int to an >>>>> unsigned long. >>>>> >>>>> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com> >>>>> --- >>>>> hw/loader.c | 4 ++-- >>>>> hw/loader.h | 3 ++- >>>>> 2 files changed, 4 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/hw/loader.c b/hw/loader.c >>>>> index 415cdce..a5333d0 100644 >>>>> --- a/hw/loader.c >>>>> +++ b/hw/loader.c >>>>> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name, >>>>> >>>>> /* return the size or -1 if error */ >>>>> int load_image_targphys(const char *filename, >>>>> - target_phys_addr_t addr, int max_sz) >>>>> + target_phys_addr_t addr, unsigned long max_sz) >>>>> { >>>>> - int size; >>>>> + unsigned long size; >>>>> >>>>> size = get_image_size(filename); >>>>> if (size > max_sz) { >>>> >>>> get_image_size() returns int. How does widening size and max_sz here >>>> improve things? >>> >>> If max_sz is greater than 2GB, then: >>> int max_sz = 0xc0000000; >>> int size = 0x300; >>> if (size > max_sz) >>> return -1; >>> >>> returns -1, even though size is much less than max_sz. >>> >>> doing it my way: >>> unsigned long max_sz = 0xc0000000; >>> unsigned long size = 0x300; >>> if (size > max_sz) >>> return -1; >>> >>> does not return -1. >> >> So why change it to long then? Unsigned int would have the same effect. > > Well, I hit this bug because the arm_loader code passes the system's > memory size as the max_sz argument. Currently, we have a 32-bit memory > bus, but I know we'll move to 64-bits in the future, and I wanted to > be type safe. Then use uint64_t or target_phys_addr_t really. Longs are almost always wrong in qemu code, because the guest shouldn't care about the host's bitness. > >> But really what this should be changed to is target_phys_addr_t. That >> way it's aligned with the address. I guess we can leave int for return >> values for now though, since we won't get images that big. > > Or convert all this stuff to size_t, since that's also appropriate. Semantically, I would rather go with target_phys_addr_t. You're trying to describe addresses. If anything, ram_addr_t might work too - never quite grasped the difference. > >> Also, why are you hitting this in the first place? How are you calling >> read_targphys that you end up with such a big max_sz? Using INT_MAX >> as max_sz should work just as well, no? It's probably cleaner to >> change the size type, but I'm curious why nobody else hit this before :). > > Well, arm_load_kernel calls read_targphys and passes > (ram_size - KERNEL_LOAD_ADDR) as the max_sz argument. As far as I know, > the highbank model is the only ARM model that uses more than 2 GB as > ram_size. The change that actually tested the max_sz argument didn't > go in until January 2012, and our internal tree was lagging until > quite recently, so our testing didn't catch it either. Ah, that makes sense :). > I'll resubmit the patch with target_phys_addr_t. Thanks! And please check for the other loaders too if they suffer from the same badness. Alex
On 9 March 2012 13:50, Alexander Graf <agraf@suse.de> wrote: > Semantically, I would rather go with target_phys_addr_t. You're > trying to describe addresses. If anything, ram_addr_t might work too > - never quite grasped the difference. HACKING says "ram_addr_t is a QEMU internal address space that maps guest RAM physical addresses into an intermediate address space that can map to host virtual address spaces". Unless you're actually dealing with things in that intermediate address space you don't want ram_addr_t. -- PMM
Mark Langsdorf <mark.langsdorf@calxeda.com> writes: > On 03/09/2012 03:25 AM, Markus Armbruster wrote: >> Mark Langsdorf <mark.langsdorf@calxeda.com> writes: >> >>> Allow load_image_targphys to load files on systems with more than 2G of >>> emulated memory by changing the max_sz parameter from an int to an >>> unsigned long. >>> >>> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com> >>> --- >>> hw/loader.c | 4 ++-- >>> hw/loader.h | 3 ++- >>> 2 files changed, 4 insertions(+), 3 deletions(-) >>> >>> diff --git a/hw/loader.c b/hw/loader.c >>> index 415cdce..a5333d0 100644 >>> --- a/hw/loader.c >>> +++ b/hw/loader.c >>> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name, >>> >>> /* return the size or -1 if error */ >>> int load_image_targphys(const char *filename, >>> - target_phys_addr_t addr, int max_sz) >>> + target_phys_addr_t addr, unsigned long max_sz) >>> { >>> - int size; >>> + unsigned long size; >>> >>> size = get_image_size(filename); >>> if (size > max_sz) { >> >> get_image_size() returns int. How does widening size and max_sz here >> improve things? > > If max_sz is greater than 2GB, then: > int max_sz = 0xc0000000; > int size = 0x300; > if (size > max_sz) > return -1; > > returns -1, even though size is much less than max_sz. > > doing it my way: > unsigned long max_sz = 0xc0000000; > unsigned long size = 0x300; > if (size > max_sz) > return -1; > > does not return -1. The current code limits max_sz to INT_MAX. Passing 0xc0000000 is a bug. You want to relax the limit. That's fair. But I'm afraid your patch doesn't really relax the limit, or rather it relaxes it just by one bit. Your example shows it works for a case where the higher limit isn't needed. Let's examine three cases where it is: image sizes 2GiB, 4GiB and 5GiB on a host with 32 bit twos complement int, and 64 bit unsigned size_t, max_sz 0xc0000000. Expected result: the first one works, the other two exceed max_sz and get rejected. Actual result, correct me if I'm wrong: 1. get_image_size() calls lseek(), which returns (off_t)2147483648 for 2GiB, (off_t)4294967296 for 4GiB, and (off_t)5368709120 for 5GiB. 2. get_image_size() assigns it to int size. Since int can't hold the value, the result is technically implementation-defined, but in practice it's simply -2147483648 for 2GiB, 0 for 4GiB, and 1073741824 for 5GiB. 3. get_image_size() returns the same. 4. load_image_targphys() assigns the return value to unsigned long size. This changes the value back to 2147483648 for 2GiB, leaves it at 0 for 4GiB, and at 1073741824 for 5GiB. 5. load_image_targphys() fails if size > max_sz. Doesn't fail for any of the three. 6. load_image_targphys() adds the ROM file unless size is zero. Adds the first and the last file. Bug: the second case's image is ignored silently, not rejected for exceeding max_sz. 7. rom_add_file() creates a Rom object. It gets the image size again (hello TOCTTOU, not your patch's fault), and stores it in member size_t romsize. Bug: the third case's image is loaded even though it exceeds max_sz. Hello buffer overrun. I'm afraid you need to do more work to solve this problem. If you're willing to do that, please check out http://lists.nongnu.org/archive/html/qemu-devel/2012-02/msg03627.html
Am 09.03.2012 14:50, schrieb Alexander Graf: > > On 09.03.2012, at 14:34, Mark Langsdorf wrote: > >> On 03/09/2012 07:21 AM, Alexander Graf wrote: >>> >>> On 09.03.2012, at 14:15, Mark Langsdorf wrote: >>> >>>> On 03/09/2012 03:25 AM, Markus Armbruster wrote: >>>>> Mark Langsdorf <mark.langsdorf@calxeda.com> writes: >>>>> >>>>>> Allow load_image_targphys to load files on systems with more than 2G of >>>>>> emulated memory by changing the max_sz parameter from an int to an >>>>>> unsigned long. >>>>>> >>>>>> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com> >>>>>> --- >>>>>> hw/loader.c | 4 ++-- >>>>>> hw/loader.h | 3 ++- >>>>>> 2 files changed, 4 insertions(+), 3 deletions(-) >>>>>> >>>>>> diff --git a/hw/loader.c b/hw/loader.c >>>>>> index 415cdce..a5333d0 100644 >>>>>> --- a/hw/loader.c >>>>>> +++ b/hw/loader.c >>>>>> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name, >>>>>> >>>>>> /* return the size or -1 if error */ >>>>>> int load_image_targphys(const char *filename, >>>>>> - target_phys_addr_t addr, int max_sz) >>>>>> + target_phys_addr_t addr, unsigned long max_sz) >>>>>> { >>>>>> - int size; >>>>>> + unsigned long size; >>>>>> >>>>>> size = get_image_size(filename); >>>>>> if (size > max_sz) { >>>>> >>>>> get_image_size() returns int. How does widening size and max_sz here >>>>> improve things? >>>> >>>> If max_sz is greater than 2GB, then: >>>> int max_sz = 0xc0000000; >>>> int size = 0x300; >>>> if (size > max_sz) >>>> return -1; >>>> >>>> returns -1, even though size is much less than max_sz. >>>> >>>> doing it my way: >>>> unsigned long max_sz = 0xc0000000; >>>> unsigned long size = 0x300; >>>> if (size > max_sz) >>>> return -1; >>>> >>>> does not return -1. >>> >>> So why change it to long then? Unsigned int would have the same effect. >> >> Well, I hit this bug because the arm_loader code passes the system's >> memory size as the max_sz argument. Currently, we have a 32-bit memory >> bus, but I know we'll move to 64-bits in the future, and I wanted to >> be type safe. > > Then use uint64_t or target_phys_addr_t really. Longs are almost always wrong in qemu code, because the guest shouldn't care about the host's bitness. > >> >>> But really what this should be changed to is target_phys_addr_t. That >>> way it's aligned with the address. I guess we can leave int for return >>> values for now though, since we won't get images that big. >> >> Or convert all this stuff to size_t, since that's also appropriate. > > Semantically, I would rather go with target_phys_addr_t. You're trying to describe addresses. If anything, ram_addr_t might work too - never quite grasped the difference. > >> >>> Also, why are you hitting this in the first place? How are you calling >>> read_targphys that you end up with such a big max_sz? Using INT_MAX >>> as max_sz should work just as well, no? It's probably cleaner to >>> change the size type, but I'm curious why nobody else hit this before :). >> >> Well, arm_load_kernel calls read_targphys and passes >> (ram_size - KERNEL_LOAD_ADDR) as the max_sz argument. As far as I know, >> the highbank model is the only ARM model that uses more than 2 GB as >> ram_size. The change that actually tested the max_sz argument didn't >> go in until January 2012, and our internal tree was lagging until >> quite recently, so our testing didn't catch it either. > > Ah, that makes sense :). > >> I'll resubmit the patch with target_phys_addr_t. No, please. We're describing sizes, not addresses. target_phys_addr_t thus is semantically wrong here. The RAM size is unsigned long IIRC (it is limited by the host's available memory). If you subtract something from a size it remains a size. I had therefore suggested size_t before. I expect sizeof(size_t) >= sizeof(unsigned long). In the previous discussion someone suggested off_t due to some function used internally returning it. I don't know the exact difference between the two, but off_t still sounds wrong to me since we want a size, not a file offset. Andreas > Thanks! And please check for the other loaders too if they suffer from the same badness. > > > Alex
On 03/09/2012 08:17 AM, Markus Armbruster wrote: > Mark Langsdorf <mark.langsdorf@calxeda.com> writes: > >> On 03/09/2012 03:25 AM, Markus Armbruster wrote: >>> get_image_size() returns int. How does widening size and max_sz here >>> improve things? >> >> If max_sz is greater than 2GB, then: >> int max_sz = 0xc0000000; >> int size = 0x300; >> if (size > max_sz) >> return -1; >> >> returns -1, even though size is much less than max_sz. >> >> doing it my way: >> unsigned long max_sz = 0xc0000000; >> unsigned long size = 0x300; >> if (size > max_sz) >> return -1; >> >> does not return -1. > > The current code limits max_sz to INT_MAX. Passing 0xc0000000 is a bug. > > You want to relax the limit. That's fair. But I'm afraid your patch > doesn't really relax the limit, or rather it relaxes it just by one bit. > > Your example shows it works for a case where the higher limit isn't > needed. Let's examine three cases where it is: image sizes 2GiB, 4GiB > and 5GiB on a host with 32 bit twos complement int, and 64 bit unsigned > size_t, max_sz 0xc0000000. I'm quite willing to leave the problem of people loading 2GiB images to another day. Loading a 300KiB on an ARM machine with 4GiB of memory is a problem I'm facing right now, though. > I'm afraid you need to do more work to solve this problem. If you're > willing to do that, please check out > http://lists.nongnu.org/archive/html/qemu-devel/2012-02/msg03627.html I'm inclined to agree with David that this is not the issue I'm trying to solve =) --Mark Langsdorf Calxeda, Inc.
Mark Langsdorf <mark.langsdorf@calxeda.com> writes: > On 03/09/2012 08:17 AM, Markus Armbruster wrote: >> Mark Langsdorf <mark.langsdorf@calxeda.com> writes: >> >>> On 03/09/2012 03:25 AM, Markus Armbruster wrote: >>>> get_image_size() returns int. How does widening size and max_sz here >>>> improve things? >>> >>> If max_sz is greater than 2GB, then: >>> int max_sz = 0xc0000000; >>> int size = 0x300; >>> if (size > max_sz) >>> return -1; >>> >>> returns -1, even though size is much less than max_sz. >>> >>> doing it my way: >>> unsigned long max_sz = 0xc0000000; >>> unsigned long size = 0x300; >>> if (size > max_sz) >>> return -1; >>> >>> does not return -1. >> >> The current code limits max_sz to INT_MAX. Passing 0xc0000000 is a bug. >> >> You want to relax the limit. That's fair. But I'm afraid your patch >> doesn't really relax the limit, or rather it relaxes it just by one bit. >> >> Your example shows it works for a case where the higher limit isn't >> needed. Let's examine three cases where it is: image sizes 2GiB, 4GiB >> and 5GiB on a host with 32 bit twos complement int, and 64 bit unsigned >> size_t, max_sz 0xc0000000. > > I'm quite willing to leave the problem of people loading 2GiB images to > another day. Loading a 300KiB on an ARM machine with 4GiB of memory is > a problem I'm facing right now, though. Easiest fix for that is passing a max_sz argument that doesn't overflow the parameter type :) [...]
On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote: > No, please. We're describing sizes, not addresses. target_phys_addr_t > thus is semantically wrong here. The RAM size is unsigned long IIRC (it > is limited by the host's available memory). If you subtract something > from a size it remains a size. I had therefore suggested size_t before. > I expect sizeof(size_t) >= sizeof(unsigned long). We're discussing target sizes. size_t might be smaller than target_phys_addr_t, so it's also semantically wrong. We don't have a target_size_t, though, and I think "use an address related type for an offset" is less bad than "use a host sized type for a guest sized value". Compare the way we use target_phys_addr_t for the offset arguments to MemoryRegion read/write functions. -- PMM
Am 09.03.2012 18:11, schrieb Peter Maydell: > On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote: >> No, please. We're describing sizes, not addresses. target_phys_addr_t >> thus is semantically wrong here. The RAM size is unsigned long IIRC (it >> is limited by the host's available memory). If you subtract something >> from a size it remains a size. I had therefore suggested size_t before. >> I expect sizeof(size_t) >= sizeof(unsigned long). > > We're discussing target sizes. size_t might be smaller than > target_phys_addr_t, so it's also semantically wrong. We don't > have a target_size_t, though, and I think "use an address > related type for an offset" is less bad than "use a host > sized type for a guest sized value". That is a moot point. There is no such thing as a "target size". The size is defined by the guest OS, not by the architecture. And it doesn't matter if the guest OS's size is defined larger than the host's because we process those files on the host and they must fit into host memory. So unsigned long would be perfectly fine if ignoring oddball win64. > Compare the way we use target_phys_addr_t for the offset arguments > to MemoryRegion read/write functions. Nobody here has been arguing against using target_phys_addr_t for guest memory *offsets*. Actually, the size (1, 2, 4) is an unsigned int there. :) Fine with me. And due to very similar signed overflow issues with int64_t, 128-bit integer support was introduced as a workaround. ;) Anyway, POSIX:2008 says this about sys/types.h: off_t Used for file sizes. size_t Used for sizes of objects. ssize_t Used for a count of bytes or an error indication. So off_t seems right for get_image_size() although a bit counter-intuitive. However later is said, "off_t shall be signed integer types". So on a 32-bit host that does not necessarily help for the case at hands. (Since get_image_size() gets its value as an off_t though, it doesn't matter there and improves the 64-bit host case.) By comparison, "size_t shall be an unsigned integer type" and "The implementation shall support one or more programming environments in which the widths of [...] size_t, ssize_t, [...] are no greater than the width of type long". So I still think that size_t is our best bet for potentially large sizes here. Comparison with off_t would cause warnings though... Andreas
On 09.03.2012, at 19:47, Andreas Färber wrote: > Am 09.03.2012 18:11, schrieb Peter Maydell: >> On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote: >>> No, please. We're describing sizes, not addresses. target_phys_addr_t >>> thus is semantically wrong here. The RAM size is unsigned long IIRC (it >>> is limited by the host's available memory). If you subtract something >>> from a size it remains a size. I had therefore suggested size_t before. >>> I expect sizeof(size_t) >= sizeof(unsigned long). >> >> We're discussing target sizes. size_t might be smaller than >> target_phys_addr_t, so it's also semantically wrong. We don't >> have a target_size_t, though, and I think "use an address >> related type for an offset" is less bad than "use a host >> sized type for a guest sized value". > > That is a moot point. There is no such thing as a "target size". The > size is defined by the guest OS, not by the architecture. And it doesn't > matter if the guest OS's size is defined larger than the host's because > we process those files on the host and they must fit into host memory. > > So unsigned long would be perfectly fine if ignoring oddball win64. > >> Compare the way we use target_phys_addr_t for the offset arguments >> to MemoryRegion read/write functions. > > Nobody here has been arguing against using target_phys_addr_t for guest > memory *offsets*. > > Actually, the size (1, 2, 4) is an unsigned int there. :) Fine with me. > > And due to very similar signed overflow issues with int64_t, 128-bit > integer support was introduced as a workaround. ;) > > > Anyway, POSIX:2008 says this about sys/types.h: > > off_t > Used for file sizes. > size_t > Used for sizes of objects. > ssize_t > Used for a count of bytes or an error indication. > > So off_t seems right for get_image_size() although a bit > counter-intuitive. However later is said, "off_t shall be signed integer > types". So on a 32-bit host that does not necessarily help for the case > at hands. (Since get_image_size() gets its value as an off_t though, it > doesn't matter there and improves the 64-bit host case.) > > By comparison, "size_t shall be an unsigned integer type" and "The > implementation shall support one or more programming environments in > which the widths of [...] size_t, ssize_t, [...] are no greater than the > width of type long". > > So I still think that size_t is our best bet for potentially large sizes > here. Comparison with off_t would cause warnings though... You're on the wrong track here really :). Size, max_size and friends are all offsets into the guest's address space, so that's the type they should have. We don't care about host POSIX whatever semantics here - it's a question of how big can guest address space grow. Imagine, your host can only do 32 bit file offsets. You want to emulate a 64bit guest though. To load an image, you say "load the image at offset x, max size <guest ram size - x>". If you use size_t, it would break for large ram guests, because the size really means semantically that you want to be able to load into address [x...ram_size] of the guest's physical memory range. If the host can actually load images that big is an orthogonal question. Alex
Alexander Graf <agraf@suse.de> writes: [...] > Imagine, your host can only do 32 bit file offsets. You want to > emulate a 64bit guest though. To load an image, you say "load the > image at offset x, max size <guest ram size - x>". If you use size_t, > it would break for large ram guests, because the size really means > semantically that you want to be able to load into address > [x...ram_size] of the guest's physical memory range. If the host can > actually load images that big is an orthogonal question. A host with 64 bit virtual address space, yet file offsets (and thus file sizes) limited to 32 bits? That's a contrived example if I ever saw one. You need to come up with a remotely practical one to convince me :)
On 9 March 2012 18:47, Andreas Färber <afaerber@suse.de> wrote: > Am 09.03.2012 18:11, schrieb Peter Maydell: >> On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote: >>> No, please. We're describing sizes, not addresses. target_phys_addr_t >>> thus is semantically wrong here. The RAM size is unsigned long IIRC (it >>> is limited by the host's available memory). If you subtract something >>> from a size it remains a size. I had therefore suggested size_t before. >>> I expect sizeof(size_t) >= sizeof(unsigned long). >> >> We're discussing target sizes. size_t might be smaller than >> target_phys_addr_t, so it's also semantically wrong. We don't >> have a target_size_t, though, and I think "use an address >> related type for an offset" is less bad than "use a host >> sized type for a guest sized value". > > That is a moot point. There is no such thing as a "target size". "Length of a block of memory on the guest" is what I meant. What you need is "an integer type large enough to hold the difference between two guest pointer values". The size of that type should depend only on the guest config, not on the host, so 'unsigned long', 'size_t', 'off_t' etc are all wrong. -- PMM
Am 10.03.2012 14:51, schrieb Peter Maydell: > On 9 March 2012 18:47, Andreas Färber <afaerber@suse.de> wrote: >> Am 09.03.2012 18:11, schrieb Peter Maydell: >>> On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote: >>>> No, please. We're describing sizes, not addresses. target_phys_addr_t >>>> thus is semantically wrong here. The RAM size is unsigned long IIRC (it >>>> is limited by the host's available memory). If you subtract something >>>> from a size it remains a size. I had therefore suggested size_t before. >>>> I expect sizeof(size_t) >= sizeof(unsigned long). >>> >>> We're discussing target sizes. size_t might be smaller than >>> target_phys_addr_t, so it's also semantically wrong. We don't >>> have a target_size_t, though, and I think "use an address >>> related type for an offset" is less bad than "use a host >>> sized type for a guest sized value". >> >> That is a moot point. There is no such thing as a "target size". > > "Length of a block of memory on the guest" is what I meant. > What you need is "an integer type large enough to hold the > difference between two guest pointer values". The size of > that type should depend only on the guest config, not on the > host, so 'unsigned long', 'size_t', 'off_t' etc are all wrong. Your view is very ARM-centric. In the PowerPC domain for instance, we have a number of usages where we hardcode a size of, e.g., 1 MB. And Alex should know that. I don't want to use target_phys_addr_t for that and forcing an end address calculation, as suggested by Alex, would be possible but is not as convenient as the current API. Doing a size check as Mark has demonstrated in ARM code (one place!) seems much simpler to me than hurting all targets just because ARM wants to pass a possibly stupid value unchecked to the common API. Compare David's off_t patch from March 8th: We'll never get an image size larger than off_t's max. Using target_phys_addr_t, uint64_t or __int128_t for the max are all moot (academic) because we'll never get to their max if it's larger than off_t. Therefore I've been saying, the host's limit is the upper realistic limit for image_load_targphys(). ELF may be a different case to consider since it can be sparse. Andreas
Am 09.03.2012 20:04, schrieb Alexander Graf: > > On 09.03.2012, at 19:47, Andreas Färber wrote: > >> Am 09.03.2012 18:11, schrieb Peter Maydell: >>> On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote: >>>> No, please. We're describing sizes, not addresses. target_phys_addr_t >>>> thus is semantically wrong here. The RAM size is unsigned long IIRC (it >>>> is limited by the host's available memory). If you subtract something >>>> from a size it remains a size. I had therefore suggested size_t before. >>>> I expect sizeof(size_t) >= sizeof(unsigned long). >>> >>> We're discussing target sizes. size_t might be smaller than >>> target_phys_addr_t, so it's also semantically wrong. We don't >>> have a target_size_t, though, and I think "use an address >>> related type for an offset" is less bad than "use a host >>> sized type for a guest sized value". >> >> That is a moot point. There is no such thing as a "target size". The >> size is defined by the guest OS, not by the architecture. And it doesn't >> matter if the guest OS's size is defined larger than the host's because >> we process those files on the host and they must fit into host memory. >> >> So unsigned long would be perfectly fine if ignoring oddball win64. >> >>> Compare the way we use target_phys_addr_t for the offset arguments >>> to MemoryRegion read/write functions. >> >> Nobody here has been arguing against using target_phys_addr_t for guest >> memory *offsets*. >> >> Actually, the size (1, 2, 4) is an unsigned int there. :) Fine with me. >> >> And due to very similar signed overflow issues with int64_t, 128-bit >> integer support was introduced as a workaround. ;) >> >> >> Anyway, POSIX:2008 says this about sys/types.h: >> >> off_t >> Used for file sizes. >> size_t >> Used for sizes of objects. >> ssize_t >> Used for a count of bytes or an error indication. >> >> So off_t seems right for get_image_size() although a bit >> counter-intuitive. However later is said, "off_t shall be signed integer >> types". So on a 32-bit host that does not necessarily help for the case >> at hands. (Since get_image_size() gets its value as an off_t though, it >> doesn't matter there and improves the 64-bit host case.) >> >> By comparison, "size_t shall be an unsigned integer type" and "The >> implementation shall support one or more programming environments in >> which the widths of [...] size_t, ssize_t, [...] are no greater than the >> width of type long". >> >> So I still think that size_t is our best bet for potentially large sizes >> here. Comparison with off_t would cause warnings though... > > You're on the wrong track here really :). Size, max_size and friends are all offsets into the guest's address space, so that's the type they should have. We don't care about host POSIX whatever semantics here - it's a question of how big can guest address space grow. No, we're talking about different use cases here. You *assume* the size depends on guest RAM. I don't. And no, max_size is not an offset. start + max_size is. The difference is that this expression may overflow to zero in cases I care about. For both PReP and Macs the size depends solely on the physical address space and we know the max size ahead of time. For your use case you can easily provide a separate function as a wrapper, taking target_phys_addr_t start, end if you like (you're right, that's possible) and calculating the off_t/int/... max_sz there and pass it on to the existing API. Anything bigger we just cannot load as a blob. Andreas
On 10 March 2012 14:08, Andreas Färber <afaerber@suse.de> wrote: > Am 10.03.2012 14:51, schrieb Peter Maydell: >> "Length of a block of memory on the guest" is what I meant. >> What you need is "an integer type large enough to hold the >> difference between two guest pointer values". The size of >> that type should depend only on the guest config, not on the >> host, so 'unsigned long', 'size_t', 'off_t' etc are all wrong. > > Your view is very ARM-centric. I don't understand this remark. Nothing about the above explanation is ARM-centric -- it's just pointing out that the guest and the host are two separate things and maximum widths of size and pointer types aren't necessarily the same. Assuming they were the same would be an x86-64-ism :-) > In the PowerPC domain for instance, we > have a number of usages where we hardcode a size of, e.g., 1 MB. And > Alex should know that. I don't want to use target_phys_addr_t for that > and forcing an end address calculation, as suggested by Alex, would be > possible but is not as convenient as the current API. > > Doing a size check as Mark has demonstrated in ARM code (one place!) > seems much simpler to me than hurting all targets just because ARM wants > to pass a possibly stupid value unchecked to the common API. Where the ARM specific code has particular restrictions then I'm happy to have the ARM specific code either relax those, or do explicit checks so we can fail cleanly or whatever. Putting a "restrict to INT_MAX" in the highbank code is definitely wrong (not least because passing values to load_image_targphys isn't the only thing we use that field in arm_boot_info for!) The ARM boot code needs updating because it shouldn't be using 'int' for arm_boot_info.ram_size, and using target_phys_addr_t the same as we do for initrd_size is the obvious thing. I have no particular objection to having some new target_phys_size_t or whatever, and I could be persuaded that we should follow the memory API in using uint64_t for sizes, but it needs to be a type that either follows guest phys addr restrictions or a fixed-width type which we have decreed is always large enough, not a type which varies based on host properties. (arm_boot also needs to fail if the size is > 4GB since at the moment we do assume it to be 32 bits wide, but that's a different problem.) > Compare David's off_t patch from March 8th: We'll never get an image > size larger than off_t's max. Using target_phys_addr_t, uint64_t or > __int128_t for the max are all moot (academic) because we'll never get > to their max if it's larger than off_t. Therefore I've been saying, the > host's limit is the upper realistic limit for image_load_targphys(). You mean this patch? http://patchwork.ozlabs.org/patch/140064/ get_image_size() is asking a question about the host (ie "what is the size of this host file?"). It therefore makes perfect sense for its return type to be one whose size depends on properties of the host, like off_t. load_image_targphys()'s max_sz parameter is passing it information which is about the guest (usually "how big is this block of RAM we're going to try to load this thing into?"). It therefore makes perfect sense for its type to be one whose size depends on properties of the guest, or alternatively one whose type is guaranteed to always be at least as large as the worst case guest we support (so you could use uint64_t, as the memory region API does for region sizes). -- PMM
On 03/10/2012 09:27 AM, Peter Maydell wrote: > On 10 March 2012 14:08, Andreas Färber <afaerber@suse.de> wrote: >> Am 10.03.2012 14:51, schrieb Peter Maydell: >>> "Length of a block of memory on the guest" is what I meant. >>> What you need is "an integer type large enough to hold the >>> difference between two guest pointer values". The size of >>> that type should depend only on the guest config, not on the >>> host, so 'unsigned long', 'size_t', 'off_t' etc are all wrong. >> >> Your view is very ARM-centric. > > I don't understand this remark. Nothing about the above explanation > is ARM-centric -- it's just pointing out that the guest and the > host are two separate things and maximum widths of size and > pointer types aren't necessarily the same. Assuming they were the > same would be an x86-64-ism :-) >> >> Doing a size check as Mark has demonstrated in ARM code (one place!) >> seems much simpler to me than hurting all targets just because ARM wants >> to pass a possibly stupid value unchecked to the common API. > > Where the ARM specific code has particular restrictions then > I'm happy to have the ARM specific code either relax those, or > do explicit checks so we can fail cleanly or whatever. > > Putting a "restrict to INT_MAX" in the highbank code is definitely > wrong (not least because passing values to load_image_targphys isn't > the only thing we use that field in arm_boot_info for!) > The ARM boot code needs updating because it shouldn't be > using 'int' for arm_boot_info.ram_size, and using target_phys_addr_t > the same as we do for initrd_size is the obvious thing. I have no > particular objection to having some new target_phys_size_t or whatever, > and I could be persuaded that we should follow the memory API in > using uint64_t for sizes, but it needs to be a type that either follows > guest phys addr restrictions or a fixed-width type which we have decreed > is always large enough, not a type which varies based on host properties. I'm reluctant to continue this thread, but I feel obliged to ask: what types of changes should I make to allow the highbank model to put with more than 2GB of emulated DRAM? I've fired off 3 versions of a patch that answers that question, some of which I've liked more than others. I'm willing to do a reasonable amount of refactoring the general QEMU image loading code, but I don't want to do that until I have a sense that the maintainers agree on the general solution and that I'm working toward their understand. Thanks for thinking this over. --Mark Langsdorf Calxeda, Inc.
Peter Maydell <peter.maydell@linaro.org> writes: [...] > Putting a "restrict to INT_MAX" in the highbank code is definitely > wrong (not least because passing values to load_image_targphys isn't > the only thing we use that field in arm_boot_info for!) > The ARM boot code needs updating because it shouldn't be > using 'int' for arm_boot_info.ram_size, and using target_phys_addr_t > the same as we do for initrd_size is the obvious thing. I have no > particular objection to having some new target_phys_size_t or whatever, > and I could be persuaded that we should follow the memory API in > using uint64_t for sizes, but it needs to be a type that either follows > guest phys addr restrictions or a fixed-width type which we have decreed > is always large enough, There is already a type that is defined to be wide enough for any object size: size_t. > not a type which varies based on host properties. To be honest, the whole debate feels like bikeshedding to me. Yes, load_image_targphys()'s argument max_sz is the size of a slice of guest memory. It's also the size of a host object, allocated with g_malloc0() in rom_add_file(). It's also the size of a disk file. I'd make it size_t and be done with it. If you absolutely must overengineer things, go ahead and create a new type for target sizes. I doubt making it wider than size_t will work in practice without a lot of hoop jumping, though. [...]
On 12.03.2012, at 16:53, Markus Armbruster wrote: > Peter Maydell <peter.maydell@linaro.org> writes: > > [...] >> Putting a "restrict to INT_MAX" in the highbank code is definitely >> wrong (not least because passing values to load_image_targphys isn't >> the only thing we use that field in arm_boot_info for!) >> The ARM boot code needs updating because it shouldn't be >> using 'int' for arm_boot_info.ram_size, and using target_phys_addr_t >> the same as we do for initrd_size is the obvious thing. I have no >> particular objection to having some new target_phys_size_t or whatever, >> and I could be persuaded that we should follow the memory API in >> using uint64_t for sizes, but it needs to be a type that either follows >> guest phys addr restrictions or a fixed-width type which we have decreed >> is always large enough, > > There is already a type that is defined to be wide enough for any object > size: size_t. > >> not a type which varies based on host properties. > > To be honest, the whole debate feels like bikeshedding to me. Yes, > load_image_targphys()'s argument max_sz is the size of a slice of guest > memory. It's also the size of a host object, allocated with g_malloc0() > in rom_add_file(). It's also the size of a disk file. > > I'd make it size_t and be done with it. If you absolutely must > overengineer things, go ahead and create a new type for target sizes. I > doubt making it wider than size_t will work in practice without a lot of > hoop jumping, though. I agree. Let's end this discussion and use the biggest variable type we support for addresses: uint64_t. That way we have host independent predictability, but don't use _addr_t types which Andreas seems to dislike. Alex
On 12 March 2012 16:04, Alexander Graf <agraf@suse.de> wrote: > On 12.03.2012, at 16:53, Markus Armbruster wrote: >> To be honest, the whole debate feels like bikeshedding to me. Yes, >> load_image_targphys()'s argument max_sz is the size of a slice of guest >> memory. It's also the size of a host object, allocated with g_malloc0() >> in rom_add_file(). It's also the size of a disk file. >> >> I'd make it size_t and be done with it. If you absolutely must >> overengineer things, go ahead and create a new type for target sizes. I >> doubt making it wider than size_t will work in practice without a lot of >> hoop jumping, though. > > I agree. Let's end this discussion and use the biggest variable type we > support for addresses: uint64_t. That way we have host independent > predictability, but don't use _addr_t types which Andreas seems to dislike. Works for me, and follows the precedent of the memory region API, which makes sense. -- PMM
Am 10.03.2012 16:27, schrieb Peter Maydell: > On 10 March 2012 14:08, Andreas Färber <afaerber@suse.de> wrote: >> Am 10.03.2012 14:51, schrieb Peter Maydell: >>> "Length of a block of memory on the guest" is what I meant. >>> What you need is "an integer type large enough to hold the >>> difference between two guest pointer values". The size of >>> that type should depend only on the guest config, not on the >>> host, so 'unsigned long', 'size_t', 'off_t' etc are all wrong. >> >> Your view is very ARM-centric. > > I don't understand this remark. Nothing about the above explanation > is ARM-centric [snip] ARM-centric is that you and Mark are trying to solve the issue of ARM boards supplying guest RAM size as limit for image loading. I'm saying many other boards don't have that issue because they're using a constant size value way below any theoretical limit. Now the theory behind this part of the thread is about what type to use for a delta vs. absolute value. If you subtract void * from void * you get a ptrdiff_t. Similarly in C# or Java or whatever subtracing two Time objects will return a TimeDiff object or so. Whether you specify a temperature in degrees Celsius or Kelvin, the difference is in degrees. In that same spirit I'm opposed to using target_phys_addr_t for a size (delta) between two addresses. HTE. Andreas
Am 12.03.2012 17:04, schrieb Alexander Graf: > > On 12.03.2012, at 16:53, Markus Armbruster wrote: > >> Peter Maydell <peter.maydell@linaro.org> writes: >> >> [...] >>> Putting a "restrict to INT_MAX" in the highbank code is definitely >>> wrong (not least because passing values to load_image_targphys isn't >>> the only thing we use that field in arm_boot_info for!) >>> The ARM boot code needs updating because it shouldn't be >>> using 'int' for arm_boot_info.ram_size, and using target_phys_addr_t >>> the same as we do for initrd_size is the obvious thing. I have no >>> particular objection to having some new target_phys_size_t or whatever, >>> and I could be persuaded that we should follow the memory API in >>> using uint64_t for sizes, but it needs to be a type that either follows >>> guest phys addr restrictions or a fixed-width type which we have decreed >>> is always large enough, >> >> There is already a type that is defined to be wide enough for any object >> size: size_t. >> >>> not a type which varies based on host properties. >> >> To be honest, the whole debate feels like bikeshedding to me. Yes, >> load_image_targphys()'s argument max_sz is the size of a slice of guest >> memory. It's also the size of a host object, allocated with g_malloc0() >> in rom_add_file(). It's also the size of a disk file. >> >> I'd make it size_t and be done with it. If you absolutely must >> overengineer things, go ahead and create a new type for target sizes. I >> doubt making it wider than size_t will work in practice without a lot of >> hoop jumping, though. > > I agree. Let's end this discussion and use the biggest variable type we support for addresses: uint64_t. That way we have host independent predictability, but don't use _addr_t types which Andreas seems to dislike. Fine with me. Andreas
diff --git a/hw/loader.c b/hw/loader.c index 415cdce..a5333d0 100644 --- a/hw/loader.c +++ b/hw/loader.c @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name, /* return the size or -1 if error */ int load_image_targphys(const char *filename, - target_phys_addr_t addr, int max_sz) + target_phys_addr_t addr, unsigned long max_sz) { - int size; + unsigned long size; size = get_image_size(filename); if (size > max_sz) { diff --git a/hw/loader.h b/hw/loader.h index fbcaba9..35c1652 100644 --- a/hw/loader.h +++ b/hw/loader.h @@ -4,7 +4,8 @@ /* loader.c */ int get_image_size(const char *filename); int load_image(const char *filename, uint8_t *addr); /* deprecated */ -int load_image_targphys(const char *filename, target_phys_addr_t, int max_sz); +int load_image_targphys(const char *filename, target_phys_addr_t, + unsigned long max_sz); int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t), void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr, uint64_t *highaddr, int big_endian, int elf_machine,
Allow load_image_targphys to load files on systems with more than 2G of emulated memory by changing the max_sz parameter from an int to an unsigned long. Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com> --- hw/loader.c | 4 ++-- hw/loader.h | 3 ++- 2 files changed, 4 insertions(+), 3 deletions(-)