diff mbox series

contrib/elf2dmp: a workaround for the buggy msvcrt.dll!fwrite

Message ID 20240708112520.106127-1-junjiehua@tencent.com
State New
Headers show
Series contrib/elf2dmp: a workaround for the buggy msvcrt.dll!fwrite | expand

Commit Message

junjiehua July 8, 2024, 11:25 a.m. UTC
when building elf2dump with x86_64-w64-mingw32-gcc, fwrite is imported from
msvcrt.dll. However, the implementation of msvcrt.dll!fwrite is buggy:
it enters an infinite loop when the size of a single write exceeds 4GB.
This patch addresses the issue by splitting large physical memory
blocks into smaller chunks.

Signed-off-by: junjiehua <junjiehua@tencent.com>
---
 contrib/elf2dmp/main.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

Comments

Peter Maydell July 9, 2024, 2:39 p.m. UTC | #1
On Mon, 8 Jul 2024 at 14:24, junjiehua <halouworls@gmail.com> wrote:
>
> when building elf2dump with x86_64-w64-mingw32-gcc, fwrite is imported from
> msvcrt.dll. However, the implementation of msvcrt.dll!fwrite is buggy:
> it enters an infinite loop when the size of a single write exceeds 4GB.
> This patch addresses the issue by splitting large physical memory
> blocks into smaller chunks.

Hi; thanks for this patch.

(Does the library fwrite fail for > 4GB, or for >= 4GB ?)

> Signed-off-by: junjiehua <junjiehua@tencent.com>
> ---
>  contrib/elf2dmp/main.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
> index d046a72ae6..1994553d95 100644
> --- a/contrib/elf2dmp/main.c
> +++ b/contrib/elf2dmp/main.c
> @@ -23,6 +23,8 @@
>  #define INITIAL_MXCSR   0x1f80
>  #define MAX_NUMBER_OF_RUNS  42
>
> +#define MAX_CHUNK_SIZE (128 * 1024 * 1024)

I think we could add a comment here, something like:

/*
 * Maximum size to fwrite() to the output file at once;
 * the MSVCRT runtime will not correctly handle fwrite()
 * of more than 4GB at once.
 */

That will act as a reminder about why we do it.

(Does the library fwrite fail for > 4GB, or for >= 4GB ?
Your commit message says the former, so I've gone with that,
but if it's an "overflows 32 bit variable" kind of bug then
4GB exactly probably also doesn't work.)

Is there a particular reason to use 128MB here? If the
runtime only fails on 4GB or more, maybe we should use
a larger MAX_CHUNK_SIZE, like 2GB ?



> +
>  typedef struct idt_desc {
>      uint16_t offset1;   /* offset bits 0..15 */
>      uint16_t selector;
> @@ -434,13 +436,22 @@ static bool write_dump(struct pa_space *ps,
>
>      for (i = 0; i < ps->block_nr; i++) {
>          struct pa_block *b = &ps->block[i];
> +        size_t offset = 0;
> +        size_t chunk_size;
>
>          printf("Writing block #%zu/%zu of %"PRIu64" bytes to file...\n", i,
>                  ps->block_nr, b->size);
> -        if (fwrite(b->addr, b->size, 1, dmp_file) != 1) {
> -            eprintf("Failed to write block\n");
> -            fclose(dmp_file);
> -            return false;
> +
> +        while (offset < b->size) {
> +            chunk_size = (b->size - offset > MAX_CHUNK_SIZE)
> +                         ? MAX_CHUNK_SIZE
> +                         : (b->size - offset);

You can write this as
     chunk_size = MIN(b->size - offset, MAX_CHUNK_SIZE);
which I think is clearer. (Our osdep header provides MIN().)

> +            if (fwrite(b->addr + offset, chunk_size, 1, dmp_file) != 1) {
> +                eprintf("Failed to write block\n");
> +                fclose(dmp_file);
> +                return false;
> +            }
> +            offset += chunk_size;

I think we should abstract out the bug workaround into a
separate function, with the same API as fwrite(). Call
it do_fwrite() or something, and make all the fwrite()
calls use it. I know at the moment there's only two of
them, and one of them is the header so never 4GB, but
I think this more cleanly separates out the "work around
a runtime library problem" part from the main logic of
the program, and will mean that if we ever need to rearrange
how we write out the data in future it will be simple.

thanks
-- PMM
junjiehua July 10, 2024, 8:02 a.m. UTC | #2
>

On Tue, Jul 9, 2024 at 10:39 PM Peter Maydell <peter.maydell@linaro.org>
> wrote:


> > +#define MAX_CHUNK_SIZE (128 * 1024 * 1024)
>
> I think we could add a comment here, something like:
>
> /*
>  * Maximum size to fwrite() to the output file at once;
>  * the MSVCRT runtime will not correctly handle fwrite()
>  * of more than 4GB at once.
>  */

That will act as a reminder about why we do it.
>
>
Thanks, I agree.


> (Does the library fwrite fail for > 4GB, or for >= 4GB ?
> Your commit message says the former, so I've gone with that,
> but if it's an "overflows 32 bit variable" kind of bug then
> 4GB exactly probably also doesn't work.)
>


It fails for > 4GB.
The msvcrt.dll!fwrite(buff, (4G+0x1000), 1, file)  works as following:
(based on assembly, not the original source, irrelevant parts have been
omitted)

size_t fwrite(const void* buff, size_t element_size, size_t element_count,
FILE* file_p)
{
    size_t size_t_total_size = element_size * element_count;
    size_t size_t_remain_size = size_t_total_size;

    unsigned int u32_written_bytes;
    unsigned int buf_size;

    /* The register used is r12d but not r12.
     * So I suspect that Microsoft wrote it as an unsigned int type
     * (or msvc compiler bug? seems unlikely)
     */
    unsigned int u32_chunk_size;

    while (true) {

        if ((file_p->flags & 0x10C) != 0) {
            buf_size = file_p->buf_size;
        }
        else {
            // Always reaches here on the first fwrite() after fopen().
            buf_size = 4096;  // mov     r15d, 1000h
        }

        if (size_t_remain_size > buf_size) {

            u32_chunk_size = size_t_remain_size;

            if (buf_size) {

                // div ... ;  sub r12d,edx;   size_t stored into r12d ,
lost high 32 bits
                // u32_chunk_size = 0x100000FFF - 0x100000FFF % 0x1000
                //                            = 0x100000FFF - 0xFFF
                //                            = 0x1 0000 0000
                //                            = (u32) 0x1 0000 0000
                //                            = 0
                u32_chunk_size = size_t_remain_size - (size_t_remain_size %
buf_size);
            }

            //call _write() with zero size, returns 0
            u32_written_bytes = __write(file_p, data_buff, u32_chunk_size);

            // They didn't check if __write() returns 0.
            if (u32_written_bytes == -1 || u32_written_bytes <
u32_chunk_size) {
                return (size_t_total_size - size_t_remain_size) /
element_size;
            }

            //size_t_remain_size -= 0
            size_t_remain_size -= u32_written_bytes;
            buff += u32_written_bytes;

            //size_t_remain_size will never decrease to zero, then
while(true) loop forever.
            if (size_t_remain_size == 0) {
                return element_count;
            }
            // ...
        }
        else {
            // ...
        }
    }
    // ...
}

note:
1. The path of buggy msvcrt.dll is c:\windows\system32\msvcrt.dll( mingw64
links to it );
2. fwrite implementation in another msvc library which is called
ucrtbase.dll is correct(msvc links to it by default).



> Is there a particular reason to use 128MB here? If the
> runtime only fails on 4GB or more, maybe we should use
> a larger MAX_CHUNK_SIZE, like 2GB ?
>

According to current analysis, size <= 4GB all are safe, however there are
many
versions of msvcrt, this bug exists on Server 2008/2019/2022 and Windows
11(all
with full latest updates), and it may also exist in other versions, but it
is difficult to
check each version individually. I am not sure if all versions handle
boundary sizes
like 2GB/4GB correctly. So I prefer a relatively conservative value: 128MB.

Maybe we could use #ifdef _WIN32 to differentiate the handling between
Linux and
Windows. For Linux, it remains unchanged, while for Windows, it processes
by chunks
with max_chunk_sizeto 1GB.



> > +        while (offset < b->size) {
> > +            chunk_size = (b->size - offset > MAX_CHUNK_SIZE)
> > +                         ? MAX_CHUNK_SIZE
> > +                         : (b->size - offset);
>
> You can write this as
>      chunk_size = MIN(b->size - offset, MAX_CHUNK_SIZE);
> which I think is clearer. (Our osdep header provides MIN().)
>
> I think we should abstract out the bug workaround into a
> separate function, with the same API as fwrite(). Call
> it do_fwrite() or something, and make all the fwrite()
> calls use it. I know at the moment there's only two of
> them, and one of them is the header so never 4GB, but
> I think this more cleanly separates out the "work around
> a runtime library problem" part from the main logic of
> the program, and will mean that if we ever need to rearrange
> how we write out the data in future it will be simple.
>
>
Thanks, you are right!
Peter Maydell July 10, 2024, 4:25 p.m. UTC | #3
On Wed, 10 Jul 2024 at 09:02, hellord <halouworls@gmail.com> wrote:
>
>
>>
>>
>> On Tue, Jul 9, 2024 at 10:39 PM Peter Maydell <peter.maydell@linaro.org> wrote:
>> Is there a particular reason to use 128MB here? If the
>> runtime only fails on 4GB or more, maybe we should use
>> a larger MAX_CHUNK_SIZE, like 2GB ?
>
>
> According to current analysis, size <= 4GB all are safe, however there are many
> versions of msvcrt, this bug exists on Server 2008/2019/2022 and Windows 11(all
> with full latest updates), and it may also exist in other versions, but it is difficult to
> check each version individually. I am not sure if all versions handle boundary sizes
> like 2GB/4GB correctly. So I prefer a relatively conservative value: 128MB.
>
> Maybe we could use #ifdef _WIN32 to differentiate the handling between Linux and
> Windows. For Linux, it remains unchanged, while for Windows, it processes by chunks
> with max_chunk_sizeto 1GB.

I don't think it's worth making this Windows-specific. I agree that
it's OK to be a bit conservative, but 128MB seems to me extremely
conservative. I think we could say, for instance, 512MB or 1GB, without
being at much danger of running into broken implementations here.

thanks
-- PMM
Akihiko Odaki July 11, 2024, 7:53 a.m. UTC | #4
On 2024/07/10 17:02, hellord wrote:
> 
>     On Tue, Jul 9, 2024 at 10:39 PM Peter Maydell
>     <peter.maydell@linaro.org <mailto:peter.maydell@linaro.org>> wrote:
> 
> 
>     > +#define MAX_CHUNK_SIZE (128 * 1024 * 1024)
> 
>     I think we could add a comment here, something like:
> 
>     /*
>       * Maximum size to fwrite() to the output file at once;
>       * the MSVCRT runtime will not correctly handle fwrite()
>       * of more than 4GB at once.
>       */
> 
>     That will act as a reminder about why we do it.
> 
> 
> Thanks, I agree.
> 
>     (Does the library fwrite fail for > 4GB, or for >= 4GB ?
>     Your commit message says the former, so I've gone with that,
>     but if it's an "overflows 32 bit variable" kind of bug then
>     4GB exactly probably also doesn't work.)
> 
> 
> 
> It fails for > 4GB.
> The msvcrt.dll!fwrite(buff, (4G+0x1000), 1, file)  works as following:
> (based on assembly, not the original source, irrelevant parts have been 
> omitted)
> 
> size_t fwrite(const void* buff, size_t element_size, size_t 
> element_count, FILE* file_p)
> {
>      size_t size_t_total_size = element_size * element_count;
>      size_t size_t_remain_size = size_t_total_size;
> 
>      unsigned int u32_written_bytes;
>      unsigned int buf_size;
> 
>      /* The register used is r12d but not r12.
>       * So I suspect that Microsoft wrote it as an unsigned int type
>       * (or msvc compiler bug? seems unlikely)
>       */
>      unsigned int u32_chunk_size;
> 
>      while (true) {
> 
>          if ((file_p->flags & 0x10C) != 0) {
>              buf_size = file_p->buf_size;
>          }
>          else {
>              // Always reaches here on the first fwrite() after fopen().
>              buf_size = 4096;  // mov     r15d, 1000h
>          }
> 
>          if (size_t_remain_size > buf_size) {
> 
>              u32_chunk_size = size_t_remain_size;
> 
>              if (buf_size) {
> 
>                  // div ... ;  sub r12d,edx;   size_t stored into r12d , 
> lost high 32 bits
>                  // u32_chunk_size = 0x100000FFF - 0x100000FFF % 0x1000
>                  //                            = 0x100000FFF - 0xFFF
>                  //                            = 0x1 0000 0000
>                  //                            = (u32) 0x1 0000 0000
>                  //                            = 0
>                  u32_chunk_size = size_t_remain_size - 
> (size_t_remain_size % buf_size);
>              }
> 
>              //call _write() with zero size, returns 0
>              u32_written_bytes = __write(file_p, data_buff, u32_chunk_size);
> 
>              // They didn't check if __write() returns 0.
>              if (u32_written_bytes == -1 || u32_written_bytes < 
> u32_chunk_size) {
>                  return (size_t_total_size - size_t_remain_size) / 
> element_size;
>              }
> 
>              //size_t_remain_size -= 0
>              size_t_remain_size -= u32_written_bytes;
>              buff += u32_written_bytes;
> 
>              //size_t_remain_size will never decrease to zero, then 
> while(true) loop forever.
>              if (size_t_remain_size == 0) {
>                  return element_count;
>              }
>              // ...
>          }
>          else {
>              // ...
>          }
>      }
>      // ...
> }
> 
> note:
> 1. The path of buggy msvcrt.dll is c:\windows\system32\msvcrt.dll( 
> mingw64 links to it );
> 2. fwrite implementation in another msvc library which is called 
> ucrtbase.dll is correct(msvc links to it by default).

I don't object to this change but you should use ucrt whenever possible. 
I'm not confident that elf2dmp and other QEMU binaries would work well 
with mvcrt.

I even would like to force users to use ucrt and call setlocale(LC_ALL, 
".UTF8") to fix text encoding, but I haven't done that yet because 
Fedora, which cross-compiles QEMU for CI, still uses msvcrt.

Regards,
Akihiko Odaki
junjiehua July 11, 2024, 4:20 p.m. UTC | #5
>
>
> On Thu, Jul 11, 2024 at 3:53 PM Akihiko Odaki <akihiko.odaki@daynix.com>
> wrote:

    On 2024/07/10 17:02, hellord wrote:
> >
> > note:
> > 1. The path of buggy msvcrt.dll is c:\windows\system32\msvcrt.dll(
> > mingw64 links to it );
> > 2. fwrite implementation in another msvc library which is called
> > ucrtbase.dll is correct(msvc links to it by default).
>
> I don't object to this change but you should use ucrt whenever possible.
> I'm not confident that elf2dmp and other QEMU binaries would work well
> with mvcrt.
>
> I even would like to force users to use ucrt and call setlocale(LC_ALL,
> ".UTF8") to fix text encoding, but I haven't done that yet because
> Fedora, which cross-compiles QEMU for CI, still uses msvcrt.
>


Thanks.

Using ucrt by default (or mandatorily) is a good point, helps avoid other
unknown bugs in msvcrt and is
beneficial for the long-term development of QEMU on Windows.
junjiehua July 11, 2024, 4:24 p.m. UTC | #6
On Thu, Jul 11, 2024 at 12:25 AM Peter Maydell <peter.maydell@linaro.org>
> wrote:

On Wed, 10 Jul 2024 at 09:02, hellord <halouworls@gmail.com> wrote:
> >
> >
> >>
> >>
> >> On Tue, Jul 9, 2024 at 10:39 PM Peter Maydell <peter.maydell@linaro.org>
> wrote:
> >> Is there a particular reason to use 128MB here? If the
> >> runtime only fails on 4GB or more, maybe we should use
> >> a larger MAX_CHUNK_SIZE, like 2GB ?
> >
> >
> > According to current analysis, size <= 4GB all are safe, however there
> are many
> > versions of msvcrt, this bug exists on Server 2008/2019/2022 and Windows
> 11(all
> > with full latest updates), and it may also exist in other versions, but
> it is difficult to
> > check each version individually. I am not sure if all versions handle
> boundary sizes
> > like 2GB/4GB correctly. So I prefer a relatively conservative value:
> 128MB.
> >
> > Maybe we could use #ifdef _WIN32 to differentiate the handling between
> Linux and
> > Windows. For Linux, it remains unchanged, while for Windows, it
> processes by chunks
> > with max_chunk_sizeto 1GB.
>
> I don't think it's worth making this Windows-specific. I agree that
> it's OK to be a bit conservative, but 128MB seems to me extremely
> conservative. I think we could say, for instance, 512MB or 1GB, without
> being at much danger of running into broken implementations here.
>


OK, I will change the max size to 1GB and send patch V2 in the next few
days.
Daniel P. Berrangé July 11, 2024, 4:31 p.m. UTC | #7
On Thu, Jul 11, 2024 at 04:53:50PM +0900, Akihiko Odaki wrote:
> On 2024/07/10 17:02, hellord wrote:
> > 
> > note:
> > 1. The path of buggy msvcrt.dll is c:\windows\system32\msvcrt.dll(
> > mingw64 links to it );
> > 2. fwrite implementation in another msvc library which is called
> > ucrtbase.dll is correct(msvc links to it by default).
> 
> I don't object to this change but you should use ucrt whenever possible. I'm
> not confident that elf2dmp and other QEMU binaries would work well with
> mvcrt.
> 
> I even would like to force users to use ucrt and call setlocale(LC_ALL,
> ".UTF8") to fix text encoding, but I haven't done that yet because Fedora,
> which cross-compiles QEMU for CI, still uses msvcrt.

Our native Windows builds are also validating with msvcrt, and Stefan's
Windows packages for QEMU are also msvcrt.

Users getting QEMU packages from msys can choose whether to pull the
msvcrt build or the ucrt build, but forcing ucrt is a non-starter IMHO.


With regards,
Daniel
junjiehua July 11, 2024, 4:49 p.m. UTC | #8
On Fri, Jul 12, 2024 at 12:31 AM Daniel P. Berrangé <berrange@redhat.com>
> wrote:

On Thu, Jul 11, 2024 at 04:53:50PM +0900, Akihiko Odaki wrote:
> > On 2024/07/10 17:02, hellord wrote:
> > >
> > > note:
> > > 1. The path of buggy msvcrt.dll is c:\windows\system32\msvcrt.dll(
> > > mingw64 links to it );
> > > 2. fwrite implementation in another msvc library which is called
> > > ucrtbase.dll is correct(msvc links to it by default).
> >
> > I don't object to this change but you should use ucrt whenever possible.
> I'm
> > not confident that elf2dmp and other QEMU binaries would work well with
> > mvcrt.
> >
> > I even would like to force users to use ucrt and call setlocale(LC_ALL,
> > ".UTF8") to fix text encoding, but I haven't done that yet because
> Fedora,
> > which cross-compiles QEMU for CI, still uses msvcrt.
>
> Our native Windows builds are also validating with msvcrt, and Stefan's
> Windows packages for QEMU are also msvcrt.
>
> Users getting QEMU packages from msys can choose whether to pull the
> msvcrt build or the ucrt build, but forcing ucrt is a non-starter IMHO.
>
>
I will also submit a ticket to Microsoft to report this bug next week.
I believe they should fix it, as this file is distributed along with the
operating system,
and many of the OS's own components are also linked to it.
Daniel P. Berrangé July 11, 2024, 5:05 p.m. UTC | #9
On Mon, Jul 08, 2024 at 07:25:20PM +0800, junjiehua wrote:
> when building elf2dump with x86_64-w64-mingw32-gcc, fwrite is imported from
> msvcrt.dll. However, the implementation of msvcrt.dll!fwrite is buggy:
> it enters an infinite loop when the size of a single write exceeds 4GB.
> This patch addresses the issue by splitting large physical memory
> blocks into smaller chunks.
> 
> Signed-off-by: junjiehua <junjiehua@tencent.com>
> ---
>  contrib/elf2dmp/main.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
> index d046a72ae6..1994553d95 100644
> --- a/contrib/elf2dmp/main.c
> +++ b/contrib/elf2dmp/main.c
> @@ -23,6 +23,8 @@
>  #define INITIAL_MXCSR   0x1f80
>  #define MAX_NUMBER_OF_RUNS  42
>  
> +#define MAX_CHUNK_SIZE (128 * 1024 * 1024)
> +
>  typedef struct idt_desc {
>      uint16_t offset1;   /* offset bits 0..15 */
>      uint16_t selector;
> @@ -434,13 +436,22 @@ static bool write_dump(struct pa_space *ps,
>  
>      for (i = 0; i < ps->block_nr; i++) {
>          struct pa_block *b = &ps->block[i];
> +        size_t offset = 0;
> +        size_t chunk_size;
>  
>          printf("Writing block #%zu/%zu of %"PRIu64" bytes to file...\n", i,
>                  ps->block_nr, b->size);
> -        if (fwrite(b->addr, b->size, 1, dmp_file) != 1) {
> -            eprintf("Failed to write block\n");
> -            fclose(dmp_file);
> -            return false;
> +
> +        while (offset < b->size) {
> +            chunk_size = (b->size - offset > MAX_CHUNK_SIZE)
> +                         ? MAX_CHUNK_SIZE
> +                         : (b->size - offset);
> +            if (fwrite(b->addr + offset, chunk_size, 1, dmp_file) != 1) {
> +                eprintf("Failed to write block\n");
> +                fclose(dmp_file);
> +                return false;
> +            }
> +            offset += chunk_size;
>          }
>      }

When reading the original ELF file, we don't actually fread() it,
instead we mmap it, using GMappedFile on Windows. Rather than
working around fwrite() bugs, we could do the same for writing
and create a mapped file and just memcpy the data across.


With regards,
Daniel
Akihiko Odaki July 13, 2024, 12:56 p.m. UTC | #10
On 2024/07/12 2:05, Daniel P. Berrangé wrote:
> On Mon, Jul 08, 2024 at 07:25:20PM +0800, junjiehua wrote:
>> when building elf2dump with x86_64-w64-mingw32-gcc, fwrite is imported from
>> msvcrt.dll. However, the implementation of msvcrt.dll!fwrite is buggy:
>> it enters an infinite loop when the size of a single write exceeds 4GB.
>> This patch addresses the issue by splitting large physical memory
>> blocks into smaller chunks.
>>
>> Signed-off-by: junjiehua <junjiehua@tencent.com>
>> ---
>>   contrib/elf2dmp/main.c | 19 +++++++++++++++----
>>   1 file changed, 15 insertions(+), 4 deletions(-)
>>
>> diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
>> index d046a72ae6..1994553d95 100644
>> --- a/contrib/elf2dmp/main.c
>> +++ b/contrib/elf2dmp/main.c
>> @@ -23,6 +23,8 @@
>>   #define INITIAL_MXCSR   0x1f80
>>   #define MAX_NUMBER_OF_RUNS  42
>>   
>> +#define MAX_CHUNK_SIZE (128 * 1024 * 1024)
>> +
>>   typedef struct idt_desc {
>>       uint16_t offset1;   /* offset bits 0..15 */
>>       uint16_t selector;
>> @@ -434,13 +436,22 @@ static bool write_dump(struct pa_space *ps,
>>   
>>       for (i = 0; i < ps->block_nr; i++) {
>>           struct pa_block *b = &ps->block[i];
>> +        size_t offset = 0;
>> +        size_t chunk_size;
>>   
>>           printf("Writing block #%zu/%zu of %"PRIu64" bytes to file...\n", i,
>>                   ps->block_nr, b->size);
>> -        if (fwrite(b->addr, b->size, 1, dmp_file) != 1) {
>> -            eprintf("Failed to write block\n");
>> -            fclose(dmp_file);
>> -            return false;
>> +
>> +        while (offset < b->size) {
>> +            chunk_size = (b->size - offset > MAX_CHUNK_SIZE)
>> +                         ? MAX_CHUNK_SIZE
>> +                         : (b->size - offset);
>> +            if (fwrite(b->addr + offset, chunk_size, 1, dmp_file) != 1) {
>> +                eprintf("Failed to write block\n");
>> +                fclose(dmp_file);
>> +                return false;
>> +            }
>> +            offset += chunk_size;
>>           }
>>       }
> 
> When reading the original ELF file, we don't actually fread() it,
> instead we mmap it, using GMappedFile on Windows. Rather than
> working around fwrite() bugs, we could do the same for writing
> and create a mapped file and just memcpy the data across.

It is a bit more complicated to map a file for writing as we need to 
allocate the file size beforehand. We will also need to extract the 
logic in QEMU_Elf_map(), which also adds another complexity.

Regards,
Akihiko Odaki
diff mbox series

Patch

diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
index d046a72ae6..1994553d95 100644
--- a/contrib/elf2dmp/main.c
+++ b/contrib/elf2dmp/main.c
@@ -23,6 +23,8 @@ 
 #define INITIAL_MXCSR   0x1f80
 #define MAX_NUMBER_OF_RUNS  42
 
+#define MAX_CHUNK_SIZE (128 * 1024 * 1024)
+
 typedef struct idt_desc {
     uint16_t offset1;   /* offset bits 0..15 */
     uint16_t selector;
@@ -434,13 +436,22 @@  static bool write_dump(struct pa_space *ps,
 
     for (i = 0; i < ps->block_nr; i++) {
         struct pa_block *b = &ps->block[i];
+        size_t offset = 0;
+        size_t chunk_size;
 
         printf("Writing block #%zu/%zu of %"PRIu64" bytes to file...\n", i,
                 ps->block_nr, b->size);
-        if (fwrite(b->addr, b->size, 1, dmp_file) != 1) {
-            eprintf("Failed to write block\n");
-            fclose(dmp_file);
-            return false;
+
+        while (offset < b->size) {
+            chunk_size = (b->size - offset > MAX_CHUNK_SIZE)
+                         ? MAX_CHUNK_SIZE
+                         : (b->size - offset);
+            if (fwrite(b->addr + offset, chunk_size, 1, dmp_file) != 1) {
+                eprintf("Failed to write block\n");
+                fclose(dmp_file);
+                return false;
+            }
+            offset += chunk_size;
         }
     }