Message ID | SYYP282MB1246AB8BFC31C4E51C010F228B9D2@SYYP282MB1246.AUSP282.PROD.OUTLOOK.COM |
---|---|
State | New |
Headers | show |
Series | flashrom1.4 segfault in init_eraseblock | expand |
Grant, Thank you so much for posting! And for debugging and suggesting a potential solution. I really like that you unblocked yourself, and I also want to get to the root cause. From looking at the code, it seemed like a real issue, and after some time of debugging, I think this is a real issue. But the big question for me was: why no one noticed before? For context, this code path has been enabled by default since May 2023 , and since that time we have had lots of contributions and new development, and I was thinking how could it be that no one hit the issue before? I can't see your values in your debugger on erasure_layout.c line 55, but I did my best to reproduce. For debugging I used Test case #8 from tests/erase_func_algo.c (set args 'Erase test case #8' in gdb in case someone wants to run) and I think I reproed for eraseblock of size 8. subedata went out of bounds! it was pointing to a memory it shouldn't be , like this {start_addr = 4025479151, end_addr = 4025479151, selected = 239, block_num = 0, first_sub_block_index = 5249, last_sub_block_index = 0} but it wasn't segfaulting , and those random large numbers in start_addr and end_addr made the loop condition false and the program kept running. without segfaults! I think your patch may be valid, and please don't throw it away, maybe you will send it as a patch. Can it be the difference between Linux and OpenBSD? It's true that most developers work on Linux, also from what I remember OpenBSD has more strict memory protections. Is it possible that OpenBSD segfaults in the same code where Linux keeps running? On Fri, Sep 6, 2024 at 3:58 AM Grant Pannell via flashrom <flashrom@flashrom.org> wrote: > > Hi, > > Trying to report a bug in flashrom 1.4. I'm not subject matter expert, but I've done my best to try and debug and fix the issue. Seeking guidance from the experts on whether this is the solution :-) > > I found that in 1.4, it looks like the erase and write logic was refactored. Since then, I get segfaults when trying to flash my coreboot image. > > The chip is a Macronix MX25U6435E in a Protectli FW4B. I'm running on OpenBSD 7.5. > > The segfault occurs in erasure_layout.c: init_eraseblock line 55. I am no expert on the internals of flashrom, or what's going on here...but I've narrowed it down to segfaulting at the last iteration of this while loop. Code in question looks like: > > edata->first_sub_block_index = *sub_block_index; > struct eraseblock_data *subedata = &layout[idx - 1].layout_list[*sub_block_index]; > while (subedata->start_addr >= start_addr && subedata->end_addr <= end_addr && > *sub_block_index < layout[idx-1].block_count) { > (*sub_block_index)++; > subedata++; > } > > In my case, it seems that the last iteration looks like: > > layout[idx-1].block_count == 2048 > *sub_block_index == 2047 > subedata->end_addr == end_addr > > What then happens is, the variable "subedata" is incremented and the while condition is checked, but subedata is now out of bounds and the application segfaults. I'm pretty sure the while loop shouldn't iterate again because the next iteration would fail the *sub_block_index < layout[idx-1].block_count check (2048 < 2028). I solved this by short circuiting the while condition and checking that condition first, so that subedata is not accessed and flashrom successfully flashes my coreboot image. Patch included below. Is this an appropriate fix? > > Thank you, > > Grant > > --- erasure_layout.c.orig > +++ erasure_layout.c > @@ -52,8 +52,8 @@ > > edata->first_sub_block_index = *sub_block_index; > struct eraseblock_data *subedata = &layout[idx - 1].layout_list[*sub_block_index]; > - while (subedata->start_addr >= start_addr && subedata->end_addr <= end_addr && > - *sub_block_index < layout[idx-1].block_count) { > + while (*sub_block_index < layout[idx-1].block_count && > + subedata->start_addr >= start_addr && subedata->end_addr <= end_addr) { > (*sub_block_index)++; > subedata++; > } > _______________________________________________ > flashrom mailing list -- flashrom@flashrom.org > To unsubscribe send an email to flashrom-leave@flashrom.org
Hi there! Thanks for looking. I just also submitted a patch via Gerrit -- hopefully, I did that right. I'm not too sure of technicalities. Could this be a compiler flag? I suspect you are right that OpenBSD has more strict memory protections. As for why it has not been noticed, I think I can explain that. Flashrom in OpenBSD's ports isn't updated to 1.4.... it's still pointing to 1.2. Unfortunately. I'm not the maintainer, so I have been manually updating the port for my own purposes. As for Linux, LTS distros are still mostly packaging 1.3, so that is probably also a contributor.....if it ever does pop its head up on Linux. Additionally, until recently, pciutils's OpenBSD support was lacking. I submitted support for this a few months ago and support has been merged. So support for internal flashing on OpenBSD has been lacking. On top of all that, to actually flash on OpenBSD, you need to be in an insecure kernel mode (at very early boot), or single-user mode, since access to /dev/mem gets locked down. I'm sure the combination of all these factors narrows users down quite a bit :-) Hopefully I submitted the patch correctly. Thanks again -----Original Message----- From: Anastasia Klimchuk <aklm@chromium.org> Sent: Friday, 6 September 2024 11:13 PM To: Grant Pannell <grant@digitaldj.net> Cc: flashrom@flashrom.org Subject: Re: [flashrom] flashrom1.4 segfault in init_eraseblock [You don't often get email from aklm@chromium.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] Grant, Thank you so much for posting! And for debugging and suggesting a potential solution. I really like that you unblocked yourself, and I also want to get to the root cause. From looking at the code, it seemed like a real issue, and after some time of debugging, I think this is a real issue. But the big question for me was: why no one noticed before? For context, this code path has been enabled by default since May 2023 , and since that time we have had lots of contributions and new development, and I was thinking how could it be that no one hit the issue before? I can't see your values in your debugger on erasure_layout.c line 55, but I did my best to reproduce. For debugging I used Test case #8 from tests/erase_func_algo.c (set args 'Erase test case #8' in gdb in case someone wants to run) and I think I reproed for eraseblock of size 8. subedata went out of bounds! it was pointing to a memory it shouldn't be , like this {start_addr = 4025479151, end_addr = 4025479151, selected = 239, block_num = 0, first_sub_block_index = 5249, last_sub_block_index = 0} but it wasn't segfaulting , and those random large numbers in start_addr and end_addr made the loop condition false and the program kept running. without segfaults! I think your patch may be valid, and please don't throw it away, maybe you will send it as a patch. Can it be the difference between Linux and OpenBSD? It's true that most developers work on Linux, also from what I remember OpenBSD has more strict memory protections. Is it possible that OpenBSD segfaults in the same code where Linux keeps running? On Fri, Sep 6, 2024 at 3:58 AM Grant Pannell via flashrom <flashrom@flashrom.org> wrote: > > Hi, > > Trying to report a bug in flashrom 1.4. I'm not subject matter expert, > but I've done my best to try and debug and fix the issue. Seeking > guidance from the experts on whether this is the solution :-) > > I found that in 1.4, it looks like the erase and write logic was refactored. Since then, I get segfaults when trying to flash my coreboot image. > > The chip is a Macronix MX25U6435E in a Protectli FW4B. I'm running on OpenBSD 7.5. > > The segfault occurs in erasure_layout.c: init_eraseblock line 55. I am no expert on the internals of flashrom, or what's going on here...but I've narrowed it down to segfaulting at the last iteration of this while loop. Code in question looks like: > > edata->first_sub_block_index = *sub_block_index; > struct eraseblock_data *subedata = &layout[idx - > 1].layout_list[*sub_block_index]; while (subedata->start_addr >= start_addr && subedata->end_addr <= end_addr && > *sub_block_index < layout[idx-1].block_count) { > (*sub_block_index)++; > subedata++; > } > > In my case, it seems that the last iteration looks like: > > layout[idx-1].block_count == 2048 > *sub_block_index == 2047 > subedata->end_addr == end_addr > > What then happens is, the variable "subedata" is incremented and the while condition is checked, but subedata is now out of bounds and the application segfaults. I'm pretty sure the while loop shouldn't iterate again because the next iteration would fail the *sub_block_index < layout[idx-1].block_count check (2048 < 2028). I solved this by short circuiting the while condition and checking that condition first, so that subedata is not accessed and flashrom successfully flashes my coreboot image. Patch included below. Is this an appropriate fix? > > Thank you, > > Grant > > --- erasure_layout.c.orig > +++ erasure_layout.c > @@ -52,8 +52,8 @@ > > edata->first_sub_block_index = *sub_block_index; > struct eraseblock_data *subedata = &layout[idx - 1].layout_list[*sub_block_index]; > - while (subedata->start_addr >= start_addr && subedata->end_addr <= end_addr && > - *sub_block_index < layout[idx-1].block_count) { > + while (*sub_block_index < layout[idx-1].block_count && > + subedata->start_addr >= start_addr && > + subedata->end_addr <= end_addr) { > (*sub_block_index)++; > subedata++; > } > _______________________________________________ > flashrom mailing list -- flashrom@flashrom.org To unsubscribe send an > email to flashrom-leave@flashrom.org -- Anastasia.
I should also note....I had no issues on the same setup but a different flash chip. That is why I was a little confused and not quite sure if this is the full solution. Protectli VP2410 - looks like that's a MX25U6473F instead of the MX25U6435E in the FW4B I am having issues with. Honestly, I haven't gone that deep into the code to figure out how flashrom actually works. I guess with the chip that works, the block count must be higher, but there is a lower end_addr restriction somewhere, so the pointer is still valid, I guess. Anyway, food for thought. -----Original Message----- From: Grant Pannell via flashrom <flashrom@flashrom.org> Sent: Friday, 6 September 2024 11:30 PM To: Anastasia Klimchuk <aklm@chromium.org> Cc: flashrom@flashrom.org Subject: [flashrom] Re: flashrom1.4 segfault in init_eraseblock Hi there! Thanks for looking. I just also submitted a patch via Gerrit -- hopefully, I did that right. I'm not too sure of technicalities. Could this be a compiler flag? I suspect you are right that OpenBSD has more strict memory protections. As for why it has not been noticed, I think I can explain that. Flashrom in OpenBSD's ports isn't updated to 1.4.... it's still pointing to 1.2. Unfortunately. I'm not the maintainer, so I have been manually updating the port for my own purposes. As for Linux, LTS distros are still mostly packaging 1.3, so that is probably also a contributor.....if it ever does pop its head up on Linux. Additionally, until recently, pciutils's OpenBSD support was lacking. I submitted support for this a few months ago and support has been merged. So support for internal flashing on OpenBSD has been lacking. On top of all that, to actually flash on OpenBSD, you need to be in an insecure kernel mode (at very early boot), or single-user mode, since access to /dev/mem gets locked down. I'm sure the combination of all these factors narrows users down quite a bit :-) Hopefully I submitted the patch correctly. Thanks again -----Original Message----- From: Anastasia Klimchuk <aklm@chromium.org> Sent: Friday, 6 September 2024 11:13 PM To: Grant Pannell <grant@digitaldj.net> Cc: flashrom@flashrom.org Subject: Re: [flashrom] flashrom1.4 segfault in init_eraseblock [You don't often get email from aklm@chromium.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] Grant, Thank you so much for posting! And for debugging and suggesting a potential solution. I really like that you unblocked yourself, and I also want to get to the root cause. From looking at the code, it seemed like a real issue, and after some time of debugging, I think this is a real issue. But the big question for me was: why no one noticed before? For context, this code path has been enabled by default since May 2023 , and since that time we have had lots of contributions and new development, and I was thinking how could it be that no one hit the issue before? I can't see your values in your debugger on erasure_layout.c line 55, but I did my best to reproduce. For debugging I used Test case #8 from tests/erase_func_algo.c (set args 'Erase test case #8' in gdb in case someone wants to run) and I think I reproed for eraseblock of size 8. subedata went out of bounds! it was pointing to a memory it shouldn't be , like this {start_addr = 4025479151, end_addr = 4025479151, selected = 239, block_num = 0, first_sub_block_index = 5249, last_sub_block_index = 0} but it wasn't segfaulting , and those random large numbers in start_addr and end_addr made the loop condition false and the program kept running. without segfaults! I think your patch may be valid, and please don't throw it away, maybe you will send it as a patch. Can it be the difference between Linux and OpenBSD? It's true that most developers work on Linux, also from what I remember OpenBSD has more strict memory protections. Is it possible that OpenBSD segfaults in the same code where Linux keeps running? On Fri, Sep 6, 2024 at 3:58 AM Grant Pannell via flashrom <flashrom@flashrom.org> wrote: > > Hi, > > Trying to report a bug in flashrom 1.4. I'm not subject matter expert, > but I've done my best to try and debug and fix the issue. Seeking > guidance from the experts on whether this is the solution :-) > > I found that in 1.4, it looks like the erase and write logic was refactored. Since then, I get segfaults when trying to flash my coreboot image. > > The chip is a Macronix MX25U6435E in a Protectli FW4B. I'm running on OpenBSD 7.5. > > The segfault occurs in erasure_layout.c: init_eraseblock line 55. I am no expert on the internals of flashrom, or what's going on here...but I've narrowed it down to segfaulting at the last iteration of this while loop. Code in question looks like: > > edata->first_sub_block_index = *sub_block_index; > struct eraseblock_data *subedata = &layout[idx - > 1].layout_list[*sub_block_index]; while (subedata->start_addr >= start_addr && subedata->end_addr <= end_addr && > *sub_block_index < layout[idx-1].block_count) { > (*sub_block_index)++; > subedata++; > } > > In my case, it seems that the last iteration looks like: > > layout[idx-1].block_count == 2048 > *sub_block_index == 2047 > subedata->end_addr == end_addr > > What then happens is, the variable "subedata" is incremented and the while condition is checked, but subedata is now out of bounds and the application segfaults. I'm pretty sure the while loop shouldn't iterate again because the next iteration would fail the *sub_block_index < layout[idx-1].block_count check (2048 < 2028). I solved this by short circuiting the while condition and checking that condition first, so that subedata is not accessed and flashrom successfully flashes my coreboot image. Patch included below. Is this an appropriate fix? > > Thank you, > > Grant > > --- erasure_layout.c.orig > +++ erasure_layout.c > @@ -52,8 +52,8 @@ > > edata->first_sub_block_index = *sub_block_index; > struct eraseblock_data *subedata = &layout[idx - 1].layout_list[*sub_block_index]; > - while (subedata->start_addr >= start_addr && subedata->end_addr <= end_addr && > - *sub_block_index < layout[idx-1].block_count) { > + while (*sub_block_index < layout[idx-1].block_count && > + subedata->start_addr >= start_addr && > + subedata->end_addr <= end_addr) { > (*sub_block_index)++; > subedata++; > } > _______________________________________________ > flashrom mailing list -- flashrom@flashrom.org To unsubscribe send an > email to flashrom-leave@flashrom.org -- Anastasia.
> I just also submitted a patch via Gerrit -- hopefully, I did that right. You did great, welcome to flashrom development! For people who are interested, this is the patch https://review.coreboot.org/c/flashrom/+/84234 Yes, I am guessing there may be a compiler flag somewhere which would achieve the same strict memory checks. We need to run with this flag one day, and then fix all 100 thousand warnings :) > As for why it has not been noticed, I think I can explain that. Flashrom in OpenBSD's ports isn't updated to 1.4.... it's still pointing to 1.2. Unfortunately. I'm not the maintainer, so I have been manually updating the port for my own purposes. As for Linux, LTS distros are still mostly packaging 1.3, so that is probably also a contributor.....if it ever does pop its head up on Linux. I see what you mean. I understand there might be delays with packaging, or anything can be happening with packaging to distros. Whom I was thinking about, are developers who are regularly building from the latest source, adding features, or adding support for new devices. From that point of view, release is just a tag on the branch. About packaging, if OpenBSD skipped two releases already, now it seems a good idea to wait for a little bit more and package the next one? We plan one more release later this year (no exact timeline yet), but it will include your fix! > I should also note....I had no issues on the same setup but a different flash chip. That is why I was a little confused and not quite sure if this is the full solution. > > Protectli VP2410 - looks like that's a MX25U6473F instead of the MX25U6435E in the FW4B I am having issues with. This is interesting information. I checked and we actually don't have a definition for MX25U6473F, but if it worked this means it shares the device ID with some other model for which we do have a definition. And it's close enough so it worked. What is important for this bug is the eraseblocks layout, with a different layout maybe the bug is not repro? But anyway, you found a real bug. On Sat, Sep 7, 2024 at 12:51 AM Grant Pannell <grant@pannell.net.au> wrote: > > I should also note....I had no issues on the same setup but a different flash chip. That is why I was a little confused and not quite sure if this is the full solution. > > Protectli VP2410 - looks like that's a MX25U6473F instead of the MX25U6435E in the FW4B I am having issues with. > > Honestly, I haven't gone that deep into the code to figure out how flashrom actually works. I guess with the chip that works, the block count must be higher, but there is a lower end_addr restriction somewhere, so the pointer is still valid, I guess. > > Anyway, food for thought. > > -----Original Message----- > From: Grant Pannell via flashrom <flashrom@flashrom.org> > Sent: Friday, 6 September 2024 11:30 PM > To: Anastasia Klimchuk <aklm@chromium.org> > Cc: flashrom@flashrom.org > Subject: [flashrom] Re: flashrom1.4 segfault in init_eraseblock > > Hi there! > > Thanks for looking. I just also submitted a patch via Gerrit -- hopefully, I did that right. > > I'm not too sure of technicalities. Could this be a compiler flag? I suspect you are right that OpenBSD has more strict memory protections. > > As for why it has not been noticed, I think I can explain that. Flashrom in OpenBSD's ports isn't updated to 1.4.... it's still pointing to 1.2. Unfortunately. I'm not the maintainer, so I have been manually updating the port for my own purposes. As for Linux, LTS distros are still mostly packaging 1.3, so that is probably also a contributor.....if it ever does pop its head up on Linux. > > Additionally, until recently, pciutils's OpenBSD support was lacking. I submitted support for this a few months ago and support has been merged. So support for internal flashing on OpenBSD has been lacking. > > On top of all that, to actually flash on OpenBSD, you need to be in an insecure kernel mode (at very early boot), or single-user mode, since access to /dev/mem gets locked down. I'm sure the combination of all these factors narrows users down quite a bit :-) > > Hopefully I submitted the patch correctly. Thanks again > > -----Original Message----- > From: Anastasia Klimchuk <aklm@chromium.org> > Sent: Friday, 6 September 2024 11:13 PM > To: Grant Pannell <grant@digitaldj.net> > Cc: flashrom@flashrom.org > Subject: Re: [flashrom] flashrom1.4 segfault in init_eraseblock > > [You don't often get email from aklm@chromium.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > > Grant, > > Thank you so much for posting! And for debugging and suggesting a potential solution. I really like that you unblocked yourself, and I also want to get to the root cause. > > From looking at the code, it seemed like a real issue, and after some time of debugging, I think this is a real issue. But the big question for me was: why no one noticed before? > For context, this code path has been enabled by default since May 2023 , and since that time we have had lots of contributions and new development, and I was thinking how could it be that no one hit the issue before? > > I can't see your values in your debugger on erasure_layout.c line 55, but I did my best to reproduce. For debugging I used Test case #8 from tests/erase_func_algo.c (set args 'Erase test case #8' in gdb in case someone wants to run) and I think I reproed for eraseblock of size 8. > subedata went out of bounds! it was pointing to a memory it shouldn't be , like this > > {start_addr = 4025479151, end_addr = 4025479151, selected = 239, block_num = 0, first_sub_block_index = 5249, last_sub_block_index = 0} > > but it wasn't segfaulting , and those random large numbers in start_addr and end_addr made the loop condition false and the program kept running. without segfaults! > > I think your patch may be valid, and please don't throw it away, maybe you will send it as a patch. > > Can it be the difference between Linux and OpenBSD? It's true that most developers work on Linux, also from what I remember OpenBSD has more strict memory protections. Is it possible that OpenBSD segfaults in the same code where Linux keeps running? > > On Fri, Sep 6, 2024 at 3:58 AM Grant Pannell via flashrom <flashrom@flashrom.org> wrote: > > > > Hi, > > > > Trying to report a bug in flashrom 1.4. I'm not subject matter expert, > > but I've done my best to try and debug and fix the issue. Seeking > > guidance from the experts on whether this is the solution :-) > > > > I found that in 1.4, it looks like the erase and write logic was refactored. Since then, I get segfaults when trying to flash my coreboot image. > > > > The chip is a Macronix MX25U6435E in a Protectli FW4B. I'm running on OpenBSD 7.5. > > > > The segfault occurs in erasure_layout.c: init_eraseblock line 55. I am no expert on the internals of flashrom, or what's going on here...but I've narrowed it down to segfaulting at the last iteration of this while loop. Code in question looks like: > > > > edata->first_sub_block_index = *sub_block_index; > > struct eraseblock_data *subedata = &layout[idx - > > 1].layout_list[*sub_block_index]; while (subedata->start_addr >= start_addr && subedata->end_addr <= end_addr && > > *sub_block_index < layout[idx-1].block_count) { > > (*sub_block_index)++; > > subedata++; > > } > > > > In my case, it seems that the last iteration looks like: > > > > layout[idx-1].block_count == 2048 > > *sub_block_index == 2047 > > subedata->end_addr == end_addr > > > > What then happens is, the variable "subedata" is incremented and the while condition is checked, but subedata is now out of bounds and the application segfaults. I'm pretty sure the while loop shouldn't iterate again because the next iteration would fail the *sub_block_index < layout[idx-1].block_count check (2048 < 2028). I solved this by short circuiting the while condition and checking that condition first, so that subedata is not accessed and flashrom successfully flashes my coreboot image. Patch included below. Is this an appropriate fix? > > > > Thank you, > > > > Grant > > > > --- erasure_layout.c.orig > > +++ erasure_layout.c > > @@ -52,8 +52,8 @@ > > > > edata->first_sub_block_index = *sub_block_index; > > struct eraseblock_data *subedata = &layout[idx - 1].layout_list[*sub_block_index]; > > - while (subedata->start_addr >= start_addr && subedata->end_addr <= end_addr && > > - *sub_block_index < layout[idx-1].block_count) { > > + while (*sub_block_index < layout[idx-1].block_count && > > + subedata->start_addr >= start_addr && > > + subedata->end_addr <= end_addr) { > > (*sub_block_index)++; > > subedata++; > > } > > _______________________________________________ > > flashrom mailing list -- flashrom@flashrom.org To unsubscribe send an > > email to flashrom-leave@flashrom.org > > > > -- > Anastasia. > _______________________________________________ > flashrom mailing list -- flashrom@flashrom.org To unsubscribe send an email to flashrom-leave@flashrom.org
--- erasure_layout.c.orig +++ erasure_layout.c @@ -52,8 +52,8 @@ edata->first_sub_block_index = *sub_block_index; struct eraseblock_data *subedata = &layout[idx - 1].layout_list[*sub_block_index]; - while (subedata->start_addr >= start_addr && subedata->end_addr <= end_addr && - *sub_block_index < layout[idx-1].block_count) { + while (*sub_block_index < layout[idx-1].block_count && + subedata->start_addr >= start_addr && subedata->end_addr <= end_addr) { (*sub_block_index)++; subedata++; }