Message ID | 1545539756-16815-1-git-send-email-james.hilliard1@gmail.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [1/1] archive handler: set locale for libarchive | expand |
Hi James, (you get a Christmas' review): On 23/12/18 05:35, james.hilliard1@gmail.com wrote: > From: James Hilliard <james.hilliard1@gmail.com> > > Signed-off-by: James Hilliard <james.hilliard1@gmail.com> There is not a full description of the problem in the commit message. Anyway, I am not sure if the patch is really addressing the issue. The cause as reported by several sources is that libarchive does not handle correctly the filenames. The expectation is that the issue is fixed than in libarchive and not in SWUpdate. Anyway, it will help if there is a real example with filenames that cause the issue to happens. Can you provide such a case for this? Any program is started with an implicit set to "C" as locale. If SWUpdate should able to work with locale settings, I guess it is enough to drop the LC_ALL set. Can you check if this small patch solve your issue ? diff --git a/core/swupdate.c b/core/swupdate.c index c59a258..7ff8138 100644 --- a/core/swupdate.c +++ b/core/swupdate.c @@ -24,6 +24,7 @@ #include <signal.h> #include <sys/wait.h> #include <ftw.h> +#include <locale.h> #include "bsdqueue.h" #include "cpiohdr.h" @@ -630,6 +631,15 @@ int main(int argc, char **argv) */ notify_init(); + /* + * Enable system locale - change from the standard (C) to system locale. + * This allows libarchive (in case it is activated) to handle filenames. + * See on libarchive Websiete for a more complete description of the issue: + * https://github.com/libarchive/libarchive/issues/587 + * https://github.com/libarchive/libarchive/wiki/Filenames + */ + setlocale(LC_ALL, ""); + /* * Check if there is a configuration file and parse it * Parse once the command line just to find if a > --- > handlers/archive_handler.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/handlers/archive_handler.c b/handlers/archive_handler.c > index a44819d..42bbfbb 100644 > --- a/handlers/archive_handler.c > +++ b/handlers/archive_handler.c > @@ -6,6 +6,7 @@ > */ > > #include <sys/types.h> > +#include <locale.h> > #include <stdio.h> > #include <sys/stat.h> > #include <unistd.h> > @@ -75,6 +76,10 @@ extract(void *p) > struct extract_data *data = (struct extract_data *)p; > flags = data->flags; > > + char *LANG = getenv("LC_ALL"); > + if (!LANG) LANG = getenv("LC_CTYPE"); > + if (!LANG) LANG = getenv("LANG"); > + setlocale(LC_CTYPE, LANG ? LANG : ""); > a = archive_read_new(); > ext = archive_write_disk_new(); > archive_write_disk_set_options(ext, flags); > @@ -97,6 +102,7 @@ extract(void *p) > if ((r = archive_read_open_filename(a, FIFO, 4096))) { > ERROR("archive_read_open_filename(): %s %d", > archive_error_string(a), r); > + setlocale(LC_CTYPE, "C"); This does not set back if it was set before. It works for you because LC_TYPE *was* "C". > pthread_exit((void *)-1); > } > for (;;) { > @@ -106,6 +112,7 @@ extract(void *p) > if (r != ARCHIVE_OK) { > ERROR("archive_read_next_header(): %s %d", > archive_error_string(a), 1); > + setlocale(LC_CTYPE, "C"); > pthread_exit((void *)-1); > } > > @@ -122,6 +129,7 @@ extract(void *p) > if (r != ARCHIVE_OK) { > ERROR("archive_write_finish_entry(): %s", > archive_error_string(ext)); > + setlocale(LC_CTYPE, "C"); > pthread_exit((void *)-1); > } > } > @@ -130,6 +138,7 @@ extract(void *p) > > archive_read_close(a); > archive_read_free(a); > + setlocale(LC_CTYPE, "C"); > pthread_exit((void *)0); > } > > Best regards, Stefano Babic
On Tue, Dec 25, 2018 at 6:25 AM Stefano Babic <sbabic@denx.de> wrote: > > Hi James, > > (you get a Christmas' review): > > On 23/12/18 05:35, james.hilliard1@gmail.com wrote: > > From: James Hilliard <james.hilliard1@gmail.com> > > > > Signed-off-by: James Hilliard <james.hilliard1@gmail.com> > > There is not a full description of the problem in the commit message. > > Anyway, I am not sure if the patch is really addressing the issue. The > cause as reported by several sources is that libarchive does not handle > correctly the filenames. The expectation is that the issue is fixed than > in libarchive and not in SWUpdate. Well it seems that unless libarchive is significantly refactored one will need to have a valid UTF-8 system locale in order for this to work. Since it's apparently not a good idea to call setlocale in a library I don't expect it to be fixed there any time soon. > > Anyway, it will help if there is a real example with filenames that > cause the issue to happens. Can you provide such a case for this? It appears to always happen in my builds and I don't think I had any non-ascii filenames either. I'm using buildroot to generate everything. > > Any program is started with an implicit set to "C" as locale. If > SWUpdate should able to work with locale settings, I guess it is enough > to drop the LC_ALL set. > > Can you check if this small patch solve your issue ? I had already tried that exact same thing before and it didn't work although I do agree that the docs indicate it should have, for some reason that I haven't figured out yet swupdate fails to pick up the system locale when doing "setlocale(LC_ALL, "")" by itself so it's required to check the environment first(other applications such as busybox do exactly this as well). > > diff --git a/core/swupdate.c b/core/swupdate.c > index c59a258..7ff8138 100644 > --- a/core/swupdate.c > +++ b/core/swupdate.c > @@ -24,6 +24,7 @@ > #include <signal.h> > #include <sys/wait.h> > #include <ftw.h> > +#include <locale.h> > > #include "bsdqueue.h" > #include "cpiohdr.h" > @@ -630,6 +631,15 @@ int main(int argc, char **argv) > */ > notify_init(); > > + /* > + * Enable system locale - change from the standard (C) to system > locale. > + * This allows libarchive (in case it is activated) to handle > filenames. > + * See on libarchive Websiete for a more complete description of > the issue: > + * https://github.com/libarchive/libarchive/issues/587 > + * https://github.com/libarchive/libarchive/wiki/Filenames > + */ > + setlocale(LC_ALL, ""); > + > /* > * Check if there is a configuration file and parse it > * Parse once the command line just to find if a > > > > --- > > handlers/archive_handler.c | 9 +++++++++ > > 1 file changed, 9 insertions(+) > > > > diff --git a/handlers/archive_handler.c b/handlers/archive_handler.c > > index a44819d..42bbfbb 100644 > > --- a/handlers/archive_handler.c > > +++ b/handlers/archive_handler.c > > @@ -6,6 +6,7 @@ > > */ > > > > #include <sys/types.h> > > +#include <locale.h> > > #include <stdio.h> > > #include <sys/stat.h> > > #include <unistd.h> > > @@ -75,6 +76,10 @@ extract(void *p) > > struct extract_data *data = (struct extract_data *)p; > > flags = data->flags; > > > > + char *LANG = getenv("LC_ALL"); > > + if (!LANG) LANG = getenv("LC_CTYPE"); > > + if (!LANG) LANG = getenv("LANG"); > > + setlocale(LC_CTYPE, LANG ? LANG : ""); > > a = archive_read_new(); > > ext = archive_write_disk_new(); > > archive_write_disk_set_options(ext, flags); > > @@ -97,6 +102,7 @@ extract(void *p) > > if ((r = archive_read_open_filename(a, FIFO, 4096))) { > > ERROR("archive_read_open_filename(): %s %d", > > archive_error_string(a), r); > > + setlocale(LC_CTYPE, "C"); > > This does not set back if it was set before. It works for you because > LC_TYPE *was* "C". I didn't notice any issues when not setting it back but did this to reduce the chance of bugs caused by having an unexpected locale in other parts of the code. Doesn't swupdate always start with a "C" local initially? > > > pthread_exit((void *)-1); > > } > > for (;;) { > > @@ -106,6 +112,7 @@ extract(void *p) > > if (r != ARCHIVE_OK) { > > ERROR("archive_read_next_header(): %s %d", > > archive_error_string(a), 1); > > + setlocale(LC_CTYPE, "C"); > > pthread_exit((void *)-1); > > } > > > > @@ -122,6 +129,7 @@ extract(void *p) > > if (r != ARCHIVE_OK) { > > ERROR("archive_write_finish_entry(): %s", > > archive_error_string(ext)); > > + setlocale(LC_CTYPE, "C"); > > pthread_exit((void *)-1); > > } > > } > > @@ -130,6 +138,7 @@ extract(void *p) > > > > archive_read_close(a); > > archive_read_free(a); > > + setlocale(LC_CTYPE, "C"); > > pthread_exit((void *)0); > > } > > > > > > Best regards, > Stefano Babic > > > -- > ===================================================================== > DENX Software Engineering GmbH, Managing Director: Wolfgang Denk > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany > Phone: +49-8142-66989-53 Fax: +49-8142-66989-80 Email: sbabic@denx.de > =====================================================================
On Tue, Dec 25, 2018 at 2:51 PM James Hilliard <james.hilliard1@gmail.com> wrote: > > On Tue, Dec 25, 2018 at 6:25 AM Stefano Babic <sbabic@denx.de> wrote: > > > > Hi James, > > > > (you get a Christmas' review): > > > > On 23/12/18 05:35, james.hilliard1@gmail.com wrote: > > > From: James Hilliard <james.hilliard1@gmail.com> > > > > > > Signed-off-by: James Hilliard <james.hilliard1@gmail.com> > > > > There is not a full description of the problem in the commit message. > > > > Anyway, I am not sure if the patch is really addressing the issue. The > > cause as reported by several sources is that libarchive does not handle > > correctly the filenames. The expectation is that the issue is fixed than > > in libarchive and not in SWUpdate. > Well it seems that unless libarchive is significantly refactored one will > need to have a valid UTF-8 system locale in order for this to work. > Since it's apparently not a good idea to call setlocale in a library I don't > expect it to be fixed there any time soon. > > > > Anyway, it will help if there is a real example with filenames that > > cause the issue to happens. Can you provide such a case for this? > It appears to always happen in my builds and I don't think I had any > non-ascii filenames either. I'm using buildroot to generate everything. > > > > Any program is started with an implicit set to "C" as locale. If > > SWUpdate should able to work with locale settings, I guess it is enough > > to drop the LC_ALL set. > > > > Can you check if this small patch solve your issue ? > I had already tried that exact same thing before and it didn't work although > I do agree that the docs indicate it should have, for some reason that I > haven't figured out yet swupdate fails to pick up the system locale when > doing "setlocale(LC_ALL, "")" by itself so it's required to check the > environment first(other applications such as busybox do exactly this as well). Here's the example from busybox for reference https://github.com/mirror/busybox/blob/349d72c19ced4fae64e8fdd5792b37e78ac2f616/libbb/unicode.c#L57-L60 > > > > diff --git a/core/swupdate.c b/core/swupdate.c > > index c59a258..7ff8138 100644 > > --- a/core/swupdate.c > > +++ b/core/swupdate.c > > @@ -24,6 +24,7 @@ > > #include <signal.h> > > #include <sys/wait.h> > > #include <ftw.h> > > +#include <locale.h> > > > > #include "bsdqueue.h" > > #include "cpiohdr.h" > > @@ -630,6 +631,15 @@ int main(int argc, char **argv) > > */ > > notify_init(); > > > > + /* > > + * Enable system locale - change from the standard (C) to system > > locale. > > + * This allows libarchive (in case it is activated) to handle > > filenames. > > + * See on libarchive Websiete for a more complete description of > > the issue: > > + * https://github.com/libarchive/libarchive/issues/587 > > + * https://github.com/libarchive/libarchive/wiki/Filenames > > + */ > > + setlocale(LC_ALL, ""); > > + > > /* > > * Check if there is a configuration file and parse it > > * Parse once the command line just to find if a > > > > > > > --- > > > handlers/archive_handler.c | 9 +++++++++ > > > 1 file changed, 9 insertions(+) > > > > > > diff --git a/handlers/archive_handler.c b/handlers/archive_handler.c > > > index a44819d..42bbfbb 100644 > > > --- a/handlers/archive_handler.c > > > +++ b/handlers/archive_handler.c > > > @@ -6,6 +6,7 @@ > > > */ > > > > > > #include <sys/types.h> > > > +#include <locale.h> > > > #include <stdio.h> > > > #include <sys/stat.h> > > > #include <unistd.h> > > > @@ -75,6 +76,10 @@ extract(void *p) > > > struct extract_data *data = (struct extract_data *)p; > > > flags = data->flags; > > > > > > + char *LANG = getenv("LC_ALL"); > > > + if (!LANG) LANG = getenv("LC_CTYPE"); > > > + if (!LANG) LANG = getenv("LANG"); > > > + setlocale(LC_CTYPE, LANG ? LANG : ""); > > > a = archive_read_new(); > > > ext = archive_write_disk_new(); > > > archive_write_disk_set_options(ext, flags); > > > @@ -97,6 +102,7 @@ extract(void *p) > > > if ((r = archive_read_open_filename(a, FIFO, 4096))) { > > > ERROR("archive_read_open_filename(): %s %d", > > > archive_error_string(a), r); > > > + setlocale(LC_CTYPE, "C"); > > > > This does not set back if it was set before. It works for you because > > LC_TYPE *was* "C". > I didn't notice any issues when not setting it back but did this to reduce the > chance of bugs caused by having an unexpected locale in other parts of the > code. Doesn't swupdate always start with a "C" local initially? > > > > > pthread_exit((void *)-1); > > > } > > > for (;;) { > > > @@ -106,6 +112,7 @@ extract(void *p) > > > if (r != ARCHIVE_OK) { > > > ERROR("archive_read_next_header(): %s %d", > > > archive_error_string(a), 1); > > > + setlocale(LC_CTYPE, "C"); > > > pthread_exit((void *)-1); > > > } > > > > > > @@ -122,6 +129,7 @@ extract(void *p) > > > if (r != ARCHIVE_OK) { > > > ERROR("archive_write_finish_entry(): %s", > > > archive_error_string(ext)); > > > + setlocale(LC_CTYPE, "C"); > > > pthread_exit((void *)-1); > > > } > > > } > > > @@ -130,6 +138,7 @@ extract(void *p) > > > > > > archive_read_close(a); > > > archive_read_free(a); > > > + setlocale(LC_CTYPE, "C"); > > > pthread_exit((void *)0); > > > } > > > > > > > > > > Best regards, > > Stefano Babic > > > > > > -- > > ===================================================================== > > DENX Software Engineering GmbH, Managing Director: Wolfgang Denk > > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany > > Phone: +49-8142-66989-53 Fax: +49-8142-66989-80 Email: sbabic@denx.de > > =====================================================================
Hi James, On 25/12/18 21:51, James Hilliard wrote: > On Tue, Dec 25, 2018 at 6:25 AM Stefano Babic <sbabic@denx.de> wrote: >> >> Hi James, >> >> (you get a Christmas' review): >> >> On 23/12/18 05:35, james.hilliard1@gmail.com wrote: >>> From: James Hilliard <james.hilliard1@gmail.com> >>> >>> Signed-off-by: James Hilliard <james.hilliard1@gmail.com> >> >> There is not a full description of the problem in the commit message. >> >> Anyway, I am not sure if the patch is really addressing the issue. The >> cause as reported by several sources is that libarchive does not handle >> correctly the filenames. The expectation is that the issue is fixed than >> in libarchive and not in SWUpdate. > Well it seems that unless libarchive is significantly refactored one will > need to have a valid UTF-8 system locale in order for this to work. > Since it's apparently not a good idea to call setlocale in a library I don't > expect it to be fixed there any time soon. >> >> Anyway, it will help if there is a real example with filenames that >> cause the issue to happens. Can you provide such a case for this? > It appears to always happen in my builds and I don't think I had any > non-ascii filenames either. I'm using buildroot to generate everything. Then the cause could be something else - I tested this issue by installing a rootfs (built as core-image-full-cmdline in yocto, libarchive 3.3.3), and I have no issue. There are no filenames with not standard characters. If you are sure that your files have not any non-ascii chars, it should be necessary to deep investigate and check why libarchive stops. There is no reason that it does not work with ascii filenames. >> >> Any program is started with an implicit set to "C" as locale. If >> SWUpdate should able to work with locale settings, I guess it is enough >> to drop the LC_ALL set. >> >> Can you check if this small patch solve your issue ? > I had already tried that exact same thing before and it didn't work although That let me think that there is a more hidden reason. > I do agree that the docs indicate it should have, for some reason that I > haven't figured out yet swupdate fails to pick up the system locale when > doing "setlocale(LC_ALL, "")" by itself so it's required to check the > environment first(other applications such as busybox do exactly this as well). Same issue is with any application - calling setlocale(LC_ALL,"") relies that locale works on the system. In a minimal system with busybox and no locale settings, setlocale() does nothing. You can call setlocale(LC_ALL, Null) to check which locale are set. >> >> diff --git a/core/swupdate.c b/core/swupdate.c >> index c59a258..7ff8138 100644 >> --- a/core/swupdate.c >> +++ b/core/swupdate.c >> @@ -24,6 +24,7 @@ >> #include <signal.h> >> #include <sys/wait.h> >> #include <ftw.h> >> +#include <locale.h> >> >> #include "bsdqueue.h" >> #include "cpiohdr.h" >> @@ -630,6 +631,15 @@ int main(int argc, char **argv) >> */ >> notify_init(); >> >> + /* >> + * Enable system locale - change from the standard (C) to system >> locale. >> + * This allows libarchive (in case it is activated) to handle >> filenames. >> + * See on libarchive Websiete for a more complete description of >> the issue: >> + * https://github.com/libarchive/libarchive/issues/587 >> + * https://github.com/libarchive/libarchive/wiki/Filenames >> + */ >> + setlocale(LC_ALL, ""); >> + >> /* >> * Check if there is a configuration file and parse it >> * Parse once the command line just to find if a >> >> >>> --- >>> handlers/archive_handler.c | 9 +++++++++ >>> 1 file changed, 9 insertions(+) >>> >>> diff --git a/handlers/archive_handler.c b/handlers/archive_handler.c >>> index a44819d..42bbfbb 100644 >>> --- a/handlers/archive_handler.c >>> +++ b/handlers/archive_handler.c >>> @@ -6,6 +6,7 @@ >>> */ >>> >>> #include <sys/types.h> >>> +#include <locale.h> >>> #include <stdio.h> >>> #include <sys/stat.h> >>> #include <unistd.h> >>> @@ -75,6 +76,10 @@ extract(void *p) >>> struct extract_data *data = (struct extract_data *)p; >>> flags = data->flags; >>> >>> + char *LANG = getenv("LC_ALL"); >>> + if (!LANG) LANG = getenv("LC_CTYPE"); >>> + if (!LANG) LANG = getenv("LANG"); >>> + setlocale(LC_CTYPE, LANG ? LANG : ""); >>> a = archive_read_new(); >>> ext = archive_write_disk_new(); >>> archive_write_disk_set_options(ext, flags); >>> @@ -97,6 +102,7 @@ extract(void *p) >>> if ((r = archive_read_open_filename(a, FIFO, 4096))) { >>> ERROR("archive_read_open_filename(): %s %d", >>> archive_error_string(a), r); >>> + setlocale(LC_CTYPE, "C"); >> >> This does not set back if it was set before. It works for you because >> LC_TYPE *was* "C". > I didn't notice any issues when not setting it back Just because the setting is the same as when SWUpdate is started. But if it was different, you set it to "C" without knowing the previous state. That means: it works on your system, but it could not work on other systems. > but did this to reduce the > chance of bugs caused by having an unexpected locale in other parts of the > code. Doesn't swupdate always start with a "C" local initially? This is how application are commonly started if they do not call explicitly setlocale() >> >>> pthread_exit((void *)-1); >>> } >>> for (;;) { >>> @@ -106,6 +112,7 @@ extract(void *p) >>> if (r != ARCHIVE_OK) { >>> ERROR("archive_read_next_header(): %s %d", >>> archive_error_string(a), 1); >>> + setlocale(LC_CTYPE, "C"); >>> pthread_exit((void *)-1); >>> } >>> >>> @@ -122,6 +129,7 @@ extract(void *p) >>> if (r != ARCHIVE_OK) { >>> ERROR("archive_write_finish_entry(): %s", >>> archive_error_string(ext)); >>> + setlocale(LC_CTYPE, "C"); >>> pthread_exit((void *)-1); >>> } >>> } >>> @@ -130,6 +138,7 @@ extract(void *p) >>> >>> archive_read_close(a); >>> archive_read_free(a); >>> + setlocale(LC_CTYPE, "C"); >>> pthread_exit((void *)0); >>> } >>> >>> >> Best regards, Stefano Babic
Hi James, On 25/12/18 22:51, James Hilliard wrote: > On Tue, Dec 25, 2018 at 2:51 PM James Hilliard > <james.hilliard1@gmail.com> wrote: >> >> On Tue, Dec 25, 2018 at 6:25 AM Stefano Babic <sbabic@denx.de> wrote: >>> >>> Hi James, >>> >>> (you get a Christmas' review): >>> >>> On 23/12/18 05:35, james.hilliard1@gmail.com wrote: >>>> From: James Hilliard <james.hilliard1@gmail.com> >>>> >>>> Signed-off-by: James Hilliard <james.hilliard1@gmail.com> >>> >>> There is not a full description of the problem in the commit message. >>> >>> Anyway, I am not sure if the patch is really addressing the issue. The >>> cause as reported by several sources is that libarchive does not handle >>> correctly the filenames. The expectation is that the issue is fixed than >>> in libarchive and not in SWUpdate. >> Well it seems that unless libarchive is significantly refactored one will >> need to have a valid UTF-8 system locale in order for this to work. >> Since it's apparently not a good idea to call setlocale in a library I don't >> expect it to be fixed there any time soon. >>> >>> Anyway, it will help if there is a real example with filenames that >>> cause the issue to happens. Can you provide such a case for this? >> It appears to always happen in my builds and I don't think I had any >> non-ascii filenames either. I'm using buildroot to generate everything. >>> >>> Any program is started with an implicit set to "C" as locale. If >>> SWUpdate should able to work with locale settings, I guess it is enough >>> to drop the LC_ALL set. >>> >>> Can you check if this small patch solve your issue ? >> I had already tried that exact same thing before and it didn't work although >> I do agree that the docs indicate it should have, for some reason that I >> haven't figured out yet swupdate fails to pick up the system locale when >> doing "setlocale(LC_ALL, "")" by itself so it's required to check the >> environment first(other applications such as busybox do exactly this as well). > Here's the example from busybox for reference > https://github.com/mirror/busybox/blob/349d72c19ced4fae64e8fdd5792b37e78ac2f616/libbb/unicode.c#L57-L60 Each busyboc command is an appet and the initialization of the applet happens in lbb_prepare() 303 void lbb_prepare(const char *applet 304 IF_FEATURE_INDIVIDUAL(, char **argv)) 305 { 306 #ifdef __GLIBC__ 307 (*(int **)&bb_errno) = __errno_location(); 308 barrier(); 309 #endif 310 applet_name = applet; 311 312 if (ENABLE_LOCALE_SUPPORT) 313 setlocale(LC_ALL, ""); Best regards, Stefano Babic
On Thu, Dec 27, 2018 at 11:30 AM Stefano Babic <sbabic@denx.de> wrote: > > Hi James, > > On 25/12/18 21:51, James Hilliard wrote: > > On Tue, Dec 25, 2018 at 6:25 AM Stefano Babic <sbabic@denx.de> wrote: > >> > >> Hi James, > >> > >> (you get a Christmas' review): > >> > >> On 23/12/18 05:35, james.hilliard1@gmail.com wrote: > >>> From: James Hilliard <james.hilliard1@gmail.com> > >>> > >>> Signed-off-by: James Hilliard <james.hilliard1@gmail.com> > >> > >> There is not a full description of the problem in the commit message. > >> > >> Anyway, I am not sure if the patch is really addressing the issue. The > >> cause as reported by several sources is that libarchive does not handle > >> correctly the filenames. The expectation is that the issue is fixed than > >> in libarchive and not in SWUpdate. > > Well it seems that unless libarchive is significantly refactored one will > > need to have a valid UTF-8 system locale in order for this to work. > > Since it's apparently not a good idea to call setlocale in a library I don't > > expect it to be fixed there any time soon. > >> > >> Anyway, it will help if there is a real example with filenames that > >> cause the issue to happens. Can you provide such a case for this? > > It appears to always happen in my builds and I don't think I had any > > non-ascii filenames either. I'm using buildroot to generate everything. > > Then the cause could be something else - I tested this issue by > installing a rootfs (built as core-image-full-cmdline in yocto, > libarchive 3.3.3), and I have no issue. There are no filenames with not > standard characters. Does yocto provide a C.UTF-8 locale? Buildroot does not provide that so I have to use en_US.UTF-8. From my understanding if a system provides a C.UTF-8 locale setlocale is not required. > > If you are sure that your files have not any non-ascii chars, it should > be necessary to deep investigate and check why libarchive stops. There > is no reason that it does not work with ascii filenames. > > >> > >> Any program is started with an implicit set to "C" as locale. If > >> SWUpdate should able to work with locale settings, I guess it is enough > >> to drop the LC_ALL set. > >> > >> Can you check if this small patch solve your issue ? > > I had already tried that exact same thing before and it didn't work although > > That let me think that there is a more hidden reason. > > > I do agree that the docs indicate it should have, for some reason that I > > haven't figured out yet swupdate fails to pick up the system locale when > > doing "setlocale(LC_ALL, "")" by itself so it's required to check the > > environment first(other applications such as busybox do exactly this as well). > > Same issue is with any application - calling setlocale(LC_ALL,"") relies > that locale works on the system. > > In a minimal system with busybox and no locale settings, setlocale() > does nothing. You can call setlocale(LC_ALL, Null) to check which locale > are set. I was debugging using nl_langinfo(CODESET) to verify if the locale was being set correctly to a UTF-8 locale, using getenv seemed to be required. > > >> > >> diff --git a/core/swupdate.c b/core/swupdate.c > >> index c59a258..7ff8138 100644 > >> --- a/core/swupdate.c > >> +++ b/core/swupdate.c > >> @@ -24,6 +24,7 @@ > >> #include <signal.h> > >> #include <sys/wait.h> > >> #include <ftw.h> > >> +#include <locale.h> > >> > >> #include "bsdqueue.h" > >> #include "cpiohdr.h" > >> @@ -630,6 +631,15 @@ int main(int argc, char **argv) > >> */ > >> notify_init(); > >> > >> + /* > >> + * Enable system locale - change from the standard (C) to system > >> locale. > >> + * This allows libarchive (in case it is activated) to handle > >> filenames. > >> + * See on libarchive Websiete for a more complete description of > >> the issue: > >> + * https://github.com/libarchive/libarchive/issues/587 > >> + * https://github.com/libarchive/libarchive/wiki/Filenames > >> + */ > >> + setlocale(LC_ALL, ""); > >> + > >> /* > >> * Check if there is a configuration file and parse it > >> * Parse once the command line just to find if a > >> > >> > >>> --- > >>> handlers/archive_handler.c | 9 +++++++++ > >>> 1 file changed, 9 insertions(+) > >>> > >>> diff --git a/handlers/archive_handler.c b/handlers/archive_handler.c > >>> index a44819d..42bbfbb 100644 > >>> --- a/handlers/archive_handler.c > >>> +++ b/handlers/archive_handler.c > >>> @@ -6,6 +6,7 @@ > >>> */ > >>> > >>> #include <sys/types.h> > >>> +#include <locale.h> > >>> #include <stdio.h> > >>> #include <sys/stat.h> > >>> #include <unistd.h> > >>> @@ -75,6 +76,10 @@ extract(void *p) > >>> struct extract_data *data = (struct extract_data *)p; > >>> flags = data->flags; > >>> > >>> + char *LANG = getenv("LC_ALL"); > >>> + if (!LANG) LANG = getenv("LC_CTYPE"); > >>> + if (!LANG) LANG = getenv("LANG"); > >>> + setlocale(LC_CTYPE, LANG ? LANG : ""); > >>> a = archive_read_new(); > >>> ext = archive_write_disk_new(); > >>> archive_write_disk_set_options(ext, flags); > >>> @@ -97,6 +102,7 @@ extract(void *p) > >>> if ((r = archive_read_open_filename(a, FIFO, 4096))) { > >>> ERROR("archive_read_open_filename(): %s %d", > >>> archive_error_string(a), r); > >>> + setlocale(LC_CTYPE, "C"); > >> > >> This does not set back if it was set before. It works for you because > >> LC_TYPE *was* "C". > > I didn't notice any issues when not setting it back > > Just because the setting is the same as when SWUpdate is started. But if > it was different, you set it to "C" without knowing the previous state. > That means: it works on your system, but it could not work on other systems. > > > but did this to reduce the > > chance of bugs caused by having an unexpected locale in other parts of the > > code. Doesn't swupdate always start with a "C" local initially? > > This is how application are commonly started if they do not call > explicitly setlocale() > > >> > >>> pthread_exit((void *)-1); > >>> } > >>> for (;;) { > >>> @@ -106,6 +112,7 @@ extract(void *p) > >>> if (r != ARCHIVE_OK) { > >>> ERROR("archive_read_next_header(): %s %d", > >>> archive_error_string(a), 1); > >>> + setlocale(LC_CTYPE, "C"); > >>> pthread_exit((void *)-1); > >>> } > >>> > >>> @@ -122,6 +129,7 @@ extract(void *p) > >>> if (r != ARCHIVE_OK) { > >>> ERROR("archive_write_finish_entry(): %s", > >>> archive_error_string(ext)); > >>> + setlocale(LC_CTYPE, "C"); > >>> pthread_exit((void *)-1); > >>> } > >>> } > >>> @@ -130,6 +138,7 @@ extract(void *p) > >>> > >>> archive_read_close(a); > >>> archive_read_free(a); > >>> + setlocale(LC_CTYPE, "C"); > >>> pthread_exit((void *)0); > >>> } > >>> > >>> > >> > > Best regards, > Stefano Babic > > -- > ===================================================================== > DENX Software Engineering GmbH, Managing Director: Wolfgang Denk > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany > Phone: +49-8142-66989-53 Fax: +49-8142-66989-80 Email: sbabic@denx.de > =====================================================================
On Thu, Dec 27, 2018 at 11:32 AM Stefano Babic <sbabic@denx.de> wrote: > > Hi James, > > On 25/12/18 22:51, James Hilliard wrote: > > On Tue, Dec 25, 2018 at 2:51 PM James Hilliard > > <james.hilliard1@gmail.com> wrote: > >> > >> On Tue, Dec 25, 2018 at 6:25 AM Stefano Babic <sbabic@denx.de> wrote: > >>> > >>> Hi James, > >>> > >>> (you get a Christmas' review): > >>> > >>> On 23/12/18 05:35, james.hilliard1@gmail.com wrote: > >>>> From: James Hilliard <james.hilliard1@gmail.com> > >>>> > >>>> Signed-off-by: James Hilliard <james.hilliard1@gmail.com> > >>> > >>> There is not a full description of the problem in the commit message. > >>> > >>> Anyway, I am not sure if the patch is really addressing the issue. The > >>> cause as reported by several sources is that libarchive does not handle > >>> correctly the filenames. The expectation is that the issue is fixed than > >>> in libarchive and not in SWUpdate. > >> Well it seems that unless libarchive is significantly refactored one will > >> need to have a valid UTF-8 system locale in order for this to work. > >> Since it's apparently not a good idea to call setlocale in a library I don't > >> expect it to be fixed there any time soon. > >>> > >>> Anyway, it will help if there is a real example with filenames that > >>> cause the issue to happens. Can you provide such a case for this? > >> It appears to always happen in my builds and I don't think I had any > >> non-ascii filenames either. I'm using buildroot to generate everything. > >>> > >>> Any program is started with an implicit set to "C" as locale. If > >>> SWUpdate should able to work with locale settings, I guess it is enough > >>> to drop the LC_ALL set. > >>> > >>> Can you check if this small patch solve your issue ? > >> I had already tried that exact same thing before and it didn't work although > >> I do agree that the docs indicate it should have, for some reason that I > >> haven't figured out yet swupdate fails to pick up the system locale when > >> doing "setlocale(LC_ALL, "")" by itself so it's required to check the > >> environment first(other applications such as busybox do exactly this as well). > > Here's the example from busybox for reference > > https://github.com/mirror/busybox/blob/349d72c19ced4fae64e8fdd5792b37e78ac2f616/libbb/unicode.c#L57-L60 > > > Each busyboc command is an appet and the initialization of the applet > happens in lbb_prepare() > > > 303 void lbb_prepare(const char *applet > 304 IF_FEATURE_INDIVIDUAL(, char **argv)) > 305 { > 306 #ifdef __GLIBC__ > 307 (*(int **)&bb_errno) = __errno_location(); > 308 barrier(); > 309 #endif > 310 applet_name = applet; > 311 > 312 if (ENABLE_LOCALE_SUPPORT) > 313 setlocale(LC_ALL, ""); That seems to be only if busybox has locale support enabled, I tried both ways when I was testing and it didn't work either way for me without pulling locale using getenv. > > Best regards, > Stefano Babic > > -- > ===================================================================== > DENX Software Engineering GmbH, Managing Director: Wolfgang Denk > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany > Phone: +49-8142-66989-53 Fax: +49-8142-66989-80 Email: sbabic@denx.de > =====================================================================
On Tuesday, December 25, 2018 at 6:25:29 AM UTC-6, Stefano Babic wrote: > Hi James, > > (you get a Christmas' review): > > On 23/12/18 05:35, james.hilliard1@gmail.com wrote: > > From: James Hilliard <james.hilliard1@gmail.com> > > > > Signed-off-by: James Hilliard <james.hilliard1@gmail.com> > > There is not a full description of the problem in the commit message. > > Anyway, I am not sure if the patch is really addressing the issue. The > cause as reported by several sources is that libarchive does not handle > correctly the filenames. The expectation is that the issue is fixed than > in libarchive and not in SWUpdate. > > Anyway, it will help if there is a real example with filenames that > cause the issue to happens. Can you provide such a case for this? > > Any program is started with an implicit set to "C" as locale. If > SWUpdate should able to work with locale settings, I guess it is enough > to drop the LC_ALL set. > > Can you check if this small patch solve your issue ? > > diff --git a/core/swupdate.c b/core/swupdate.c > index c59a258..7ff8138 100644 > --- a/core/swupdate.c > +++ b/core/swupdate.c > @@ -24,6 +24,7 @@ > #include <signal.h> > #include <sys/wait.h> > #include <ftw.h> > +#include <locale.h> > > #include "bsdqueue.h" > #include "cpiohdr.h" > @@ -630,6 +631,15 @@ int main(int argc, char **argv) > */ > notify_init(); > > + /* > + * Enable system locale - change from the standard (C) to system > locale. > + * This allows libarchive (in case it is activated) to handle > filenames. > + * See on libarchive Websiete for a more complete description of > the issue: > + * https://github.com/libarchive/libarchive/issues/587 > + * https://github.com/libarchive/libarchive/wiki/Filenames > + */ > + setlocale(LC_ALL, ""); I did some further testing and determined that LC_CTYPE works correctly while LC_ALL does not on my system. From my reading of the libarchive code only LC_CTYPE should need to be set. I'm not entirely sure why LC_ALL does not work but it shouldn't be needed regardless. > + > /* > * Check if there is a configuration file and parse it > * Parse once the command line just to find if a > > > > --- > > handlers/archive_handler.c | 9 +++++++++ > > 1 file changed, 9 insertions(+) > > > > diff --git a/handlers/archive_handler.c b/handlers/archive_handler.c > > index a44819d..42bbfbb 100644 > > --- a/handlers/archive_handler.c > > +++ b/handlers/archive_handler.c > > @@ -6,6 +6,7 @@ > > */ > > > > #include <sys/types.h> > > +#include <locale.h> > > #include <stdio.h> > > #include <sys/stat.h> > > #include <unistd.h> > > @@ -75,6 +76,10 @@ extract(void *p) > > struct extract_data *data = (struct extract_data *)p; > > flags = data->flags; > > > > + char *LANG = getenv("LC_ALL"); > > + if (!LANG) LANG = getenv("LC_CTYPE"); > > + if (!LANG) LANG = getenv("LANG"); > > + setlocale(LC_CTYPE, LANG ? LANG : ""); > > a = archive_read_new(); > > ext = archive_write_disk_new(); > > archive_write_disk_set_options(ext, flags); > > @@ -97,6 +102,7 @@ extract(void *p) > > if ((r = archive_read_open_filename(a, FIFO, 4096))) { > > ERROR("archive_read_open_filename(): %s %d", > > archive_error_string(a), r); > > + setlocale(LC_CTYPE, "C"); > > This does not set back if it was set before. It works for you because > LC_TYPE *was* "C". > > > pthread_exit((void *)-1); > > } > > for (;;) { > > @@ -106,6 +112,7 @@ extract(void *p) > > if (r != ARCHIVE_OK) { > > ERROR("archive_read_next_header(): %s %d", > > archive_error_string(a), 1); > > + setlocale(LC_CTYPE, "C"); > > pthread_exit((void *)-1); > > } > > > > @@ -122,6 +129,7 @@ extract(void *p) > > if (r != ARCHIVE_OK) { > > ERROR("archive_write_finish_entry(): %s", > > archive_error_string(ext)); > > + setlocale(LC_CTYPE, "C"); > > pthread_exit((void *)-1); > > } > > } > > @@ -130,6 +138,7 @@ extract(void *p) > > > > archive_read_close(a); > > archive_read_free(a); > > + setlocale(LC_CTYPE, "C"); > > pthread_exit((void *)0); > > } > > > > > > Best regards, > Stefano Babic > > > -- > ===================================================================== > DENX Software Engineering GmbH, Managing Director: Wolfgang Denk > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany > Phone: +49-8142-66989-53 Fax: +49-8142-66989-80 Email: sbabic@denx.de > =====================================================================
On Thursday, December 27, 2018 at 11:30:08 AM UTC-6, Stefano Babic wrote: > Hi James, > > On 25/12/18 21:51, James Hilliard wrote: > > On Tue, Dec 25, 2018 at 6:25 AM Stefano Babic <sbabic@denx.de> wrote: > >> > >> Hi James, > >> > >> (you get a Christmas' review): > >> > >> On 23/12/18 05:35, james.hilliard1@gmail.com wrote: > >>> From: James Hilliard <james.hilliard1@gmail.com> > >>> > >>> Signed-off-by: James Hilliard <james.hilliard1@gmail.com> > >> > >> There is not a full description of the problem in the commit message. > >> > >> Anyway, I am not sure if the patch is really addressing the issue. The > >> cause as reported by several sources is that libarchive does not handle > >> correctly the filenames. The expectation is that the issue is fixed than > >> in libarchive and not in SWUpdate. > > Well it seems that unless libarchive is significantly refactored one will > > need to have a valid UTF-8 system locale in order for this to work. > > Since it's apparently not a good idea to call setlocale in a library I don't > > expect it to be fixed there any time soon. > >> > >> Anyway, it will help if there is a real example with filenames that > >> cause the issue to happens. Can you provide such a case for this? > > It appears to always happen in my builds and I don't think I had any > > non-ascii filenames either. I'm using buildroot to generate everything. > > Then the cause could be something else - I tested this issue by > installing a rootfs (built as core-image-full-cmdline in yocto, > libarchive 3.3.3), and I have no issue. There are no filenames with not > standard characters. > > If you are sure that your files have not any non-ascii chars, it should > be necessary to deep investigate and check why libarchive stops. There > is no reason that it does not work with ascii filenames. > > >> > >> Any program is started with an implicit set to "C" as locale. If > >> SWUpdate should able to work with locale settings, I guess it is enough > >> to drop the LC_ALL set. > >> > >> Can you check if this small patch solve your issue ? > > I had already tried that exact same thing before and it didn't work although > > That let me think that there is a more hidden reason. > > > I do agree that the docs indicate it should have, for some reason that I > > haven't figured out yet swupdate fails to pick up the system locale when > > doing "setlocale(LC_ALL, "")" by itself so it's required to check the > > environment first(other applications such as busybox do exactly this as well). > > Same issue is with any application - calling setlocale(LC_ALL,"") relies > that locale works on the system. > > In a minimal system with busybox and no locale settings, setlocale() > does nothing. You can call setlocale(LC_ALL, Null) to check which locale > are set. > > >> > >> diff --git a/core/swupdate.c b/core/swupdate.c > >> index c59a258..7ff8138 100644 > >> --- a/core/swupdate.c > >> +++ b/core/swupdate.c > >> @@ -24,6 +24,7 @@ > >> #include <signal.h> > >> #include <sys/wait.h> > >> #include <ftw.h> > >> +#include <locale.h> > >> > >> #include "bsdqueue.h" > >> #include "cpiohdr.h" > >> @@ -630,6 +631,15 @@ int main(int argc, char **argv) > >> */ > >> notify_init(); > >> > >> + /* > >> + * Enable system locale - change from the standard (C) to system > >> locale. > >> + * This allows libarchive (in case it is activated) to handle > >> filenames. > >> + * See on libarchive Websiete for a more complete description of > >> the issue: > >> + * https://github.com/libarchive/libarchive/issues/587 > >> + * https://github.com/libarchive/libarchive/wiki/Filenames > >> + */ > >> + setlocale(LC_ALL, ""); > >> + > >> /* > >> * Check if there is a configuration file and parse it > >> * Parse once the command line just to find if a > >> > >> > >>> --- > >>> handlers/archive_handler.c | 9 +++++++++ > >>> 1 file changed, 9 insertions(+) > >>> > >>> diff --git a/handlers/archive_handler.c b/handlers/archive_handler.c > >>> index a44819d..42bbfbb 100644 > >>> --- a/handlers/archive_handler.c > >>> +++ b/handlers/archive_handler.c > >>> @@ -6,6 +6,7 @@ > >>> */ > >>> > >>> #include <sys/types.h> > >>> +#include <locale.h> > >>> #include <stdio.h> > >>> #include <sys/stat.h> > >>> #include <unistd.h> > >>> @@ -75,6 +76,10 @@ extract(void *p) > >>> struct extract_data *data = (struct extract_data *)p; > >>> flags = data->flags; > >>> > >>> + char *LANG = getenv("LC_ALL"); > >>> + if (!LANG) LANG = getenv("LC_CTYPE"); > >>> + if (!LANG) LANG = getenv("LANG"); > >>> + setlocale(LC_CTYPE, LANG ? LANG : ""); > >>> a = archive_read_new(); > >>> ext = archive_write_disk_new(); > >>> archive_write_disk_set_options(ext, flags); > >>> @@ -97,6 +102,7 @@ extract(void *p) > >>> if ((r = archive_read_open_filename(a, FIFO, 4096))) { > >>> ERROR("archive_read_open_filename(): %s %d", > >>> archive_error_string(a), r); > >>> + setlocale(LC_CTYPE, "C"); > >> > >> This does not set back if it was set before. It works for you because > >> LC_TYPE *was* "C". > > I didn't notice any issues when not setting it back > > Just because the setting is the same as when SWUpdate is started. But if > it was different, you set it to "C" without knowing the previous state. > That means: it works on your system, but it could not work on other systems. I've fixed this in my v2 patch, it now should save the previous locale and restore it once extraction is complete. I also changed it to use the thread local uselocale as opposed to setlocale(which is global to the entire process), that should further reduce the chance of unintended side effects by isolating the locale to only the libarchive extraction thread. > > > but did this to reduce the > > chance of bugs caused by having an unexpected locale in other parts of the > > code. Doesn't swupdate always start with a "C" local initially? > > This is how application are commonly started if they do not call > explicitly setlocale() > > >> > >>> pthread_exit((void *)-1); > >>> } > >>> for (;;) { > >>> @@ -106,6 +112,7 @@ extract(void *p) > >>> if (r != ARCHIVE_OK) { > >>> ERROR("archive_read_next_header(): %s %d", > >>> archive_error_string(a), 1); > >>> + setlocale(LC_CTYPE, "C"); > >>> pthread_exit((void *)-1); > >>> } > >>> > >>> @@ -122,6 +129,7 @@ extract(void *p) > >>> if (r != ARCHIVE_OK) { > >>> ERROR("archive_write_finish_entry(): %s", > >>> archive_error_string(ext)); > >>> + setlocale(LC_CTYPE, "C"); > >>> pthread_exit((void *)-1); > >>> } > >>> } > >>> @@ -130,6 +138,7 @@ extract(void *p) > >>> > >>> archive_read_close(a); > >>> archive_read_free(a); > >>> + setlocale(LC_CTYPE, "C"); > >>> pthread_exit((void *)0); > >>> } > >>> > >>> > >> > > Best regards, > Stefano Babic > > -- > ===================================================================== > DENX Software Engineering GmbH, Managing Director: Wolfgang Denk > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany > Phone: +49-8142-66989-53 Fax: +49-8142-66989-80 Email: sbabic@denx.de > =====================================================================
diff --git a/handlers/archive_handler.c b/handlers/archive_handler.c index a44819d..42bbfbb 100644 --- a/handlers/archive_handler.c +++ b/handlers/archive_handler.c @@ -6,6 +6,7 @@ */ #include <sys/types.h> +#include <locale.h> #include <stdio.h> #include <sys/stat.h> #include <unistd.h> @@ -75,6 +76,10 @@ extract(void *p) struct extract_data *data = (struct extract_data *)p; flags = data->flags; + char *LANG = getenv("LC_ALL"); + if (!LANG) LANG = getenv("LC_CTYPE"); + if (!LANG) LANG = getenv("LANG"); + setlocale(LC_CTYPE, LANG ? LANG : ""); a = archive_read_new(); ext = archive_write_disk_new(); archive_write_disk_set_options(ext, flags); @@ -97,6 +102,7 @@ extract(void *p) if ((r = archive_read_open_filename(a, FIFO, 4096))) { ERROR("archive_read_open_filename(): %s %d", archive_error_string(a), r); + setlocale(LC_CTYPE, "C"); pthread_exit((void *)-1); } for (;;) { @@ -106,6 +112,7 @@ extract(void *p) if (r != ARCHIVE_OK) { ERROR("archive_read_next_header(): %s %d", archive_error_string(a), 1); + setlocale(LC_CTYPE, "C"); pthread_exit((void *)-1); } @@ -122,6 +129,7 @@ extract(void *p) if (r != ARCHIVE_OK) { ERROR("archive_write_finish_entry(): %s", archive_error_string(ext)); + setlocale(LC_CTYPE, "C"); pthread_exit((void *)-1); } } @@ -130,6 +138,7 @@ extract(void *p) archive_read_close(a); archive_read_free(a); + setlocale(LC_CTYPE, "C"); pthread_exit((void *)0); }