Message ID | 20221115005623.3774099-1-bruno@clisp.org |
---|---|
State | New |
Series | intl: Treat C.UTF-8 locale like C locale (BZ# 16621) |
* Bruno Haible:

> The wiki page https://sourceware.org/glibc/wiki/Proposals/C.UTF-8
> says that "Setting LC_ALL=C.UTF-8 will ignore LANGUAGE just like it
> does with LC_ALL=C." This patch implements it.
>
>     * intl/dcigettext.c (guess_category_value): Treat C.<encoding> locale
>     like the C locale.
> ---
>  intl/dcigettext.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/intl/dcigettext.c b/intl/dcigettext.c
> index 1fc074a414..6a3c248e68 100644
> --- a/intl/dcigettext.c
> +++ b/intl/dcigettext.c
> @@ -1564,8 +1564,12 @@ guess_category_value (int category, const char *categoryname)
>       2. The precise output of some programs in the "C" locale is specified
>          by POSIX and should not depend on environment variables like
>          "LANGUAGE" or system-dependent information.  We allow such programs
> -        to use gettext().  */
> -  if (strcmp (locale, "C") == 0)
> +        to use gettext().
> +     Ignore LANGUAGE and its system-dependent analogon also if the locale is
> +     set to "C.UTF-8" or, more generally, to "C.<encoding>", because that's
> +     the by-design behaviour for glibc, see
> +     <https://sourceware.org/glibc/wiki/Proposals/C.UTF-8>.  */
> +  if (locale[0] == 'C' && (locale[1] == '\0' || locale[1] == '.'))
>      return locale;
>
>    /* The highest priority value is the value of the 'LANGUAGE' environment

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Fix pushed.  I've posted my test case as well:

  [PATCH] intl: Add test case for bug 16621
  <https://inbox.sourceware.org/libc-alpha/87o7iiukpt.fsf@oldenburg3.str.redhat.com/T/#u>

Thanks,
Florian

Florian Weimer wrote:
> >     * intl/dcigettext.c (guess_category_value): Treat C.<encoding> locale
> >     like the C locale.
>
> Reviewed-by: Florian Weimer <fweimer@redhat.com>
>
> Fix pushed.

Thanks!

> I've posted my test case as well:
>
>   [PATCH] intl: Add test case for bug 16621
>   <https://inbox.sourceware.org/libc-alpha/87o7iiukpt.fsf@oldenburg3.str.redhat.com/T/#u>

Now that the main patch is in glibc, I added it also to GNU gettext,
together with a unit test.

My unit test [1][2] happens to be stricter than what I had manually tested
in Dec. 2022: It adds a .mo file at <LOCALEDIR>/C/LC_MESSAGES/<domain>.mo .
And the test fails.

A second patch is needed, basically the same change at a different place
in dcigettext.c. I'm posting it separately.

Bruno

[1] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=blob;f=gettext-tools/tests/intl-0;h=9977cfe2e5d645c3a20fbfe891974720aacb488d;hb=HEAD
[2] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=blob;f=gettext-tools/tests/intl-1-prg.c;h=cda076140b4d60d2a9535d4fa1d769f26c580c20;hb=HEAD

diff --git a/intl/dcigettext.c b/intl/dcigettext.c
index 1fc074a414..6a3c248e68 100644
--- a/intl/dcigettext.c
+++ b/intl/dcigettext.c
@@ -1564,8 +1564,12 @@ guess_category_value (int category, const char *categoryname)
      2. The precise output of some programs in the "C" locale is specified
         by POSIX and should not depend on environment variables like
         "LANGUAGE" or system-dependent information.  We allow such programs
-        to use gettext().  */
-  if (strcmp (locale, "C") == 0)
+        to use gettext().
+     Ignore LANGUAGE and its system-dependent analogon also if the locale is
+     set to "C.UTF-8" or, more generally, to "C.<encoding>", because that's
+     the by-design behaviour for glibc, see
+     <https://sourceware.org/glibc/wiki/Proposals/C.UTF-8>.  */
+  if (locale[0] == 'C' && (locale[1] == '\0' || locale[1] == '.'))
     return locale;
 
   /* The highest priority value is the value of the 'LANGUAGE' environment
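
For readers who want to try the new condition in isolation, here is a
minimal standalone sketch. The names is_c_like_locale and
check_c_like_locale are hypothetical and do not exist in glibc or GNU
gettext; the sketch only copies the expression from the patch into a
function and checks a few locale names. It does not reproduce the rest of
guess_category_value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of glibc: same expression as the patched
   condition in guess_category_value.  True for "C" and for any name of
   the form "C.<encoding>".  */
bool
is_c_like_locale (const char *locale)
{
  return locale[0] == 'C' && (locale[1] == '\0' || locale[1] == '.');
}

/* Hypothetical self-check covering the cases discussed in this thread.  */
void
check_c_like_locale (void)
{
  /* Matched both before and after the patch.  */
  assert (is_c_like_locale ("C"));

  /* Newly matched: "C.<encoding>".  */
  assert (is_c_like_locale ("C.UTF-8"));
  assert (is_c_like_locale ("C.ISO-8859-1"));

  /* Not matched: other names that merely start with 'C', ordinary
     locale names, and the empty string.  */
  assert (!is_c_like_locale ("Cx"));
  assert (!is_c_like_locale ("de_DE.UTF-8"));
  assert (!is_c_like_locale (""));
}
```

Unlike the old strcmp (locale, "C") == 0 test, the expression looks only at
the first two characters, so it accepts every "C.<encoding>" name without
listing encodings. The && short-circuits, so locale[1] is never read when
the string is empty.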