Message ID | 20240716120436.2135312-1-jwakely@redhat.com |
---|---|
State | New |
Series | [v3,1/6] libstdc++: Handle encodings in localized chrono formatting [PR109162] |
On Tue, 16 Jul 2024 at 13:05, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> On Fri, 12 Jul 2024 at 00:23, I wrote:
> >
> > I sent v1 of this patch in February, and it added the new symbols to
> > libstdc++exp.a which meant users needed to use -lstdc++exp to format
> > chrono types in C++23 mode. That was less than ideal.
> >
> > This v2 patch adds the new symbols to the main library, which means no
> > extra step to get the new features, and we can enable them as a DR for
> > C++20 mode. But that means we need new exports in the shared library,
> > and so need to be more confident that the feature is stable and ready to
> > go into the lib.
> >
> > I'm not 100% confident that we want to add a new, private facet to the
> > std::locale, but it seems reasonable. And that's not exposed to users at
> > all, as the two new symbols added to the library hide the creation and
> > use of that facet.
>
> Here's v3, which fixes a missing export of the __sso_string constructors
> and destructors, needed so that the old ABI can use the new function to
> transcode a locale-specific string to UTF-8, with a std::string buffer.
>
> I haven't done so here, but we could keep a least-recently-used cache of
> __encoding facets, so that repeatedly calling std::format with the same
> locale doesn't need to keep re-checking the locale's encoding and then
> re-opening the same iconv_t descriptor.
>
> This v3 patch also tweaks the commented-out parts of
> include/bits/version.def in preparation for enabling the C++26 <format>
> features in the following patches in this series.
>
> Tested x86_64-linux. I think this is ready to push now, but I'll wait a
> bit for any comments on it.
>
> -- >8 --
>
> This implements the C++23 paper P2419R2 (Clarify handling of encodings
> in localized formatting of chrono types). The requirement is that when
> the literal encoding is "a Unicode encoding form" and the formatting
> locale uses a different encoding, any locale-specific strings such as
> "août" for std::chrono::August should be converted to the literal
> encoding.
>
> Using the recently-added std::locale::encoding() function we can check
> the locale's encoding and then use iconv if a conversion is needed.
> Because nl_langinfo_l and iconv_open both allocate memory, a naive
> implementation would perform multiple allocations and deallocations for
> every snippet of locale-specific text that needs to be converted to
> UTF-8. To avoid that, a new internal locale::facet is defined to store
> the text_encoding and an iconv_t descriptor, which are then cached in
> the formatting locale. This requires access to the internals of a
> std::locale object in src/c++20/format.cc, so that new file needs to be
> compiled with -fno-access-control, as well as -std=gnu++26 in order to
> use std::text_encoding.
>
> Because the new std::text_encoding and std::locale::encoding() symbols
> are only in the libstdc++exp.a archive, we need to include
> src/c++26/text_encoding.cc in the main library, but not export its
> symbols yet. This means they can be used by the two new functions which
> are exported from the main library.
>
> The encoding conversions are done for C++20, treating it as a DR that
> resolves LWG 3565.
>
> With this change we can increase the value of the __cpp_lib_format macro
> for C++23. The value should be 202207 for P2419R2, but we already
> implement P2510R3 (Formatting pointers) so can use the value 202304.
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/109162
>         * acinclude.m4 (libtool_VERSION): Update to 6:34:0.
>         * config/abi/pre/gnu.ver: Disambiguate old patterns. Add new
>         GLIBCXX_3.4.34 symbol version and new exports.
>         * configure: Regenerate.
>         * include/bits/chrono_io.h (_ChronoSpec::_M_locale_specific):
>         Add new accessor functions to use a reserved bit in _Spec.
>         (__formatter_chrono::_M_parse): Use _M_locale_specific(true)
>         when chrono-specs contains locale-dependent conversion
>         specifiers.
>         (__formatter_chrono::_M_format): Open iconv descriptor if
>         conversion to UTF-8 will be needed.
>         (__formatter_chrono::_M_write): New function to write a
>         localized string with possible character conversion.
>         (__formatter_chrono::_M_a_A, __formatter_chrono::_M_b_B)
>         (__formatter_chrono::_M_p, __formatter_chrono::_M_r)
>         (__formatter_chrono::_M_x, __formatter_chrono::_M_X)
>         (__formatter_chrono::_M_locale_fmt): Use _M_write.
>         * include/bits/version.def (format): Update value.
>         * include/bits/version.h: Regenerate.
>         * include/std/format (_GLIBCXX_P2518R3): Check feature test
>         macro instead of __cplusplus.
>         (basic_format_context): Declare __formatter_chrono as friend.
>         * src/c++20/Makefile.am: Add new file.
>         * src/c++20/Makefile.in: Regenerate.
>         * src/c++20/format.cc: New file.
>         * testsuite/std/time/format_localized.cc: New test.
>         * testsuite/util/testsuite_abi.cc: Add new symbol version.
> --- > libstdc++-v3/acinclude.m4 | 2 +- > libstdc++-v3/config/abi/pre/gnu.ver | 18 +- > libstdc++-v3/configure | 2 +- > libstdc++-v3/include/bits/chrono_io.h | 96 ++++++++-- > libstdc++-v3/include/bits/version.def | 29 ++- > libstdc++-v3/include/bits/version.h | 4 +- > libstdc++-v3/include/std/format | 16 +- > libstdc++-v3/src/c++20/Makefile.am | 8 +- > libstdc++-v3/src/c++20/Makefile.in | 10 +- > libstdc++-v3/src/c++20/format.cc | 174 ++++++++++++++++++ > .../testsuite/std/time/format_localized.cc | 47 +++++ > libstdc++-v3/testsuite/util/testsuite_abi.cc | 1 + > 12 files changed, 378 insertions(+), 29 deletions(-) > create mode 100644 libstdc++-v3/src/c++20/format.cc > create mode 100644 libstdc++-v3/testsuite/std/time/format_localized.cc > > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 > index e04aae25360..e4ed583b3ae 100644 > --- a/libstdc++-v3/acinclude.m4 > +++ b/libstdc++-v3/acinclude.m4 > @@ -4230,7 +4230,7 @@ changequote([,])dnl > fi > > # For libtool versioning info, format is CURRENT:REVISION:AGE > -libtool_VERSION=6:33:0 > +libtool_VERSION=6:34:0 > > # Everything parsed; figure out what files and settings to use. 
> case $enable_symvers in > diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver > index 31449b5b87b..ae79b371d80 100644 > --- a/libstdc++-v3/config/abi/pre/gnu.ver > +++ b/libstdc++-v3/config/abi/pre/gnu.ver > @@ -109,7 +109,11 @@ GLIBCXX_3.4 { > std::[j-k]*; > # std::length_error::l*; > # std::length_error::~l*; > - std::locale::[A-Za-e]*; > + # std::locale::[A-Za-d]*; > + std::locale::all; > + std::locale::classic*; > + std::locale::collate; > + std::locale::ctype; > std::locale::facet::[A-Za-z]*; > std::locale::facet::_S_get_c_locale*; > std::locale::facet::_S_clone_c_locale*; > @@ -168,7 +172,7 @@ GLIBCXX_3.4 { > std::strstream*; > std::strstreambuf*; > # std::t[a-q]*; > - std::t[a-g]*; > + std::terminate*; > std::th[a-h]*; > std::th[j-q]*; > std::th[s-z]*; > @@ -2528,6 +2532,16 @@ GLIBCXX_3.4.33 { > _ZNKSt12__basic_fileIcE13native_handleEv; > } GLIBCXX_3.4.32; > > +# GCC 15.1.0 > +GLIBCXX_3.4.34 { > + # std::__format::__with_encoding_conversion > + _ZNSt8__format26__with_encoding_conversionERKSt6locale; > + # std::__format::__locale_encoding_to_utf8 > + _ZNSt8__format25__locale_encoding_to_utf8ERKSt6localeSt17basic_string_viewIcSt11char_traitsIcEEPv; > + # __sso_string constructor and destructor > + _ZNSt12__sso_string[CD][12]Ev; > +} GLIBCXX_3.4.33; > + > # Symbols in the support library (libsupc++) have their own tag. > CXXABI_1.3 { > > diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure > index 5645e991af7..fe525308ae2 100755 > --- a/libstdc++-v3/configure > +++ b/libstdc++-v3/configure > @@ -51040,7 +51040,7 @@ $as_echo "$as_me: WARNING: === Symbol versioning will be disabled." >&2;} > fi > > # For libtool versioning info, format is CURRENT:REVISION:AGE > -libtool_VERSION=6:33:0 > +libtool_VERSION=6:34:0 > > # Everything parsed; figure out what files and settings to use. 
> case $enable_symvers in > diff --git a/libstdc++-v3/include/bits/chrono_io.h b/libstdc++-v3/include/bits/chrono_io.h > index 72c66a0fef0..2f3ba89de61 100644 > --- a/libstdc++-v3/include/bits/chrono_io.h > +++ b/libstdc++-v3/include/bits/chrono_io.h > @@ -38,8 +38,10 @@ > #include <iomanip> // setw, setfill > #include <format> > #include <charconv> // from_chars > +#include <stdexcept> // __sso_string > > #include <bits/streambuf_iterator.h> > +#include <bits/unique_ptr.h> > > namespace std _GLIBCXX_VISIBILITY(default) > { > @@ -211,6 +213,20 @@ namespace __format > struct _ChronoSpec : _Spec<_CharT> > { > basic_string_view<_CharT> _M_chrono_specs; > + > + // Use one of the reserved bits in __format::_Spec<C>. > + // This indicates that a locale-dependent conversion specifier such as > + // %a is used in the chrono-specs. This is not the same as the > + // _Spec<C>::_M_localized member which indicates that "L" was present > + // in the format-spec, e.g. "{:L%a}" is localized and locale-specific, > + // but "{:L}" is only localized and "{:%a}" is only locale-specific. > + constexpr bool > + _M_locale_specific() const noexcept > + { return this->_M_reserved; } > + > + constexpr void > + _M_locale_specific(bool __b) noexcept > + { this->_M_reserved = __b; } > }; > > // Represents the information provided by a chrono type. 
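
Editorial note on the hunk above: the difference between "localized" (the L option) and "locale-specific" (a conversion specifier such as %a) may be easier to see in a small standalone sketch. The two helpers below are hypothetical and are not libstdc++ functions. They assume the format-spec has no fill, alignment, width or precision, and they only mirror the set of specifiers that the patch flags in _M_parse (%a %A %b %B %h %c %p %r %x %X, plus any E or O modifier except on %z).

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of libstdc++: true if the format-spec
// (the text between ':' and '}') starts with the L option.
// Assumption: no fill, alignment, width or precision precedes it.
constexpr bool is_localized(std::string_view spec)
{ return !spec.empty() && spec.front() == 'L'; }

// Hypothetical helper, not part of libstdc++: true if the chrono-specs
// contain a conversion specifier whose output depends on the locale.
constexpr bool has_locale_specific(std::string_view spec)
{
  for (std::size_t i = 0; i + 1 < spec.size(); ++i)
    {
      if (spec[i] != '%')
        continue;
      char c = spec[++i];           // character after '%'
      bool mod = false;
      if ((c == 'E' || c == 'O') && i + 1 < spec.size())
        {
          mod = true;               // E/O modifier, look at the next char
          c = spec[++i];
        }
      switch (c)
        {
        case 'a': case 'A': case 'b': case 'B': case 'h':
        case 'c': case 'p': case 'r': case 'x': case 'X':
          return true;
        default:
          if (mod && c != 'z')      // modified forms use locale data
            return true;
        }
    }
  return false;
}
```

With these helpers, "L%a" is both localized and locale-specific, "L" is only localized, and "%a" is only locale-specific, which matches the three cases named in the comment on _M_locale_specific. In the patch, the transcoding path is taken only when both conditions hold.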
> @@ -305,11 +321,12 @@ namespace __format > const auto __chrono_specs = __first++; // Skip leading '%' > if (*__chrono_specs != '%') > __throw_format_error("chrono format error: no '%' at start of " > - "chrono-specs"); > + "chrono-specs"); > > _CharT __mod{}; > bool __conv = true; > int __needed = 0; > + bool __locale_specific = false; > > while (__first != __last) > { > @@ -322,15 +339,18 @@ namespace __format > case 'a': > case 'A': > __needed = _Weekday; > + __locale_specific = true; > break; > case 'b': > case 'h': > case 'B': > __needed = _Month; > + __locale_specific = true; > break; > case 'c': > __needed = _DateTime; > __allowed_mods = _Mod_E; > + __locale_specific = true; > break; > case 'C': > __needed = _Year; > @@ -368,6 +388,8 @@ namespace __format > break; > case 'p': > case 'r': > + __locale_specific = true; > + [[fallthrough]]; > case 'R': > case 'T': > __needed = _TimeOfDay; > @@ -393,10 +415,12 @@ namespace __format > break; > case 'x': > __needed = _Date; > + __locale_specific = true; > __allowed_mods = _Mod_E; > break; > case 'X': > __needed = _TimeOfDay; > + __locale_specific = true; > __allowed_mods = _Mod_E; > break; > case 'y': > @@ -436,6 +460,8 @@ namespace __format > || (__mod == 'O' && !(__allowed_mods & _Mod_O))) > __throw_format_error("chrono format error: invalid " > " modifier in chrono-specs"); > + if (__mod && __c != 'z') > + __locale_specific = true; > __mod = _CharT(); > > if ((__parts & __needed) != __needed) > @@ -467,6 +493,7 @@ namespace __format > _M_spec = __spec; > _M_spec._M_chrono_specs > = __string_view(__chrono_specs, __first - __chrono_specs); > + _M_spec._M_locale_specific(__locale_specific); > > return __first; > } > @@ -486,6 +513,24 @@ namespace __format > if (__first == __last) > return _M_format_to_ostream(__t, __fc, __is_neg); > > +#if __glibcxx_format >= 202207L // C++ >= 23 > + // _GLIBCXX_RESOLVE_LIB_DEFECTS > + // 3565. 
Handling of encodings in localized formatting > + // of chrono types is underspecified > + if constexpr (is_same_v<_CharT, char>) > + if constexpr (__unicode::__literal_encoding_is_utf8()) > + if (_M_spec._M_localized && _M_spec._M_locale_specific()) > + { > + extern locale __with_encoding_conversion(const locale&); > + > + // Allocate and cache the necessary state to convert strings > + // in the locale's encoding to UTF-8. > + locale __loc = __fc.locale(); > + if (__loc != locale::classic()) > + __fc._M_loc = __with_encoding_conversion(__loc); > + } > +#endif > + > _Sink_iter<_CharT> __out; > __format::_Str_sink<_CharT> __sink; > bool __write_direct = false; > @@ -742,6 +787,30 @@ namespace __format > static constexpr _CharT _S_space = _S_chars[14]; > static constexpr const _CharT* _S_empty_spec = _S_chars + 15; > > + template<typename _OutIter> > + _OutIter > + _M_write(_OutIter __out, const locale& __loc, __string_view __s) const > + { > +#if __glibcxx_format >= 202207L // C++ >= 20 > + __sso_string __buf; > + // _GLIBCXX_RESOLVE_LIB_DEFECTS > + // 3565. 
Handling of encodings in localized formatting > + // of chrono types is underspecified > + if constexpr (is_same_v<_CharT, char>) > + if constexpr (__unicode::__literal_encoding_is_utf8()) > + if (_M_spec._M_localized && _M_spec._M_locale_specific() > + && __loc != locale::classic()) > + { > + extern string_view > + __locale_encoding_to_utf8(const std::locale&, string_view, > + void*); > + > + __s = __locale_encoding_to_utf8(__loc, __s, &__buf); > + } > +#endif > + return __format::__write(std::move(__out), __s); > + } > + > template<typename _Tp, typename _FormatContext> > typename _FormatContext::iterator > _M_a_A(const _Tp& __t, typename _FormatContext::iterator __out, > @@ -761,7 +830,7 @@ namespace __format > else > __tp._M_days_abbreviated(__days); > __string_view __str(__days[__wd.c_encoding()]); > - return __format::__write(std::move(__out), __str); > + return _M_write(std::move(__out), __loc, __str); > } > > template<typename _Tp, typename _FormatContext> > @@ -782,7 +851,7 @@ namespace __format > else > __tp._M_months_abbreviated(__months); > __string_view __str(__months[(unsigned)__m - 1]); > - return __format::__write(std::move(__out), __str); > + return _M_write(std::move(__out), __loc, __str); > } > > template<typename _Tp, typename _FormatContext> > @@ -1059,8 +1128,8 @@ namespace __format > const auto& __tp = use_facet<__timepunct<_CharT>>(__loc); > const _CharT* __ampm[2]; > __tp._M_am_pm(__ampm); > - return std::format_to(std::move(__out), _S_empty_spec, > - __ampm[__hms.hours().count() >= 12]); > + return _M_write(std::move(__out), __loc, > + __ampm[__hms.hours().count() >= 12]); > } > > template<typename _Tp, typename _FormatContext> > @@ -1095,8 +1164,9 @@ namespace __format > basic_string<_CharT> __fmt(_S_empty_spec); > __fmt.insert(1u, 1u, _S_colon); > __fmt.insert(2u, __ampm_fmt); > - return std::vformat_to(std::move(__out), __fmt, > - std::make_format_args<_FormatContext>(__t)); > + using _FmtStr = _Runtime_format_string<_CharT>; > + return 
_M_write(std::move(__out), __loc, > + std::format(__loc, _FmtStr(__fmt), __t)); > } > > template<typename _Tp, typename _FormatContext> > @@ -1279,8 +1349,9 @@ namespace __format > basic_string<_CharT> __fmt(_S_empty_spec); > __fmt.insert(1u, 1u, _S_colon); > __fmt.insert(2u, __rep); > - return std::vformat_to(std::move(__out), __fmt, > - std::make_format_args<_FormatContext>(__t)); > + using _FmtStr = _Runtime_format_string<_CharT>; > + return _M_write(std::move(__out), __loc, > + std::format(__loc, _FmtStr(__fmt), __t)); > } > > template<typename _Tp, typename _FormatContext> > @@ -1302,8 +1373,9 @@ namespace __format > basic_string<_CharT> __fmt(_S_empty_spec); > __fmt.insert(1u, 1u, _S_colon); > __fmt.insert(2u, __rep); > - return std::vformat_to(std::move(__out), __fmt, > - std::make_format_args<_FormatContext>(__t)); > + using _FmtStr = _Runtime_format_string<_CharT>; > + return _M_write(std::move(__out), __loc, > + std::format(__loc, _FmtStr(__fmt), __t)); > } > > template<typename _Tp, typename _FormatContext> > @@ -1580,7 +1652,7 @@ namespace __format > const auto& __tp = use_facet<time_put<_CharT>>(__loc); > __tp.put(__os, __os, _S_space, &__tm, __fmt, __mod); > if (__os) > - __out = __format::__write(std::move(__out), __os.view()); > + __out = _M_write(std::move(__out), __loc, __os.view()); > return __out; > } > }; > diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def > index 42cdef2f526..74947301760 100644 > --- a/libstdc++-v3/include/bits/version.def > +++ b/libstdc++-v3/include/bits/version.def > @@ -1161,16 +1161,22 @@ ftms = { > }; > > ftms = { > + name = format; > + // 202304 P2510R3 Formatting pointers > + // 202305 P2757R3 Type checking format args > + // 202306 P2637R3 Member visit > + // 202311 P2918R2 Runtime format strings II > + // values = { > + // v = 202304; > + // cxxmin = 26; > + // hosted = yes; > + // }; > // 201907 Text Formatting, Integration of chrono, printf corner cases. 
> // 202106 std::format improvements. > // 202110 Fixing locale handling in chrono formatters, generator-like types. > // 202207 Encodings in localized formatting of chrono, basic-format-string. > - // 202207 P2286R8 Formatting Ranges > - // 202207 P2585R1 Improving default container formatting > - // TODO: #define __cpp_lib_format_ranges 202207L > - name = format; > values = { > - v = 202110; > + v = 202207; > cxxmin = 20; > hosted = yes; > }; > @@ -1374,6 +1380,19 @@ ftms = { > }; > }; > > +// ftms = { > + // name = format_ranges; > + // 202207 P2286R8 Formatting Ranges > + // 202207 P2585R1 Improving default container formatting > + // LWG3750 Too many papers bump __cpp_lib_format > + // TODO: #define __cpp_lib_format_ranges 202207L > + // values = { > + // v = 202207; > + // cxxmin = 23; > + // hosted = yes; > + // }; > +// }; > + > ftms = { > name = freestanding_algorithm; > values = { > diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h > index 1eaf3733bc2..9f8673395da 100644 > --- a/libstdc++-v3/include/bits/version.h > +++ b/libstdc++-v3/include/bits/version.h > @@ -1305,9 +1305,9 @@ > > #if !defined(__cpp_lib_format) > # if (__cplusplus >= 202002L) && _GLIBCXX_HOSTED > -# define __glibcxx_format 202110L > +# define __glibcxx_format 202207L > # if defined(__glibcxx_want_all) || defined(__glibcxx_want_format) > -# define __cpp_lib_format 202110L > +# define __cpp_lib_format 202207L > # endif > # endif > #endif /* !defined(__cpp_lib_format) && defined(__glibcxx_want_format) */ > diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format > index 16cee0d3c74..a4921ce391b 100644 > --- a/libstdc++-v3/include/std/format > +++ b/libstdc++-v3/include/std/format > @@ -2342,10 +2342,10 @@ namespace __format > > // _GLIBCXX_RESOLVE_LIB_DEFECTS > // P2510R3 Formatting pointers > -#if __cplusplus > 202302L || ! defined __STRICT_ANSI__ > -#define _GLIBCXX_P2518R3 1 > +#if __glibcxx_format >= 202304L || ! 
defined __STRICT_ANSI__ > +# define _GLIBCXX_P2518R3 1 > #else > -#define _GLIBCXX_P2518R3 0 > +# define _GLIBCXX_P2518R3 0 > #endif > > #if _GLIBCXX_P2518R3 > @@ -3821,6 +3821,9 @@ namespace __format > __do_vformat_to(_Out, basic_string_view<_CharT>, > const basic_format_args<_Context>&, > const locale* = nullptr); > + > + template<typename _CharT> struct __formatter_chrono; > + > } // namespace __format > /// @endcond > > @@ -3831,6 +3834,11 @@ namespace __format > * this class template explicitly. For typical uses of `std::format` the > * library will use the specializations `std::format_context` (for `char`) > * and `std::wformat_context` (for `wchar_t`). > + * > + * You are not allowed to define partial or explicit specializations of > + * this class template. > + * > + * @since C++20 > */ > template<typename _Out, typename _CharT> > class basic_format_context > @@ -3863,6 +3871,8 @@ namespace __format > const basic_format_args<_Context2>&, > const locale*); > > + friend __format::__formatter_chrono<_CharT>; > + > public: > ~basic_format_context() = default; > > diff --git a/libstdc++-v3/src/c++20/Makefile.am b/libstdc++-v3/src/c++20/Makefile.am > index a24505e5141..d0f7859290c 100644 > --- a/libstdc++-v3/src/c++20/Makefile.am > +++ b/libstdc++-v3/src/c++20/Makefile.am > @@ -36,7 +36,7 @@ else > inst_sources = > endif > > -sources = tzdb.cc > +sources = tzdb.cc format.cc > > vpath % $(top_srcdir)/src/c++20 > > @@ -53,6 +53,12 @@ tzdb.o: tzdb.cc tzdata.zi.h > $(CXXCOMPILE) -I. -c $< > endif > > +# This needs access to std::text_encoding and to the internals of std::locale. 
> +format.lo: format.cc > + $(LTCXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > +format.o: format.cc > + $(CXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > + > if GLIBCXX_HOSTED > libc__20convenience_la_SOURCES = $(sources) $(inst_sources) > else > diff --git a/libstdc++-v3/src/c++20/Makefile.in b/libstdc++-v3/src/c++20/Makefile.in > index 3ec8c5ce804..d759b8dcc7c 100644 > --- a/libstdc++-v3/src/c++20/Makefile.in > +++ b/libstdc++-v3/src/c++20/Makefile.in > @@ -121,7 +121,7 @@ CONFIG_CLEAN_FILES = > CONFIG_CLEAN_VPATH_FILES = > LTLIBRARIES = $(noinst_LTLIBRARIES) > libc__20convenience_la_LIBADD = > -am__objects_1 = tzdb.lo > +am__objects_1 = tzdb.lo format.lo > @ENABLE_EXTERN_TEMPLATE_TRUE@am__objects_2 = sstream-inst.lo > @GLIBCXX_HOSTED_TRUE@am_libc__20convenience_la_OBJECTS = \ > @GLIBCXX_HOSTED_TRUE@ $(am__objects_1) $(am__objects_2) > @@ -432,7 +432,7 @@ headers = > @ENABLE_EXTERN_TEMPLATE_TRUE@inst_sources = \ > @ENABLE_EXTERN_TEMPLATE_TRUE@ sstream-inst.cc > > -sources = tzdb.cc > +sources = tzdb.cc format.cc > @GLIBCXX_HOSTED_FALSE@libc__20convenience_la_SOURCES = > @GLIBCXX_HOSTED_TRUE@libc__20convenience_la_SOURCES = $(sources) $(inst_sources) > > @@ -755,6 +755,12 @@ vpath % $(top_srcdir)/src/c++20 > @USE_STATIC_TZDATA_TRUE@tzdb.o: tzdb.cc tzdata.zi.h > @USE_STATIC_TZDATA_TRUE@ $(CXXCOMPILE) -I. -c $< > > +# This needs access to std::text_encoding and to the internals of std::locale. > +format.lo: format.cc > + $(LTCXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > +format.o: format.cc > + $(CXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > + > # Tell versions [3.59,3.63) of GNU make to not export all variables. > # Otherwise a system limit (for SysV at least) may be exceeded. 
> .NOEXPORT: > diff --git a/libstdc++-v3/src/c++20/format.cc b/libstdc++-v3/src/c++20/format.cc > new file mode 100644 > index 00000000000..507bac79e95 > --- /dev/null > +++ b/libstdc++-v3/src/c++20/format.cc > @@ -0,0 +1,174 @@ > +// Definitions for <chrono> formatting -*- C++ -*- > + > +// Copyright The GNU Toolchain Authors. > +// > +// This file is part of the GNU ISO C++ Library. This library is free > +// software; you can redistribute it and/or modify it under the > +// terms of the GNU General Public License as published by the > +// Free Software Foundation; either version 3, or (at your option) > +// any later version. > + > +// This library is distributed in the hope that it will be useful, > +// but WITHOUT ANY WARRANTY; without even the implied warranty of > +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +// GNU General Public License for more details. > + > +// Under Section 7 of GPL version 3, you are granted additional > +// permissions described in the GCC Runtime Library Exception, version > +// 3.1, as published by the Free Software Foundation. > + > +// You should have received a copy of the GNU General Public License and > +// a copy of the GCC Runtime Library Exception along with this program; > +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > +// <http://www.gnu.org/licenses/>. > + > +#define _GLIBCXX_USE_CXX11_ABI 1 > +#include "../c++26/text_encoding.cc" > + > +#if defined _GLIBCXX_USE_NL_LANGINFO_L && defined _GLIBCXX_HAVE_ICONV > +# include <format> > +# include <chrono> > +# include <memory> // make_unique > +# include <string.h> // strlen, strcpy > +# include <iconv.h> > +# include <errno.h> > +#endif > + > +namespace std > +{ > +_GLIBCXX_BEGIN_NAMESPACE_VERSION > +namespace __format > +{ > +// Helpers for P2419R2 > +// (Clarify handling of encodings in localized formatting of chrono types) > +// Convert a string from the locale's charset to UTF-8. 
> +
> +namespace
> +{
> +// A non-standard locale::facet that caches the locale's std::text_encoding
> +// and an iconv descriptor for converting from that encoding to UTF-8.
> +struct __encoding : locale::facet
> +{
> +  static locale::id id;
> +
> +  explicit
> +  __encoding(const text_encoding& enc, size_t refs = 0)
> +  : facet(refs), _M_enc(enc)
> +  {
> +#if defined _GLIBCXX_HAVE_ICONV
> +    if (enc != text_encoding::UTF8 && enc != text_encoding::ASCII)
> +      _M_cd = ::iconv_open("UTF-8", enc.name());
> +#endif
> +  }
> +
> +  ~__encoding()
> +  {
> +#if defined _GLIBCXX_HAVE_ICONV
> +    if (_M_has_desc())
> +      ::iconv_close(_M_cd);
> +#endif
> +  }
> +
> +  bool _M_has_desc() const
> +  {
> +#if defined _GLIBCXX_HAVE_ICONV
> +    return _M_cd != (::iconv_t)-1;
> +#else
> +    return false;
> +#endif
> +  }
> +
> +  text_encoding _M_enc;
> +#if defined _GLIBCXX_HAVE_ICONV
> +  ::iconv_t _M_cd = (::iconv_t)-1;
> +#endif
> +};
> +
> +locale::id __encoding::id;
> +} // namespace
> +
> +std::locale
> +__with_encoding_conversion(const std::locale& loc)
> +{
> +#if defined _GLIBCXX_USE_NL_LANGINFO_L && __CHAR_BIT__ == 8
> +  if (std::__try_use_facet<__encoding>(loc))
> +    return loc;
> +
> +  string name = loc.name();
> +  if (name == "C" || name == "*")
> +    return loc;
> +
> +  text_encoding locenc = __locale_encoding(name.c_str());
> +
> +  if (locenc == text_encoding::UTF8 || locenc == text_encoding::ASCII
> +      || locenc == text_encoding::unknown)
> +    return loc;
> +
> +  auto impl = std::make_unique<locale::_Impl>(*loc._M_impl, 1);

While looking into implementing the LRU cache mentioned above, I
realised that this impl variable is unused. That's a leftover from an
earlier attempt to solve this. I'll remove it.

> +  auto facetp = std::make_unique<__encoding>(locenc);
> +  locale loc2(loc, facetp.get()); // FIXME: PR libstdc++/113704
> +  facetp.release();
> +  // FIXME: Ideally we wouldn't need to reallocate this string again,
> +  // just don't delete[] it in the locale(locale, Facet*) constructor.
> + if (const char* name = loc._M_impl->_M_names[0]) > + { > + loc2._M_impl->_M_names[0] = new char[strlen(name) + 1]; > + strcpy(loc2._M_impl->_M_names[0], name); > + } > + return loc2; > +#else > + return loc; > +#endif > +} > + > +string_view > +__locale_encoding_to_utf8(const std::locale& loc, string_view str, > + void* poutbuf) > +{ > +#if defined _GLIBCXX_USE_NL_LANGINFO_L && __CHAR_BIT__ == 8 \ > + && _GLIBCXX_HAVE_ICONV > + string& outbuf = *static_cast<string*>(poutbuf); > + // Don't need to use __try_use_facet with its dynamic_cast<__encoding*>, > + // since we know there are no types derived from __encoding. If the array > + // element is non-null, we have the facet. > + auto id = __encoding::id._M_id(); > + auto enc_facet = static_cast<const __encoding*>(loc._M_impl->_M_facets[id]); > + if (!enc_facet || !enc_facet->_M_has_desc()) > + return str; > + > + size_t inbytesleft = str.size(); > + size_t written = 0; > + bool done = false; > + > + auto overwrite = [&](char* p, size_t n) { > + auto inbytes = const_cast<char*>(str.data()) + str.size() - inbytesleft; > + char* outbytes = p + written; > + size_t outbytesleft = n - written; > + size_t res = ::iconv(enc_facet->_M_cd, &inbytes, &inbytesleft, > + &outbytes, &outbytesleft); > + if (res == (size_t)-1) > + { > + if (errno != E2BIG) > + { > + done = true; > + return 0zu; > + } > + } > + else > + done = true; > + written = outbytes - p; > + return written; > + }; > + do > + outbuf.resize_and_overwrite(outbuf.capacity() + (inbytesleft * 3 / 2), > + overwrite); > + while (!done); > + if (outbuf.size()) > + str = outbuf; > +#endif // USE_NL_LANGINFO_L && CHAR_BIT == 8 && HAVE_ICONV > + > + return str; > +} > +} // namespace __format > +_GLIBCXX_END_NAMESPACE_VERSION > +} // namespace std > diff --git a/libstdc++-v3/testsuite/std/time/format_localized.cc b/libstdc++-v3/testsuite/std/time/format_localized.cc > new file mode 100644 > index 00000000000..2e553110f03 > --- /dev/null > +++ 
b/libstdc++-v3/testsuite/std/time/format_localized.cc > @@ -0,0 +1,47 @@ > +// { dg-do run { target c++20 } } > +// { dg-require-namedlocale "ru_UA.koi8u" } > +// { dg-require-namedlocale "es_ES.ISO8859-1" } > +// { dg-require-namedlocale "fr_FR.ISO8859-1" } > +// { dg-require-effective-target cxx11_abi } > + > +// P2419R2 > +// Clarify handling of encodings in localized formatting of chrono types > + > +// Localized date-time strings such as "février" should be converted to UTF-8 > +// if the locale uses a different encoding. > + > +#include <chrono> > +#include <format> > +#include <testsuite_hooks.h> > + > +void > +test_ru() > +{ > + std::locale loc("ru_UA.koi8u"); > + auto s = std::format(loc, "День недели: {:L}", std::chrono::Monday); > + VERIFY( s == "День недели: Пн" ); > +} > + > +void > +test_es() > +{ > + std::locale loc(ISO_8859(1,es_ES)); > + auto s = std::format(loc, "Día de la semana: {:L%A %a}", std::chrono::Wednesday); > + VERIFY( s == "Día de la semana: miércoles mié" ); > +} > + > +void > +test_fr() > +{ > + std::locale loc(ISO_8859(1,fr_FR)); > + auto s = std::format(loc, "Six mois après {0:L%b}, c'est {1:L%B}.", > + std::chrono::February, std::chrono::August); > + VERIFY( s == "Six mois après févr., c'est août." 
); > +} > + > +int main() > +{ > + test_ru(); > + test_es(); > + test_fr(); > +} > diff --git a/libstdc++-v3/testsuite/util/testsuite_abi.cc b/libstdc++-v3/testsuite/util/testsuite_abi.cc > index ec7c3df9ecc..ce9cda660fa 100644 > --- a/libstdc++-v3/testsuite/util/testsuite_abi.cc > +++ b/libstdc++-v3/testsuite/util/testsuite_abi.cc > @@ -215,6 +215,7 @@ check_version(symbol& test, bool added) > known_versions.push_back("GLIBCXX_3.4.31"); > known_versions.push_back("GLIBCXX_3.4.32"); > known_versions.push_back("GLIBCXX_3.4.33"); > + known_versions.push_back("GLIBCXX_3.4.34"); > known_versions.push_back("GLIBCXX_LDBL_3.4.31"); > known_versions.push_back("GLIBCXX_IEEE128_3.4.29"); > known_versions.push_back("GLIBCXX_IEEE128_3.4.30"); > -- > 2.45.2 >
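
Editorial note on the test above: for readers unfamiliar with what the transcoding step does to the bytes, here is a minimal hand-written stand-in for the single case of an ISO-8859-1 locale. It is only a sketch. The patch itself calls iconv through the cached descriptor and handles whatever encoding the locale reports, and `latin1_to_utf8` is a hypothetical name that does not exist in libstdc++.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for the iconv call in __locale_encoding_to_utf8.
// Assumption: the input is ISO-8859-1, where each byte value is the
// Unicode code point, so bytes >= 0x80 become a two-byte UTF-8 sequence.
std::string latin1_to_utf8(std::string_view in)
{
  std::string out;
  out.reserve(in.size() * 2);   // worst case: every byte doubles in size
  for (unsigned char c : in)
    {
      if (c < 0x80)
        out += static_cast<char>(c);                 // ASCII is unchanged
      else
        {
          out += static_cast<char>(0xC0 | (c >> 6));   // lead byte
          out += static_cast<char>(0x80 | (c & 0x3F)); // continuation byte
        }
    }
  return out;
}
```

For example, the ISO-8859-1 bytes of "août" are 61 6F FB 74, and `latin1_to_utf8("ao\xFBt")` yields 61 6F C3 BB 74, which is the UTF-8 form that test_fr expects to see in the formatted result. Because the output can be longer than the input, the patch grows its buffer and retries when iconv reports E2BIG rather than sizing it once up front.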
On Tue, 16 Jul 2024 at 13:34, Jonathan Wakely <jwakely@redhat.com> wrote: > > On Tue, 16 Jul 2024 at 13:05, Jonathan Wakely <jwakely@redhat.com> wrote: > > > > On Fri, 12 Jul 2024 at 00:23, I wrote: > > > > > > I sent v1 of this patch in February, and it added the new symbols to > > > libstdc++exp.a which meant users needed to use -lstdc++exp to format > > > chrono types in C++23 mode. That was less than ideal. > > > > > > This v2 patch adds the new symbols to the main library, which means no > > > extra step to get the new features, and we can enable them as a DR for > > > C++20 mode. But that means we need new exports in the shared library, > > > and so need to be more confident that the feature is stable and ready to > > > go into the lib. > > > > > > I'm not 100% confident that we want to add a new, private facet to the > > > std::locale, but it seems reasonable. And that's not exposed to users at > > > all, as the two new symbols added to the library hide the creation and > > > use of that facet. > > > > Here's v3, which fixes a missing export of the __sso_string constructors > > and destructors, needed so that the old ABI can use the new function to > > transcode a locale-specific string to UTF-8, with a std::string buffer. > > > > I haven't done so here, but we could keep a least recently used cache of > > __encoding facets, so that repeatedly calling std::format with the same > > locale doesn't need to keep re-checking the locale's encoding and then > > re-opening the same iconv_t descriptor. > > > > This v3 patch also tweaks the commented out parts of > > include/bits/version.def in preparation for enabling the C++26 <format> > > features in the following patches in this series. > > > > Tested x86_64-linux. I think this is ready to push now, but I'll wait a > > bit for any comments on it. > > > > -- >8 -- > > > > This implements the C++23 paper P2419R2 (Clarify handling of encodings > > in localized formatting of chrono types). 
The requirement is that when > > the literal encoding is "a Unicode encoding form" and the formatting > > locale uses a different encoding, any locale-specific strings such as > > "août" for std::chrono::August should be converted to the literal > > encoding. > > > > Using the recently-added std::locale::encoding() function we can check > > the locale's encoding and then use iconv if a conversion is needed. > > Because nl_langinfo_l and iconv_open both allocate memory, a naive > > implementation would perform multiple allocations and deallocations for > > every snippet of locale-specific text that needs to be converted to > > UTF-8. To avoid that, a new internal locale::facet is defined to store > > the text_encoding and an iconv_t descriptor, which are then cached in > > the formatting locale. This requires access to the internals of a > > std::locale object in src/c++20/format.cc, so that new file needs to be > > compiled with -fno-access-control, as well as -std=gnu++26 in order to > > use std::text_encoding. > > > > Because the new std::text_encoding and std::locale::encoding() symbols > > are only in the libstdc++exp.a archive, we need to include > > src/c++26/text_encoding.cc in the main library, but not export its > > symbols yet. This means they can be used by the two new functions which > > are exported from the main library. > > > > The encoding conversions are done for C++20, treating it as a DR that > > resolves LWG 3565. > > > > With this change we can increase the value of the __cpp_lib_format macro > > for C++23. The value should be 202207 for P2419R2, but we already > > implement P2510R3 (Formatting pointers) so can use the value 202304. > > > > libstdc++-v3/ChangeLog: > > > > PR libstdc++/109162 > > * acinclude.m4 (libtool_VERSION): Update to 6:34:0. > > * config/abi/pre/gnu.ver: Disambiguate old patterns. Add new > > GLIBCXX_3.4.34 symbol version and new exports. > > * configure: Regenerate. 
> > * include/bits/chrono_io.h (_ChronoSpec::_M_locale_specific): > > Add new accessor functions to use a reserved bit in _Spec. > > (__formatter_chrono::_M_parse): Use _M_locale_specific(true) > > when chrono-specs contains locale-dependent conversion > > specifiers. > > (__formatter_chrono::_M_format): Open iconv descriptor if > > conversion to UTF-8 will be needed. > > (__formatter_chrono::_M_write): New function to write a > > localized string with possible character conversion. > > (__formatter_chrono::_M_a_A, __formatter_chrono::_M_b_B) > > (__formatter_chrono::_M_p, __formatter_chrono::_M_r) > > (__formatter_chrono::_M_x, __formatter_chrono::_M_X) > > (__formatter_chrono::_M_locale_fmt): Use _M_write. > > * include/bits/version.def (format): Update value. > > * include/bits/version.h: Regenerate. > > * include/std/format (_GLIBCXX_P2518R3): Check feature test > > macro instead of __cplusplus. > > (basic_format_context): Declare __formatter_chrono as friend. > > * src/c++20/Makefile.am: Add new file. > > * src/c++20/Makefile.in: Regenerate. > > * src/c++20/format.cc: New file. > > * testsuite/std/time/format_localized.cc: New test. > > * testsuite/util/testsuite_abi.cc: Add new symbol version. 
> > --- > > libstdc++-v3/acinclude.m4 | 2 +- > > libstdc++-v3/config/abi/pre/gnu.ver | 18 +- > > libstdc++-v3/configure | 2 +- > > libstdc++-v3/include/bits/chrono_io.h | 96 ++++++++-- > > libstdc++-v3/include/bits/version.def | 29 ++- > > libstdc++-v3/include/bits/version.h | 4 +- > > libstdc++-v3/include/std/format | 16 +- > > libstdc++-v3/src/c++20/Makefile.am | 8 +- > > libstdc++-v3/src/c++20/Makefile.in | 10 +- > > libstdc++-v3/src/c++20/format.cc | 174 ++++++++++++++++++ > > .../testsuite/std/time/format_localized.cc | 47 +++++ > > libstdc++-v3/testsuite/util/testsuite_abi.cc | 1 + > > 12 files changed, 378 insertions(+), 29 deletions(-) > > create mode 100644 libstdc++-v3/src/c++20/format.cc > > create mode 100644 libstdc++-v3/testsuite/std/time/format_localized.cc > > > > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 > > index e04aae25360..e4ed583b3ae 100644 > > --- a/libstdc++-v3/acinclude.m4 > > +++ b/libstdc++-v3/acinclude.m4 > > @@ -4230,7 +4230,7 @@ changequote([,])dnl > > fi > > > > # For libtool versioning info, format is CURRENT:REVISION:AGE > > -libtool_VERSION=6:33:0 > > +libtool_VERSION=6:34:0 > > > > # Everything parsed; figure out what files and settings to use. 
> > case $enable_symvers in > > diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver > > index 31449b5b87b..ae79b371d80 100644 > > --- a/libstdc++-v3/config/abi/pre/gnu.ver > > +++ b/libstdc++-v3/config/abi/pre/gnu.ver > > @@ -109,7 +109,11 @@ GLIBCXX_3.4 { > > std::[j-k]*; > > # std::length_error::l*; > > # std::length_error::~l*; > > - std::locale::[A-Za-e]*; > > + # std::locale::[A-Za-d]*; > > + std::locale::all; > > + std::locale::classic*; > > + std::locale::collate; > > + std::locale::ctype; > > std::locale::facet::[A-Za-z]*; > > std::locale::facet::_S_get_c_locale*; > > std::locale::facet::_S_clone_c_locale*; > > @@ -168,7 +172,7 @@ GLIBCXX_3.4 { > > std::strstream*; > > std::strstreambuf*; > > # std::t[a-q]*; > > - std::t[a-g]*; > > + std::terminate*; > > std::th[a-h]*; > > std::th[j-q]*; > > std::th[s-z]*; > > @@ -2528,6 +2532,16 @@ GLIBCXX_3.4.33 { > > _ZNKSt12__basic_fileIcE13native_handleEv; > > } GLIBCXX_3.4.32; > > > > +# GCC 15.1.0 > > +GLIBCXX_3.4.34 { > > + # std::__format::__with_encoding_conversion > > + _ZNSt8__format26__with_encoding_conversionERKSt6locale; > > + # std::__format::__locale_encoding_to_utf8 > > + _ZNSt8__format25__locale_encoding_to_utf8ERKSt6localeSt17basic_string_viewIcSt11char_traitsIcEEPv; > > + # __sso_string constructor and destructor > > + _ZNSt12__sso_string[CD][12]Ev; > > +} GLIBCXX_3.4.33; > > + > > # Symbols in the support library (libsupc++) have their own tag. > > CXXABI_1.3 { > > > > diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure > > index 5645e991af7..fe525308ae2 100755 > > --- a/libstdc++-v3/configure > > +++ b/libstdc++-v3/configure > > @@ -51040,7 +51040,7 @@ $as_echo "$as_me: WARNING: === Symbol versioning will be disabled." >&2;} > > fi > > > > # For libtool versioning info, format is CURRENT:REVISION:AGE > > -libtool_VERSION=6:33:0 > > +libtool_VERSION=6:34:0 > > > > # Everything parsed; figure out what files and settings to use. 
> > case $enable_symvers in > > diff --git a/libstdc++-v3/include/bits/chrono_io.h b/libstdc++-v3/include/bits/chrono_io.h > > index 72c66a0fef0..2f3ba89de61 100644 > > --- a/libstdc++-v3/include/bits/chrono_io.h > > +++ b/libstdc++-v3/include/bits/chrono_io.h > > @@ -38,8 +38,10 @@ > > #include <iomanip> // setw, setfill > > #include <format> > > #include <charconv> // from_chars > > +#include <stdexcept> // __sso_string > > > > #include <bits/streambuf_iterator.h> > > +#include <bits/unique_ptr.h> > > > > namespace std _GLIBCXX_VISIBILITY(default) > > { > > @@ -211,6 +213,20 @@ namespace __format > > struct _ChronoSpec : _Spec<_CharT> > > { > > basic_string_view<_CharT> _M_chrono_specs; > > + > > + // Use one of the reserved bits in __format::_Spec<C>. > > + // This indicates that a locale-dependent conversion specifier such as > > + // %a is used in the chrono-specs. This is not the same as the > > + // _Spec<C>::_M_localized member which indicates that "L" was present > > + // in the format-spec, e.g. "{:L%a}" is localized and locale-specific, > > + // but "{:L}" is only localized and "{:%a}" is only locale-specific. > > + constexpr bool > > + _M_locale_specific() const noexcept > > + { return this->_M_reserved; } > > + > > + constexpr void > > + _M_locale_specific(bool __b) noexcept > > + { this->_M_reserved = __b; } > > }; > > > > // Represents the information provided by a chrono type. 
> > @@ -305,11 +321,12 @@ namespace __format > > const auto __chrono_specs = __first++; // Skip leading '%' > > if (*__chrono_specs != '%') > > __throw_format_error("chrono format error: no '%' at start of " > > - "chrono-specs"); > > + "chrono-specs"); > > > > _CharT __mod{}; > > bool __conv = true; > > int __needed = 0; > > + bool __locale_specific = false; > > > > while (__first != __last) > > { > > @@ -322,15 +339,18 @@ namespace __format > > case 'a': > > case 'A': > > __needed = _Weekday; > > + __locale_specific = true; > > break; > > case 'b': > > case 'h': > > case 'B': > > __needed = _Month; > > + __locale_specific = true; > > break; > > case 'c': > > __needed = _DateTime; > > __allowed_mods = _Mod_E; > > + __locale_specific = true; > > break; > > case 'C': > > __needed = _Year; > > @@ -368,6 +388,8 @@ namespace __format > > break; > > case 'p': > > case 'r': > > + __locale_specific = true; > > + [[fallthrough]]; > > case 'R': > > case 'T': > > __needed = _TimeOfDay; > > @@ -393,10 +415,12 @@ namespace __format > > break; > > case 'x': > > __needed = _Date; > > + __locale_specific = true; > > __allowed_mods = _Mod_E; > > break; > > case 'X': > > __needed = _TimeOfDay; > > + __locale_specific = true; > > __allowed_mods = _Mod_E; > > break; > > case 'y': > > @@ -436,6 +460,8 @@ namespace __format > > || (__mod == 'O' && !(__allowed_mods & _Mod_O))) > > __throw_format_error("chrono format error: invalid " > > " modifier in chrono-specs"); > > + if (__mod && __c != 'z') > > + __locale_specific = true; > > __mod = _CharT(); > > > > if ((__parts & __needed) != __needed) > > @@ -467,6 +493,7 @@ namespace __format > > _M_spec = __spec; > > _M_spec._M_chrono_specs > > = __string_view(__chrono_specs, __first - __chrono_specs); > > + _M_spec._M_locale_specific(__locale_specific); > > > > return __first; > > } > > @@ -486,6 +513,24 @@ namespace __format > > if (__first == __last) > > return _M_format_to_ostream(__t, __fc, __is_neg); > > > > +#if __glibcxx_format >= 
202207L // C++ >= 23 > > + // _GLIBCXX_RESOLVE_LIB_DEFECTS > > + // 3565. Handling of encodings in localized formatting > > + // of chrono types is underspecified > > + if constexpr (is_same_v<_CharT, char>) > > + if constexpr (__unicode::__literal_encoding_is_utf8()) > > + if (_M_spec._M_localized && _M_spec._M_locale_specific()) > > + { > > + extern locale __with_encoding_conversion(const locale&); > > + > > + // Allocate and cache the necessary state to convert strings > > + // in the locale's encoding to UTF-8. > > + locale __loc = __fc.locale(); > > + if (__loc != locale::classic()) > > + __fc._M_loc = __with_encoding_conversion(__loc); > > + } > > +#endif > > + > > _Sink_iter<_CharT> __out; > > __format::_Str_sink<_CharT> __sink; > > bool __write_direct = false; > > @@ -742,6 +787,30 @@ namespace __format > > static constexpr _CharT _S_space = _S_chars[14]; > > static constexpr const _CharT* _S_empty_spec = _S_chars + 15; > > > > + template<typename _OutIter> > > + _OutIter > > + _M_write(_OutIter __out, const locale& __loc, __string_view __s) const > > + { > > +#if __glibcxx_format >= 202207L // C++ >= 20 > > + __sso_string __buf; > > + // _GLIBCXX_RESOLVE_LIB_DEFECTS > > + // 3565. 
Handling of encodings in localized formatting > > + // of chrono types is underspecified > > + if constexpr (is_same_v<_CharT, char>) > > + if constexpr (__unicode::__literal_encoding_is_utf8()) > > + if (_M_spec._M_localized && _M_spec._M_locale_specific() > > + && __loc != locale::classic()) > > + { > > + extern string_view > > + __locale_encoding_to_utf8(const std::locale&, string_view, > > + void*); > > + > > + __s = __locale_encoding_to_utf8(__loc, __s, &__buf); > > + } > > +#endif > > + return __format::__write(std::move(__out), __s); > > + } > > + > > template<typename _Tp, typename _FormatContext> > > typename _FormatContext::iterator > > _M_a_A(const _Tp& __t, typename _FormatContext::iterator __out, > > @@ -761,7 +830,7 @@ namespace __format > > else > > __tp._M_days_abbreviated(__days); > > __string_view __str(__days[__wd.c_encoding()]); > > - return __format::__write(std::move(__out), __str); > > + return _M_write(std::move(__out), __loc, __str); > > } > > > > template<typename _Tp, typename _FormatContext> > > @@ -782,7 +851,7 @@ namespace __format > > else > > __tp._M_months_abbreviated(__months); > > __string_view __str(__months[(unsigned)__m - 1]); > > - return __format::__write(std::move(__out), __str); > > + return _M_write(std::move(__out), __loc, __str); > > } > > > > template<typename _Tp, typename _FormatContext> > > @@ -1059,8 +1128,8 @@ namespace __format > > const auto& __tp = use_facet<__timepunct<_CharT>>(__loc); > > const _CharT* __ampm[2]; > > __tp._M_am_pm(__ampm); > > - return std::format_to(std::move(__out), _S_empty_spec, > > - __ampm[__hms.hours().count() >= 12]); > > + return _M_write(std::move(__out), __loc, > > + __ampm[__hms.hours().count() >= 12]); > > } > > > > template<typename _Tp, typename _FormatContext> > > @@ -1095,8 +1164,9 @@ namespace __format > > basic_string<_CharT> __fmt(_S_empty_spec); > > __fmt.insert(1u, 1u, _S_colon); > > __fmt.insert(2u, __ampm_fmt); > > - return std::vformat_to(std::move(__out), __fmt, > > - 
std::make_format_args<_FormatContext>(__t)); > > + using _FmtStr = _Runtime_format_string<_CharT>; > > + return _M_write(std::move(__out), __loc, > > + std::format(__loc, _FmtStr(__fmt), __t)); > > } > > > > template<typename _Tp, typename _FormatContext> > > @@ -1279,8 +1349,9 @@ namespace __format > > basic_string<_CharT> __fmt(_S_empty_spec); > > __fmt.insert(1u, 1u, _S_colon); > > __fmt.insert(2u, __rep); > > - return std::vformat_to(std::move(__out), __fmt, > > - std::make_format_args<_FormatContext>(__t)); > > + using _FmtStr = _Runtime_format_string<_CharT>; > > + return _M_write(std::move(__out), __loc, > > + std::format(__loc, _FmtStr(__fmt), __t)); > > } > > > > template<typename _Tp, typename _FormatContext> > > @@ -1302,8 +1373,9 @@ namespace __format > > basic_string<_CharT> __fmt(_S_empty_spec); > > __fmt.insert(1u, 1u, _S_colon); > > __fmt.insert(2u, __rep); > > - return std::vformat_to(std::move(__out), __fmt, > > - std::make_format_args<_FormatContext>(__t)); > > + using _FmtStr = _Runtime_format_string<_CharT>; > > + return _M_write(std::move(__out), __loc, > > + std::format(__loc, _FmtStr(__fmt), __t)); > > } > > > > template<typename _Tp, typename _FormatContext> > > @@ -1580,7 +1652,7 @@ namespace __format > > const auto& __tp = use_facet<time_put<_CharT>>(__loc); > > __tp.put(__os, __os, _S_space, &__tm, __fmt, __mod); > > if (__os) > > - __out = __format::__write(std::move(__out), __os.view()); > > + __out = _M_write(std::move(__out), __loc, __os.view()); > > return __out; > > } > > }; > > diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def > > index 42cdef2f526..74947301760 100644 > > --- a/libstdc++-v3/include/bits/version.def > > +++ b/libstdc++-v3/include/bits/version.def > > @@ -1161,16 +1161,22 @@ ftms = { > > }; > > > > ftms = { > > + name = format; > > + // 202304 P2510R3 Formatting pointers > > + // 202305 P2757R3 Type checking format args > > + // 202306 P2637R3 Member visit > > + // 202311 
P2918R2 Runtime format strings II > > + // values = { > > + // v = 202304; > > + // cxxmin = 26; > > + // hosted = yes; > > + // }; > > // 201907 Text Formatting, Integration of chrono, printf corner cases. > > // 202106 std::format improvements. > > // 202110 Fixing locale handling in chrono formatters, generator-like types. > > // 202207 Encodings in localized formatting of chrono, basic-format-string. > > - // 202207 P2286R8 Formatting Ranges > > - // 202207 P2585R1 Improving default container formatting > > - // TODO: #define __cpp_lib_format_ranges 202207L > > - name = format; > > values = { > > - v = 202110; > > + v = 202207; > > cxxmin = 20; > > hosted = yes; > > }; > > @@ -1374,6 +1380,19 @@ ftms = { > > }; > > }; > > > > +// ftms = { > > + // name = format_ranges; > > + // 202207 P2286R8 Formatting Ranges > > + // 202207 P2585R1 Improving default container formatting > > + // LWG3750 Too many papers bump __cpp_lib_format > > + // TODO: #define __cpp_lib_format_ranges 202207L > > + // values = { > > + // v = 202207; > > + // cxxmin = 23; > > + // hosted = yes; > > + // }; > > +// }; > > + > > ftms = { > > name = freestanding_algorithm; > > values = { > > diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h > > index 1eaf3733bc2..9f8673395da 100644 > > --- a/libstdc++-v3/include/bits/version.h > > +++ b/libstdc++-v3/include/bits/version.h > > @@ -1305,9 +1305,9 @@ > > > > #if !defined(__cpp_lib_format) > > # if (__cplusplus >= 202002L) && _GLIBCXX_HOSTED > > -# define __glibcxx_format 202110L > > +# define __glibcxx_format 202207L > > # if defined(__glibcxx_want_all) || defined(__glibcxx_want_format) > > -# define __cpp_lib_format 202110L > > +# define __cpp_lib_format 202207L > > # endif > > # endif > > #endif /* !defined(__cpp_lib_format) && defined(__glibcxx_want_format) */ > > diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format > > index 16cee0d3c74..a4921ce391b 100644 > > --- 
a/libstdc++-v3/include/std/format > > +++ b/libstdc++-v3/include/std/format > > @@ -2342,10 +2342,10 @@ namespace __format > > > > // _GLIBCXX_RESOLVE_LIB_DEFECTS > > // P2510R3 Formatting pointers > > -#if __cplusplus > 202302L || ! defined __STRICT_ANSI__ > > -#define _GLIBCXX_P2518R3 1 > > +#if __glibcxx_format >= 202304L || ! defined __STRICT_ANSI__ > > +# define _GLIBCXX_P2518R3 1 > > #else > > -#define _GLIBCXX_P2518R3 0 > > +# define _GLIBCXX_P2518R3 0 > > #endif > > > > #if _GLIBCXX_P2518R3 > > @@ -3821,6 +3821,9 @@ namespace __format > > __do_vformat_to(_Out, basic_string_view<_CharT>, > > const basic_format_args<_Context>&, > > const locale* = nullptr); > > + > > + template<typename _CharT> struct __formatter_chrono; > > + > > } // namespace __format > > /// @endcond > > > > @@ -3831,6 +3834,11 @@ namespace __format > > * this class template explicitly. For typical uses of `std::format` the > > * library will use the specializations `std::format_context` (for `char`) > > * and `std::wformat_context` (for `wchar_t`). > > + * > > + * You are not allowed to define partial or explicit specializations of > > + * this class template. > > + * > > + * @since C++20 > > */ > > template<typename _Out, typename _CharT> > > class basic_format_context > > @@ -3863,6 +3871,8 @@ namespace __format > > const basic_format_args<_Context2>&, > > const locale*); > > > > + friend __format::__formatter_chrono<_CharT>; > > + > > public: > > ~basic_format_context() = default; > > > > diff --git a/libstdc++-v3/src/c++20/Makefile.am b/libstdc++-v3/src/c++20/Makefile.am > > index a24505e5141..d0f7859290c 100644 > > --- a/libstdc++-v3/src/c++20/Makefile.am > > +++ b/libstdc++-v3/src/c++20/Makefile.am > > @@ -36,7 +36,7 @@ else > > inst_sources = > > endif > > > > -sources = tzdb.cc > > +sources = tzdb.cc format.cc > > > > vpath % $(top_srcdir)/src/c++20 > > > > @@ -53,6 +53,12 @@ tzdb.o: tzdb.cc tzdata.zi.h > > $(CXXCOMPILE) -I. 
-c $< > > endif > > > > +# This needs access to std::text_encoding and to the internals of std::locale. > > +format.lo: format.cc > > + $(LTCXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > > +format.o: format.cc > > + $(CXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > > + > > if GLIBCXX_HOSTED > > libc__20convenience_la_SOURCES = $(sources) $(inst_sources) > > else > > diff --git a/libstdc++-v3/src/c++20/Makefile.in b/libstdc++-v3/src/c++20/Makefile.in > > index 3ec8c5ce804..d759b8dcc7c 100644 > > --- a/libstdc++-v3/src/c++20/Makefile.in > > +++ b/libstdc++-v3/src/c++20/Makefile.in > > @@ -121,7 +121,7 @@ CONFIG_CLEAN_FILES = > > CONFIG_CLEAN_VPATH_FILES = > > LTLIBRARIES = $(noinst_LTLIBRARIES) > > libc__20convenience_la_LIBADD = > > -am__objects_1 = tzdb.lo > > +am__objects_1 = tzdb.lo format.lo > > @ENABLE_EXTERN_TEMPLATE_TRUE@am__objects_2 = sstream-inst.lo > > @GLIBCXX_HOSTED_TRUE@am_libc__20convenience_la_OBJECTS = \ > > @GLIBCXX_HOSTED_TRUE@ $(am__objects_1) $(am__objects_2) > > @@ -432,7 +432,7 @@ headers = > > @ENABLE_EXTERN_TEMPLATE_TRUE@inst_sources = \ > > @ENABLE_EXTERN_TEMPLATE_TRUE@ sstream-inst.cc > > > > -sources = tzdb.cc > > +sources = tzdb.cc format.cc > > @GLIBCXX_HOSTED_FALSE@libc__20convenience_la_SOURCES = > > @GLIBCXX_HOSTED_TRUE@libc__20convenience_la_SOURCES = $(sources) $(inst_sources) > > > > @@ -755,6 +755,12 @@ vpath % $(top_srcdir)/src/c++20 > > @USE_STATIC_TZDATA_TRUE@tzdb.o: tzdb.cc tzdata.zi.h > > @USE_STATIC_TZDATA_TRUE@ $(CXXCOMPILE) -I. -c $< > > > > +# This needs access to std::text_encoding and to the internals of std::locale. > > +format.lo: format.cc > > + $(LTCXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > > +format.o: format.cc > > + $(CXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > > + > > # Tell versions [3.59,3.63) of GNU make to not export all variables. > > # Otherwise a system limit (for SysV at least) may be exceeded. 
> > .NOEXPORT: > > diff --git a/libstdc++-v3/src/c++20/format.cc b/libstdc++-v3/src/c++20/format.cc > > new file mode 100644 > > index 00000000000..507bac79e95 > > --- /dev/null > > +++ b/libstdc++-v3/src/c++20/format.cc > > @@ -0,0 +1,174 @@ > > +// Definitions for <chrono> formatting -*- C++ -*- > > + > > +// Copyright The GNU Toolchain Authors. > > +// > > +// This file is part of the GNU ISO C++ Library. This library is free > > +// software; you can redistribute it and/or modify it under the > > +// terms of the GNU General Public License as published by the > > +// Free Software Foundation; either version 3, or (at your option) > > +// any later version. > > + > > +// This library is distributed in the hope that it will be useful, > > +// but WITHOUT ANY WARRANTY; without even the implied warranty of > > +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > +// GNU General Public License for more details. > > + > > +// Under Section 7 of GPL version 3, you are granted additional > > +// permissions described in the GCC Runtime Library Exception, version > > +// 3.1, as published by the Free Software Foundation. > > + > > +// You should have received a copy of the GNU General Public License and > > +// a copy of the GCC Runtime Library Exception along with this program; > > +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > > +// <http://www.gnu.org/licenses/>. 
> > + > > +#define _GLIBCXX_USE_CXX11_ABI 1 > > +#include "../c++26/text_encoding.cc" > > + > > +#if defined _GLIBCXX_USE_NL_LANGINFO_L && defined _GLIBCXX_HAVE_ICONV > > +# include <format> > > +# include <chrono> > > +# include <memory> // make_unique > > +# include <string.h> // strlen, strcpy > > +# include <iconv.h> > > +# include <errno.h> > > +#endif > > + > > +namespace std > > +{ > > +_GLIBCXX_BEGIN_NAMESPACE_VERSION > > +namespace __format > > +{ > > +// Helpers for P2419R2 > > +// (Clarify handling of encodings in localized formatting of chrono types) > > +// Convert a string from the locale's charset to UTF-8. > > + > > +namespace > > +{ > > +// A non-standard locale::facet that caches the locale's std::text_encoding > > +// and an iconv descriptor for converting from that encoding to UTF-8. > > +struct __encoding : locale::facet > > +{ > > + static locale::id id; > > + > > + explicit > > + __encoding(const text_encoding& enc, size_t refs = 0) > > + : facet(refs), _M_enc(enc) > > + { > > +#if defined _GLIBCXX_HAVE_ICONV > > + if (enc != text_encoding::UTF8 && enc != text_encoding::ASCII) > > + _M_cd = ::iconv_open("UTF-8", enc.name()); > > +#endif > > + } > > + > > + ~__encoding() > > + { > > +#if defined _GLIBCXX_HAVE_ICONV > > + if (_M_has_desc()) > > + ::iconv_close(_M_cd); > > +#endif > > + } > > + > > + bool _M_has_desc() const > > + { > > +#if defined _GLIBCXX_HAVE_ICONV > > + return _M_cd != (::iconv_t)-1; > > +#else > > + return false; > > +#endif > > + } > > + > > + text_encoding _M_enc; > > +#if defined _GLIBCXX_HAVE_ICONV > > + ::iconv_t _M_cd = (::iconv_t)-1; > > +#endif > > +}; > > + > > +locale::id __encoding::id; > > +} // namespace > > + > > +std::locale > > +__with_encoding_conversion(const std::locale& loc) > > +{ > > +#if defined _GLIBCXX_USE_NL_LANGINFO_L && __CHAR_BIT__ == 8 > > + if (std::__try_use_facet<__encoding>(loc)) > > + return loc; > > + > > + string name = loc.name(); > > + if (name == "C" || name == "*") > > + return loc; 
> > + > > + text_encoding locenc = __locale_encoding(name.c_str()); > > + > > + if (locenc == text_encoding::UTF8 || locenc == text_encoding::ASCII > > + || locenc == text_encoding::unknown) > > + return loc; > > + > > + auto impl = std::make_unique<locale::_Impl>(*loc._M_impl, 1); > >

While looking into implementing the LRU cache mentioned above, I realised that this impl variable is unused. That's a leftover from an earlier attempt to solve this. I'll remove it.

We could make it much more efficient by caching the transcoded strings, not just the iconv descriptor. While these routines are only used for chrono formatting, there is a fixed number of possible inputs. The days of the week and the months of the year, each in full and abbreviated form, are only 38 strings. There are a few more like the ampm strings, but not many.

But the more caching we do, the more synchronization we need to avoid data races. In fact, I should probably add a mutex around the iconv descriptor in case a std::locale with the new facet is shared across threads and then used for formatting concurrently.

We can also just not optimize this, and tell people to stop using non-UTF-8 locales if they want good performance.
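To make the trade-off concrete, here is a minimal sketch of what caching the transcoded strings behind a mutex could look like. It is not part of the patch: `transcode_cache` and `latin1_to_utf8` are hypothetical names, and the hand-written Latin-1 converter is only a stand-in for the iconv descriptor held by the facet, so that the example needs nothing beyond the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the iconv-based conversion in the patch:
// converts ISO-8859-1 (Latin-1) bytes to UTF-8.
inline std::string latin1_to_utf8(std::string_view in)
{
  std::string out;
  out.reserve(in.size() * 2);
  for (unsigned char c : in)
    {
      if (c < 0x80)
        out.push_back(static_cast<char>(c)); // ASCII is unchanged
      else
        {
          // U+0080..U+00FF encode as two UTF-8 bytes.
          out.push_back(static_cast<char>(0xC0 | (c >> 6)));
          out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
  return out;
}

// Hypothetical cache of transcoded strings. A single mutex guards both
// the map and the converter, so one instance can be shared by threads.
class transcode_cache
{
public:
  using converter = std::function<std::string(std::string_view)>;

  explicit transcode_cache(converter conv) : conv_(std::move(conv))
  { assert(conv_ != nullptr); }

  // Returns the UTF-8 form of `s`, converting at most once per distinct
  // input. A copy is returned so no reference escapes the lock.
  std::string get(std::string_view s)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    std::string key(s);
    auto it = cache_.find(key);
    if (it == cache_.end())
      {
        ++conversions_;
        it = cache_.emplace(std::move(key), conv_(s)).first;
      }
    return it->second;
  }

  // Number of times the converter has actually been invoked.
  std::size_t conversions() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return conversions_;
  }

private:
  mutable std::mutex mutex_;
  converter conv_;
  std::unordered_map<std::string, std::string> cache_;
  std::size_t conversions_ = 0;
};
```

With a fixed input set of roughly 38 day and month names plus a few extras, such a map would stay tiny, and the lock would be held only for a lookup after the first conversion of each string. Whether that is worth a lock on every localized write is exactly the open question above.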
diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 index e04aae25360..e4ed583b3ae 100644 --- a/libstdc++-v3/acinclude.m4 +++ b/libstdc++-v3/acinclude.m4 @@ -4230,7 +4230,7 @@ changequote([,])dnl fi # For libtool versioning info, format is CURRENT:REVISION:AGE -libtool_VERSION=6:33:0 +libtool_VERSION=6:34:0 # Everything parsed; figure out what files and settings to use. case $enable_symvers in diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver index 31449b5b87b..ae79b371d80 100644 --- a/libstdc++-v3/config/abi/pre/gnu.ver +++ b/libstdc++-v3/config/abi/pre/gnu.ver @@ -109,7 +109,11 @@ GLIBCXX_3.4 { std::[j-k]*; # std::length_error::l*; # std::length_error::~l*; - std::locale::[A-Za-e]*; + # std::locale::[A-Za-d]*; + std::locale::all; + std::locale::classic*; + std::locale::collate; + std::locale::ctype; std::locale::facet::[A-Za-z]*; std::locale::facet::_S_get_c_locale*; std::locale::facet::_S_clone_c_locale*; @@ -168,7 +172,7 @@ GLIBCXX_3.4 { std::strstream*; std::strstreambuf*; # std::t[a-q]*; - std::t[a-g]*; + std::terminate*; std::th[a-h]*; std::th[j-q]*; std::th[s-z]*; @@ -2528,6 +2532,16 @@ GLIBCXX_3.4.33 { _ZNKSt12__basic_fileIcE13native_handleEv; } GLIBCXX_3.4.32; +# GCC 15.1.0 +GLIBCXX_3.4.34 { + # std::__format::__with_encoding_conversion + _ZNSt8__format26__with_encoding_conversionERKSt6locale; + # std::__format::__locale_encoding_to_utf8 + _ZNSt8__format25__locale_encoding_to_utf8ERKSt6localeSt17basic_string_viewIcSt11char_traitsIcEEPv; + # __sso_string constructor and destructor + _ZNSt12__sso_string[CD][12]Ev; +} GLIBCXX_3.4.33; + # Symbols in the support library (libsupc++) have their own tag. CXXABI_1.3 { diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure index 5645e991af7..fe525308ae2 100755 --- a/libstdc++-v3/configure +++ b/libstdc++-v3/configure @@ -51040,7 +51040,7 @@ $as_echo "$as_me: WARNING: === Symbol versioning will be disabled." 
>&2;} fi # For libtool versioning info, format is CURRENT:REVISION:AGE -libtool_VERSION=6:33:0 +libtool_VERSION=6:34:0 # Everything parsed; figure out what files and settings to use. case $enable_symvers in diff --git a/libstdc++-v3/include/bits/chrono_io.h b/libstdc++-v3/include/bits/chrono_io.h index 72c66a0fef0..2f3ba89de61 100644 --- a/libstdc++-v3/include/bits/chrono_io.h +++ b/libstdc++-v3/include/bits/chrono_io.h @@ -38,8 +38,10 @@ #include <iomanip> // setw, setfill #include <format> #include <charconv> // from_chars +#include <stdexcept> // __sso_string #include <bits/streambuf_iterator.h> +#include <bits/unique_ptr.h> namespace std _GLIBCXX_VISIBILITY(default) { @@ -211,6 +213,20 @@ namespace __format struct _ChronoSpec : _Spec<_CharT> { basic_string_view<_CharT> _M_chrono_specs; + + // Use one of the reserved bits in __format::_Spec<C>. + // This indicates that a locale-dependent conversion specifier such as + // %a is used in the chrono-specs. This is not the same as the + // _Spec<C>::_M_localized member which indicates that "L" was present + // in the format-spec, e.g. "{:L%a}" is localized and locale-specific, + // but "{:L}" is only localized and "{:%a}" is only locale-specific. + constexpr bool + _M_locale_specific() const noexcept + { return this->_M_reserved; } + + constexpr void + _M_locale_specific(bool __b) noexcept + { this->_M_reserved = __b; } }; // Represents the information provided by a chrono type. 
@@ -305,11 +321,12 @@ namespace __format const auto __chrono_specs = __first++; // Skip leading '%' if (*__chrono_specs != '%') __throw_format_error("chrono format error: no '%' at start of " - "chrono-specs"); + "chrono-specs"); _CharT __mod{}; bool __conv = true; int __needed = 0; + bool __locale_specific = false; while (__first != __last) { @@ -322,15 +339,18 @@ namespace __format case 'a': case 'A': __needed = _Weekday; + __locale_specific = true; break; case 'b': case 'h': case 'B': __needed = _Month; + __locale_specific = true; break; case 'c': __needed = _DateTime; __allowed_mods = _Mod_E; + __locale_specific = true; break; case 'C': __needed = _Year; @@ -368,6 +388,8 @@ namespace __format break; case 'p': case 'r': + __locale_specific = true; + [[fallthrough]]; case 'R': case 'T': __needed = _TimeOfDay; @@ -393,10 +415,12 @@ namespace __format break; case 'x': __needed = _Date; + __locale_specific = true; __allowed_mods = _Mod_E; break; case 'X': __needed = _TimeOfDay; + __locale_specific = true; __allowed_mods = _Mod_E; break; case 'y': @@ -436,6 +460,8 @@ namespace __format || (__mod == 'O' && !(__allowed_mods & _Mod_O))) __throw_format_error("chrono format error: invalid " " modifier in chrono-specs"); + if (__mod && __c != 'z') + __locale_specific = true; __mod = _CharT(); if ((__parts & __needed) != __needed) @@ -467,6 +493,7 @@ namespace __format _M_spec = __spec; _M_spec._M_chrono_specs = __string_view(__chrono_specs, __first - __chrono_specs); + _M_spec._M_locale_specific(__locale_specific); return __first; } @@ -486,6 +513,24 @@ namespace __format if (__first == __last) return _M_format_to_ostream(__t, __fc, __is_neg); +#if __glibcxx_format >= 202207L // C++ >= 23 + // _GLIBCXX_RESOLVE_LIB_DEFECTS + // 3565. 
Handling of encodings in localized formatting + // of chrono types is underspecified + if constexpr (is_same_v<_CharT, char>) + if constexpr (__unicode::__literal_encoding_is_utf8()) + if (_M_spec._M_localized && _M_spec._M_locale_specific()) + { + extern locale __with_encoding_conversion(const locale&); + + // Allocate and cache the necessary state to convert strings + // in the locale's encoding to UTF-8. + locale __loc = __fc.locale(); + if (__loc != locale::classic()) + __fc._M_loc = __with_encoding_conversion(__loc); + } +#endif + _Sink_iter<_CharT> __out; __format::_Str_sink<_CharT> __sink; bool __write_direct = false; @@ -742,6 +787,30 @@ namespace __format static constexpr _CharT _S_space = _S_chars[14]; static constexpr const _CharT* _S_empty_spec = _S_chars + 15; + template<typename _OutIter> + _OutIter + _M_write(_OutIter __out, const locale& __loc, __string_view __s) const + { +#if __glibcxx_format >= 202207L // C++ >= 20 + __sso_string __buf; + // _GLIBCXX_RESOLVE_LIB_DEFECTS + // 3565. 
Handling of encodings in localized formatting + // of chrono types is underspecified + if constexpr (is_same_v<_CharT, char>) + if constexpr (__unicode::__literal_encoding_is_utf8()) + if (_M_spec._M_localized && _M_spec._M_locale_specific() + && __loc != locale::classic()) + { + extern string_view + __locale_encoding_to_utf8(const std::locale&, string_view, + void*); + + __s = __locale_encoding_to_utf8(__loc, __s, &__buf); + } +#endif + return __format::__write(std::move(__out), __s); + } + template<typename _Tp, typename _FormatContext> typename _FormatContext::iterator _M_a_A(const _Tp& __t, typename _FormatContext::iterator __out, @@ -761,7 +830,7 @@ namespace __format else __tp._M_days_abbreviated(__days); __string_view __str(__days[__wd.c_encoding()]); - return __format::__write(std::move(__out), __str); + return _M_write(std::move(__out), __loc, __str); } template<typename _Tp, typename _FormatContext> @@ -782,7 +851,7 @@ namespace __format else __tp._M_months_abbreviated(__months); __string_view __str(__months[(unsigned)__m - 1]); - return __format::__write(std::move(__out), __str); + return _M_write(std::move(__out), __loc, __str); } template<typename _Tp, typename _FormatContext> @@ -1059,8 +1128,8 @@ namespace __format const auto& __tp = use_facet<__timepunct<_CharT>>(__loc); const _CharT* __ampm[2]; __tp._M_am_pm(__ampm); - return std::format_to(std::move(__out), _S_empty_spec, - __ampm[__hms.hours().count() >= 12]); + return _M_write(std::move(__out), __loc, + __ampm[__hms.hours().count() >= 12]); } template<typename _Tp, typename _FormatContext> @@ -1095,8 +1164,9 @@ namespace __format basic_string<_CharT> __fmt(_S_empty_spec); __fmt.insert(1u, 1u, _S_colon); __fmt.insert(2u, __ampm_fmt); - return std::vformat_to(std::move(__out), __fmt, - std::make_format_args<_FormatContext>(__t)); + using _FmtStr = _Runtime_format_string<_CharT>; + return _M_write(std::move(__out), __loc, + std::format(__loc, _FmtStr(__fmt), __t)); } template<typename _Tp, typename 
_FormatContext> @@ -1279,8 +1349,9 @@ namespace __format basic_string<_CharT> __fmt(_S_empty_spec); __fmt.insert(1u, 1u, _S_colon); __fmt.insert(2u, __rep); - return std::vformat_to(std::move(__out), __fmt, - std::make_format_args<_FormatContext>(__t)); + using _FmtStr = _Runtime_format_string<_CharT>; + return _M_write(std::move(__out), __loc, + std::format(__loc, _FmtStr(__fmt), __t)); } template<typename _Tp, typename _FormatContext> @@ -1302,8 +1373,9 @@ namespace __format basic_string<_CharT> __fmt(_S_empty_spec); __fmt.insert(1u, 1u, _S_colon); __fmt.insert(2u, __rep); - return std::vformat_to(std::move(__out), __fmt, - std::make_format_args<_FormatContext>(__t)); + using _FmtStr = _Runtime_format_string<_CharT>; + return _M_write(std::move(__out), __loc, + std::format(__loc, _FmtStr(__fmt), __t)); } template<typename _Tp, typename _FormatContext> @@ -1580,7 +1652,7 @@ namespace __format const auto& __tp = use_facet<time_put<_CharT>>(__loc); __tp.put(__os, __os, _S_space, &__tm, __fmt, __mod); if (__os) - __out = __format::__write(std::move(__out), __os.view()); + __out = _M_write(std::move(__out), __loc, __os.view()); return __out; } }; diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def index 42cdef2f526..74947301760 100644 --- a/libstdc++-v3/include/bits/version.def +++ b/libstdc++-v3/include/bits/version.def @@ -1161,16 +1161,22 @@ ftms = { }; ftms = { + name = format; + // 202304 P2510R3 Formatting pointers + // 202305 P2757R3 Type checking format args + // 202306 P2637R3 Member visit + // 202311 P2918R2 Runtime format strings II + // values = { + // v = 202304; + // cxxmin = 26; + // hosted = yes; + // }; // 201907 Text Formatting, Integration of chrono, printf corner cases. // 202106 std::format improvements. // 202110 Fixing locale handling in chrono formatters, generator-like types. // 202207 Encodings in localized formatting of chrono, basic-format-string. 
-  // 202207 P2286R8 Formatting Ranges
-  // 202207 P2585R1 Improving default container formatting
-  // TODO: #define __cpp_lib_format_ranges 202207L
-  name = format;
   values = {
-    v = 202110;
+    v = 202207;
     cxxmin = 20;
     hosted = yes;
   };
@@ -1374,6 +1380,19 @@ ftms = {
   };
 };

+// ftms = {
+  // name = format_ranges;
+  // 202207 P2286R8 Formatting Ranges
+  // 202207 P2585R1 Improving default container formatting
+  // LWG3750 Too many papers bump __cpp_lib_format
+  // TODO: #define __cpp_lib_format_ranges 202207L
+  // values = {
+    // v = 202207;
+    // cxxmin = 23;
+    // hosted = yes;
+  // };
+// };
+
 ftms = {
   name = freestanding_algorithm;
   values = {
diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h
index 1eaf3733bc2..9f8673395da 100644
--- a/libstdc++-v3/include/bits/version.h
+++ b/libstdc++-v3/include/bits/version.h
@@ -1305,9 +1305,9 @@

 #if !defined(__cpp_lib_format)
 # if (__cplusplus >= 202002L) && _GLIBCXX_HOSTED
-#  define __glibcxx_format 202110L
+#  define __glibcxx_format 202207L
 #  if defined(__glibcxx_want_all) || defined(__glibcxx_want_format)
-#   define __cpp_lib_format 202110L
+#   define __cpp_lib_format 202207L
 #  endif
 # endif
 #endif /* !defined(__cpp_lib_format) && defined(__glibcxx_want_format) */
diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index 16cee0d3c74..a4921ce391b 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -2342,10 +2342,10 @@ namespace __format

   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // P2510R3 Formatting pointers
-#if __cplusplus > 202302L || ! defined __STRICT_ANSI__
-#define _GLIBCXX_P2518R3 1
+#if __glibcxx_format >= 202304L || !
defined __STRICT_ANSI__
+# define _GLIBCXX_P2518R3 1
 #else
-#define _GLIBCXX_P2518R3 0
+# define _GLIBCXX_P2518R3 0
 #endif

 #if _GLIBCXX_P2518R3
@@ -3821,6 +3821,9 @@ namespace __format
     __do_vformat_to(_Out, basic_string_view<_CharT>,
                     const basic_format_args<_Context>&,
                     const locale* = nullptr);
+
+  template<typename _CharT> struct __formatter_chrono;
+
 } // namespace __format
 /// @endcond

@@ -3831,6 +3834,11 @@ namespace __format
    * this class template explicitly. For typical uses of `std::format` the
    * library will use the specializations `std::format_context` (for `char`)
    * and `std::wformat_context` (for `wchar_t`).
+   *
+   * You are not allowed to define partial or explicit specializations of
+   * this class template.
+   *
+   * @since C++20
    */
   template<typename _Out, typename _CharT>
     class basic_format_context
@@ -3863,6 +3871,8 @@ namespace __format
                                   const basic_format_args<_Context2>&,
                                   const locale*);

+      friend __format::__formatter_chrono<_CharT>;
+
     public:
       ~basic_format_context() = default;

diff --git a/libstdc++-v3/src/c++20/Makefile.am b/libstdc++-v3/src/c++20/Makefile.am
index a24505e5141..d0f7859290c 100644
--- a/libstdc++-v3/src/c++20/Makefile.am
+++ b/libstdc++-v3/src/c++20/Makefile.am
@@ -36,7 +36,7 @@ else
 inst_sources =
 endif

-sources = tzdb.cc
+sources = tzdb.cc format.cc

 vpath % $(top_srcdir)/src/c++20

@@ -53,6 +53,12 @@ tzdb.o: tzdb.cc tzdata.zi.h
 	$(CXXCOMPILE) -I. -c $<
 endif

+# This needs access to std::text_encoding and to the internals of std::locale.
+format.lo: format.cc
+	$(LTCXXCOMPILE) -std=gnu++26 -fno-access-control -c $<
+format.o: format.cc
+	$(CXXCOMPILE) -std=gnu++26 -fno-access-control -c $<
+
 if GLIBCXX_HOSTED
 libc__20convenience_la_SOURCES = $(sources) $(inst_sources)
 else
diff --git a/libstdc++-v3/src/c++20/Makefile.in b/libstdc++-v3/src/c++20/Makefile.in
index 3ec8c5ce804..d759b8dcc7c 100644
--- a/libstdc++-v3/src/c++20/Makefile.in
+++ b/libstdc++-v3/src/c++20/Makefile.in
@@ -121,7 +121,7 @@ CONFIG_CLEAN_FILES =
 CONFIG_CLEAN_VPATH_FILES =
 LTLIBRARIES = $(noinst_LTLIBRARIES)
 libc__20convenience_la_LIBADD =
-am__objects_1 = tzdb.lo
+am__objects_1 = tzdb.lo format.lo
 @ENABLE_EXTERN_TEMPLATE_TRUE@am__objects_2 = sstream-inst.lo
 @GLIBCXX_HOSTED_TRUE@am_libc__20convenience_la_OBJECTS =  \
 @GLIBCXX_HOSTED_TRUE@	$(am__objects_1) $(am__objects_2)
@@ -432,7 +432,7 @@ headers =
 @ENABLE_EXTERN_TEMPLATE_TRUE@inst_sources = \
 @ENABLE_EXTERN_TEMPLATE_TRUE@	sstream-inst.cc

-sources = tzdb.cc
+sources = tzdb.cc format.cc
 @GLIBCXX_HOSTED_FALSE@libc__20convenience_la_SOURCES =
 @GLIBCXX_HOSTED_TRUE@libc__20convenience_la_SOURCES = $(sources) $(inst_sources)

@@ -755,6 +755,12 @@ vpath % $(top_srcdir)/src/c++20
 @USE_STATIC_TZDATA_TRUE@tzdb.o: tzdb.cc tzdata.zi.h
 @USE_STATIC_TZDATA_TRUE@	$(CXXCOMPILE) -I. -c $<

+# This needs access to std::text_encoding and to the internals of std::locale.
+format.lo: format.cc
+	$(LTCXXCOMPILE) -std=gnu++26 -fno-access-control -c $<
+format.o: format.cc
+	$(CXXCOMPILE) -std=gnu++26 -fno-access-control -c $<
+
 # Tell versions [3.59,3.63) of GNU make to not export all variables.
 # Otherwise a system limit (for SysV at least) may be exceeded.
 .NOEXPORT:
diff --git a/libstdc++-v3/src/c++20/format.cc b/libstdc++-v3/src/c++20/format.cc
new file mode 100644
index 00000000000..507bac79e95
--- /dev/null
+++ b/libstdc++-v3/src/c++20/format.cc
@@ -0,0 +1,174 @@
+// Definitions for <chrono> formatting -*- C++ -*-
+
+// Copyright The GNU Toolchain Authors.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+#define _GLIBCXX_USE_CXX11_ABI 1
+#include "../c++26/text_encoding.cc"
+
+#if defined _GLIBCXX_USE_NL_LANGINFO_L && defined _GLIBCXX_HAVE_ICONV
+# include <format>
+# include <chrono>
+# include <memory>   // make_unique
+# include <string.h> // strlen, strcpy
+# include <iconv.h>
+# include <errno.h>
+#endif
+
+namespace std
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+namespace __format
+{
+// Helpers for P2419R2
+// (Clarify handling of encodings in localized formatting of chrono types)
+// Convert a string from the locale's charset to UTF-8.
+
+namespace
+{
+// A non-standard locale::facet that caches the locale's std::text_encoding
+// and an iconv descriptor for converting from that encoding to UTF-8.
+struct __encoding : locale::facet
+{
+  static locale::id id;
+
+  explicit
+  __encoding(const text_encoding& enc, size_t refs = 0)
+  : facet(refs), _M_enc(enc)
+  {
+#if defined _GLIBCXX_HAVE_ICONV
+    if (enc != text_encoding::UTF8 && enc != text_encoding::ASCII)
+      _M_cd = ::iconv_open("UTF-8", enc.name());
+#endif
+  }
+
+  ~__encoding()
+  {
+#if defined _GLIBCXX_HAVE_ICONV
+    if (_M_has_desc())
+      ::iconv_close(_M_cd);
+#endif
+  }
+
+  bool _M_has_desc() const
+  {
+#if defined _GLIBCXX_HAVE_ICONV
+    return _M_cd != (::iconv_t)-1;
+#else
+    return false;
+#endif
+  }
+
+  text_encoding _M_enc;
+#if defined _GLIBCXX_HAVE_ICONV
+  ::iconv_t _M_cd = (::iconv_t)-1;
+#endif
+};
+
+locale::id __encoding::id;
+} // namespace
+
+std::locale
+__with_encoding_conversion(const std::locale& loc)
+{
+#if defined _GLIBCXX_USE_NL_LANGINFO_L && __CHAR_BIT__ == 8
+  if (std::__try_use_facet<__encoding>(loc))
+    return loc;
+
+  string name = loc.name();
+  if (name == "C" || name == "*")
+    return loc;
+
+  text_encoding locenc = __locale_encoding(name.c_str());
+
+  if (locenc == text_encoding::UTF8 || locenc == text_encoding::ASCII
+      || locenc == text_encoding::unknown)
+    return loc;
+
+  auto impl = std::make_unique<locale::_Impl>(*loc._M_impl, 1);
+  auto facetp = std::make_unique<__encoding>(locenc);
+  locale loc2(loc, facetp.get()); // FIXME: PR libstdc++/113704
+  facetp.release();
+  // FIXME: Ideally we wouldn't need to reallocate this string again,
+  // just don't delete[] it in the locale(locale, Facet*) constructor.
+  if (const char* name = loc._M_impl->_M_names[0])
+    {
+      loc2._M_impl->_M_names[0] = new char[strlen(name) + 1];
+      strcpy(loc2._M_impl->_M_names[0], name);
+    }
+  return loc2;
+#else
+  return loc;
+#endif
+}
+
+string_view
+__locale_encoding_to_utf8(const std::locale& loc, string_view str,
+                          void* poutbuf)
+{
+#if defined _GLIBCXX_USE_NL_LANGINFO_L && __CHAR_BIT__ == 8 \
+  && _GLIBCXX_HAVE_ICONV
+  string& outbuf = *static_cast<string*>(poutbuf);
+  // Don't need to use __try_use_facet with its dynamic_cast<__encoding*>,
+  // since we know there are no types derived from __encoding. If the array
+  // element is non-null, we have the facet.
+  auto id = __encoding::id._M_id();
+  auto enc_facet = static_cast<const __encoding*>(loc._M_impl->_M_facets[id]);
+  if (!enc_facet || !enc_facet->_M_has_desc())
+    return str;
+
+  size_t inbytesleft = str.size();
+  size_t written = 0;
+  bool done = false;
+
+  auto overwrite = [&](char* p, size_t n) {
+    auto inbytes = const_cast<char*>(str.data()) + str.size() - inbytesleft;
+    char* outbytes = p + written;
+    size_t outbytesleft = n - written;
+    size_t res = ::iconv(enc_facet->_M_cd, &inbytes, &inbytesleft,
+                         &outbytes, &outbytesleft);
+    if (res == (size_t)-1)
+      {
+        if (errno != E2BIG)
+          {
+            done = true;
+            return 0zu;
+          }
+      }
+    else
+      done = true;
+    written = outbytes - p;
+    return written;
+  };
+  do
+    outbuf.resize_and_overwrite(outbuf.capacity() + (inbytesleft * 3 / 2),
+                                overwrite);
+  while (!done);
+  if (outbuf.size())
+    str = outbuf;
+#endif // USE_NL_LANGINFO_L && CHAR_BIT == 8 && HAVE_ICONV
+
+  return str;
+}
+} // namespace __format
+_GLIBCXX_END_NAMESPACE_VERSION
+} // namespace std
diff --git a/libstdc++-v3/testsuite/std/time/format_localized.cc b/libstdc++-v3/testsuite/std/time/format_localized.cc
new file mode 100644
index 00000000000..2e553110f03
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/time/format_localized.cc
@@ -0,0 +1,47 @@
+// { dg-do run { target c++20 } }
+// { dg-require-namedlocale "ru_UA.koi8u" }
+// {
dg-require-namedlocale "es_ES.ISO8859-1" }
+// { dg-require-namedlocale "fr_FR.ISO8859-1" }
+// { dg-require-effective-target cxx11_abi }
+
+// P2419R2
+// Clarify handling of encodings in localized formatting of chrono types
+
+// Localized date-time strings such as "février" should be converted to UTF-8
+// if the locale uses a different encoding.
+
+#include <chrono>
+#include <format>
+#include <testsuite_hooks.h>
+
+void
+test_ru()
+{
+  std::locale loc("ru_UA.koi8u");
+  auto s = std::format(loc, "День недели: {:L}", std::chrono::Monday);
+  VERIFY( s == "День недели: Пн" );
+}
+
+void
+test_es()
+{
+  std::locale loc(ISO_8859(1,es_ES));
+  auto s = std::format(loc, "Día de la semana: {:L%A %a}", std::chrono::Wednesday);
+  VERIFY( s == "Día de la semana: miércoles mié" );
+}
+
+void
+test_fr()
+{
+  std::locale loc(ISO_8859(1,fr_FR));
+  auto s = std::format(loc, "Six mois après {0:L%b}, c'est {1:L%B}.",
+                       std::chrono::February, std::chrono::August);
+  VERIFY( s == "Six mois après févr., c'est août." );
+}
+
+int main()
+{
+  test_ru();
+  test_es();
+  test_fr();
+}
diff --git a/libstdc++-v3/testsuite/util/testsuite_abi.cc b/libstdc++-v3/testsuite/util/testsuite_abi.cc
index ec7c3df9ecc..ce9cda660fa 100644
--- a/libstdc++-v3/testsuite/util/testsuite_abi.cc
+++ b/libstdc++-v3/testsuite/util/testsuite_abi.cc
@@ -215,6 +215,7 @@ check_version(symbol& test, bool added)
   known_versions.push_back("GLIBCXX_3.4.31");
   known_versions.push_back("GLIBCXX_3.4.32");
   known_versions.push_back("GLIBCXX_3.4.33");
+  known_versions.push_back("GLIBCXX_3.4.34");
   known_versions.push_back("GLIBCXX_LDBL_3.4.31");
   known_versions.push_back("GLIBCXX_IEEE128_3.4.29");
   known_versions.push_back("GLIBCXX_IEEE128_3.4.30");
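
For anyone reading the new test who wants to see what the byte-level conversion amounts to for the two ISO-8859-1 locales, here is a rough standalone sketch. It is not what the patch does: `__locale_encoding_to_utf8` hands the work to iconv and so covers whatever encoding the locale reports (KOI8-U included). `latin1_to_utf8` is a hypothetical helper, invented for illustration, that only handles ISO-8859-1, where each byte value is also its Unicode code point.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, not part of the patch: transcode ISO-8859-1 to UTF-8.
// A Latin-1 byte has the same value as its Unicode code point, so bytes
// below 0x80 are copied unchanged and every other byte becomes a two-byte
// UTF-8 sequence: 110000xx 10xxxxxx.
std::string latin1_to_utf8(std::string_view in)
{
  std::string out;
  out.reserve(in.size() * 2);   // worst case: every input byte doubles
  for (unsigned char c : in)
    {
      if (c < 0x80)
        out.push_back(static_cast<char>(c));
      else
        {
          out.push_back(static_cast<char>(0xC0 | (c >> 6)));    // lead byte
          out.push_back(static_cast<char>(0x80 | (c & 0x3F)));  // continuation
        }
    }
  return out;
}
```

With this, the Latin-1 bytes `"f\xE9vr."` that an fr_FR.ISO8859-1 locale would supply for `%b` come out as `"f\xC3\xA9vr."`, which is the UTF-8 spelling that `test_fr` compares against.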