diff options
Diffstat (limited to 'lib/libc/locale')
147 files changed, 17449 insertions, 0 deletions
diff --git a/lib/libc/locale/DESIGN.xlocale b/lib/libc/locale/DESIGN.xlocale new file mode 100644 index 0000000000000..5d998d32314d6 --- /dev/null +++ b/lib/libc/locale/DESIGN.xlocale @@ -0,0 +1,159 @@ +$FreeBSD$ + +Design of xlocale +================= + +The xlocale APIs come from Darwin, although a subset is now part of POSIX 2008. +They fall into two broad categories: + +- Manipulation of per-thread locales (POSIX) +- Locale-aware functions taking an explicit locale argument (Darwin) + +This document describes the implementation of these APIs for FreeBSD. + +Goals +----- + +The overall goal of this implementation is to be compatible with the Darwin +version. Additionally, it should include minimal changes to the existing +locale code. A lot of the existing locale code originates with 4BSD or earlier +and has had over a decade of testing. Replacing this code, unless absolutely +necessary, gives us the potential for more bugs without much benefit. + +With this in mind, various libc-private functions have been modified to take a +locale_t parameter. This causes a compiler error if they are accidentally +called without a locale. This approach was taken, rather than adding _l +variants of these functions, to make it harder for accidental uses of the +global-locale versions to slip in. + +Locale Objects +-------------- + +A locale is encapsulated in a `locale_t`, which is an opaque type: a pointer to +a `struct _xlocale`. The name `_xlocale` is unfortunate, as it does not fit +well with existing conventions, but is used because this is the name the Darwin +implementation gives to this structure and so may be used by existing (bad) code. + +This structure should include all of the information corresponding to a locale. +A locale_t is almost immutable after creation. There are no functions that modify it, +and it can therefore be used without locking. It is the responsibility of the +caller to ensure that a locale is not deallocated during a call that uses it. + +Each locale contains a number of components, one for each of the categories +supported by `setlocale()`. These are likewise immutable after creation. This +differs from the Darwin implementation, which includes a deprecated +`setinvalidrune()` function that can modify the rune locale. + +The exception to these mutability rules is a set of `mbstate_t` flags stored +with each locale. These are used by various functions that previously had a +static local `mbstate_t` variable. + +The components are reference counted, and so can be aliased between locale +objects. This makes copying locales very cheap. + +The Global Locale +----------------- + +All locales and locale components are reference counted. The global locale, +however, is special. It, and all of its components, are static and so no +malloc() memory is required when using a single locale. + +This means that threads using the global locale are subject to the same +constraints as with the pre-xlocale libc. Calls to any locale-aware functions +in threads using the global locale, while modifying the global locale, have +undefined behaviour. + +Because of this, we have to ensure that we always copy the components of the +global locale, rather than alias them. + +It would be cleaner to simply remove the special treatment of the global locale +and have a locale_t lazily allocated for the global context. This would cost a +little more `malloc()` memory, so is not done in the initial version. + +Caching +------- + +The existing locale implementation included several ad-hoc caching layers. +None of these were thread safe. Caching is only really of use for supporting +the pattern where the locale is briefly changed to something and then changed +back. + +The current xlocale implementation removes the caching entirely. This pattern +is not one that should be encouraged. If you need to make some calls with a +modified locale, then you should use the _l suffix versions of the calls, +rather than switch the global locale. If you do need to temporarily switch the +locale and then switch it back, `uselocale()` provides a way of doing this very +easily: It returns the old locale, which can then be passed to a subsequent +call to `uselocale()` to restore it, without the need to load any locale data +from the disk. + +If, in the future, it is determined that caching is beneficial, it can be added +quite easily in xlocale.c. Given, however, that any locale-aware call is going +to be a preparation for presenting data to the user, and so is invariably going +to be part of an I/O operation, this seems like a case of premature +optimisation. + +localeconv +---------- + +The `localeconv()` function is an exception to the immutable-after-creation +rule. In the classic implementation, this function returns a pointer to some +global storage, which is initialised with the data from the current locale. +This is not possible in a multithreaded environment, with multiple locales. + +Instead, each locale contains a `struct lconv` that is lazily initialised on +calls to `localeconv()`. This is not protected by any locking, however this is +still safe on any machine where word-sized stores are atomic: two concurrent +calls will write the same data into the structure. + +Explicit Locale Calls +--------------------- + +A large number of functions have been modified to take an explicit `locale_t` +parameter. The old APIs are then reimplemented with a call to `__get_locale()` +to supply the `locale_t` parameter. This is in line with the Darwin public +APIs, but also simplifies the modifications to these functions. The +`__get_locale()` function is now the only way to access the current locale +within libc. All of the old globals have gone, so there is now a linker error +if any functions attempt to use them. + +The ctype.h functions are a little different. These are not implemented in +terms of their locale-aware versions, for performance reasons. Each of these +is implemented as a short inline function. + +Differences to Darwin APIs +-------------------------- + +`strtoq_l()` and `strtouq_l() `are not provided. These are extensions to +deprecated functions - we should not be encouraging people to use deprecated +interfaces. + +Locale Placeholders +------------------- + +The pointer values 0 and -1 have special meanings as `locale_t` values. Any +public function that accepts a `locale_t` parameter must use the `FIX_LOCALE()` +macro on it before using it. For efficiency, this can be emitted in functions +which *only* use their locale parameter as an argument to another public +function, as the callee will do the `FIX_LOCALE()` itself. + +Potential Improvements +---------------------- + +Currently, the current rune set is accessed via a function call. This makes it +fairly expensive to use any of the ctype.h functions. We could improve this +quite a lot by storing the rune locale data in a __thread-qualified variable. + +Several of the existing FreeBSD locale-aware functions appear to be wrong. For +example, most of the `strto*()` family should probably use `digittoint_l()`, +but instead they assume ASCII. These will break if using a character encoding +that does not put numbers and the letters A-F in the same location as ASCII. +Some functions, like `strcoll()` only work on single-byte encodings. No +attempt has been made to fix existing limitations in the libc functions other +than to add support for xlocale. + +Intuitively, setting a thread-local locale should ensure that all locale-aware +functions can be used safely from that thread. In fact, this is not the case +in either this implementation or the Darwin one. You must call `duplocale()` +or `newlocale()` before calling `uselocale()`. This is a bit ugly, and it +would be better if libc ensure that every thread had its own locale object. diff --git a/lib/libc/locale/Makefile.inc b/lib/libc/locale/Makefile.inc new file mode 100644 index 0000000000000..a8daa1ff284e3 --- /dev/null +++ b/lib/libc/locale/Makefile.inc @@ -0,0 +1,90 @@ +# from @(#)Makefile.inc 8.1 (Berkeley) 6/4/93 +# $FreeBSD$ + +# locale sources +.PATH: ${LIBC_SRCTOP}/${LIBC_ARCH}/locale ${LIBC_SRCTOP}/locale + +SRCS+= ascii.c big5.c btowc.c collate.c collcmp.c euc.c fix_grouping.c \ + gb18030.c gb2312.c gbk.c ctype.c isctype.c iswctype.c \ + ldpart.c lmessages.c lmonetary.c lnumeric.c localeconv.c mblen.c \ + mbrlen.c \ + mbrtowc.c mbsinit.c mbsnrtowcs.c \ + mbsrtowcs.c mbtowc.c mbstowcs.c \ + mskanji.c nextwctype.c nl_langinfo.c nomacros.c none.c rpmatch.c \ + rune.c \ + runetype.c setlocale.c setrunelocale.c \ + table.c \ + tolower.c toupper.c utf8.c wcrtomb.c wcsnrtombs.c \ + wcsrtombs.c wcsftime.c \ + wcstof.c wcstod.c \ + wcstoimax.c wcstol.c wcstold.c wcstoll.c \ + wcstombs.c \ + wcstoul.c wcstoull.c wcstoumax.c wctob.c wctomb.c wctrans.c wctype.c \ + wcwidth.c\ + xlocale.c + +.if ${MK_ICONV} != "no" +SRCS+= c16rtomb_iconv.c c32rtomb_iconv.c mbrtoc16_iconv.c mbrtoc32_iconv.c +.else +SRCS+= c16rtomb.c c32rtomb.c mbrtoc16.c mbrtoc32.c +.endif + +SYM_MAPS+=${LIBC_SRCTOP}/locale/Symbol.map + +MAN+= btowc.3 \ + ctype_l.3 \ + ctype.3 digittoint.3 isalnum.3 isalpha.3 isascii.3 isblank.3 iscntrl.3 \ + isdigit.3 isgraph.3 isideogram.3 islower.3 isphonogram.3 isprint.3 \ + ispunct.3 isrune.3 isspace.3 isspecial.3 \ + isupper.3 iswalnum.3 iswalnum_l.3 isxdigit.3 \ + localeconv.3 mblen.3 mbrlen.3 \ + mbrtowc.3 \ + mbsinit.3 \ + mbsrtowcs.3 mbstowcs.3 mbtowc.3 multibyte.3 \ + nextwctype.3 nl_langinfo.3 rpmatch.3 \ + setlocale.3 toascii.3 tolower.3 toupper.3 towlower.3 towupper.3 \ + wcsftime.3 \ + wcrtomb.3 \ + wcsrtombs.3 wcstod.3 wcstol.3 wcstombs.3 wctomb.3 \ + wctrans.3 wctype.3 wcwidth.3 \ + duplocale.3 freelocale.3 newlocale.3 querylocale.3 uselocale.3 xlocale.3 + +MAN+= big5.5 euc.5 gb18030.5 gb2312.5 gbk.5 mskanji.5 utf8.5 + +MLINKS+=btowc.3 wctob.3 +MLINKS+=isdigit.3 isnumber.3 +MLINKS+=isgraph.3 isgraph_l.3 +MLINKS+=islower.3 islower_l.3 +MLINKS+=ispunct.3 ispunct_l.3 +MLINKS+=isspace.3 isspace_l.3 +MLINKS+=nl_langinfo.3 nl_langinfo_l.3 +MLINKS+=iswalnum.3 iswalpha.3 iswalnum.3 iswascii.3 iswalnum.3 iswblank.3 \ + iswalnum.3 iswcntrl.3 iswalnum.3 iswdigit.3 iswalnum.3 iswgraph.3 \ + iswalnum.3 iswhexnumber.3 \ + iswalnum.3 iswideogram.3 iswalnum.3 iswlower.3 iswalnum.3 iswnumber.3 \ + iswalnum.3 iswphonogram.3 iswalnum.3 iswprint.3 iswalnum.3 iswpunct.3 \ + iswalnum.3 iswrune.3 iswalnum.3 iswspace.3 iswalnum.3 iswspecial.3 \ + iswalnum.3 iswupper.3 iswalnum.3 iswxdigit.3 +MLINKS+=iswalnum_l.3 iswalpha_l.3 iswalnum_l.3 iswcntrl_l.3 \ + iswalnum_l.3 iswctype_l.3 iswalnum_l.3 iswdigit_l.3 \ + iswalnum_l.3 iswgraph_l.3 iswalnum_l.3 iswlower_l.3 \ + iswalnum_l.3 iswprint_l.3 iswalnum_l.3 iswpunct_l.3 \ + iswalnum_l.3 iswspace_l.3 iswalnum_l.3 iswupper_l.3 \ + iswalnum_l.3 iswxdigit_l.3 iswalnum_l.3 towlower_l.3 \ + iswalnum_l.3 towupper_l.3 iswalnum_l.3 wctype_l.3 \ + iswalnum_l.3 iswblank_l.3 iswalnum_l.3 iswhexnumber_l.3 \ + iswalnum_l.3 iswideogram_l.3 iswalnum_l.3 iswnumber_l.3 \ + iswalnum_l.3 iswphonogram_l.3 iswalnum_l.3 iswrune_l.3 \ + iswalnum_l.3 iswspecial_l.3 iswalnum_l.3 nextwctype_l.3 \ + iswalnum_l.3 towctrans_l.3 iswalnum_l.3 wctrans_l.3 +MLINKS+=isxdigit.3 ishexnumber.3 +MLINKS+=localeconv.3 localeconv_l.3 +MLINKS+=mbrtowc.3 mbrtoc16.3 mbrtowc.3 mbrtoc32.3 +MLINKS+=mbsrtowcs.3 mbsnrtowcs.3 +MLINKS+=wcrtomb.3 c16rtomb.3 wcrtomb.3 c32rtomb.3 +MLINKS+=wcsrtombs.3 wcsnrtombs.3 +MLINKS+=wcstod.3 wcstof.3 wcstod.3 wcstold.3 +MLINKS+=wcstol.3 wcstoul.3 wcstol.3 wcstoll.3 wcstol.3 wcstoull.3 \ + wcstol.3 wcstoimax.3 wcstol.3 wcstoumax.3 +MLINKS+=wctrans.3 towctrans.3 +MLINKS+=wctype.3 iswctype.3 diff --git a/lib/libc/locale/Symbol.map b/lib/libc/locale/Symbol.map new file mode 100644 index 0000000000000..b2f2a35f2fe48 --- /dev/null +++ b/lib/libc/locale/Symbol.map @@ -0,0 +1,217 @@ +/* + * $FreeBSD$ + */ + +FBSD_1.0 { + btowc; + digittoint; + isalnum; + isalpha; + isascii; + isblank; + iscntrl; + isdigit; + isgraph; + ishexnumber; + isideogram; + islower; + isnumber; + isphonogram; + isprint; + ispunct; + isrune; + isspace; + isspecial; + isupper; + isxdigit; + toascii; + tolower; + toupper; + iswalnum; + iswalpha; + iswascii; + iswblank; + iswcntrl; + iswdigit; + iswgraph; + iswhexnumber; + iswideogram; + iswlower; + iswnumber; + iswphonogram; + iswprint; + iswpunct; + iswrune; + iswspace; + iswspecial; + iswupper; + iswxdigit; + towlower; + towupper; + localeconv; + mblen; + mbrlen; + mbrtowc; + mbsinit; + mbsnrtowcs; + mbsrtowcs; + mbstowcs; + mbtowc; + nextwctype; + nl_langinfo; + __maskrune; + __sbmaskrune; + __istype; + __sbistype; + __isctype; + __toupper; + __sbtoupper; + __tolower; + __sbtolower; + __wcwidth; + __mb_cur_max; + __mb_sb_limit; + rpmatch; + ___runetype; + setlocale; + _DefaultRuneLocale; + _CurrentRuneLocale; + ___tolower; + ___toupper; + wcrtomb; + wcsftime; + wcsnrtombs; + wcsrtombs; + wcstod; + wcstof; + wcstoimax; + wcstol; + wcstold; + wcstoll; + wcstombs; + wcstoul; + wcstoull; + wcstoumax; + wctob; + wctomb; + towctrans; + wctrans; + iswctype; + wctype; + wcwidth; +}; + +FBSD_1.3 { + newlocale; + duplocale; + freelocale; + querylocale; + uselocale; + __getCurrentRuneLocale; + btowc_l; + localeconv_l; + mblen_l; + mbrlen_l; + mbrtowc_l; + mbsinit_l; + mbsnrtowcs_l; + mbsrtowcs_l; + mbstowcs_l; + mbtowc_l; + nl_langinfo_l; + strcoll_l; + strfmon_l; + strftime_l; + strptime_l; + strxfrm_l; + wcrtomb_l; + wcscoll_l; + wcsnrtombs_l; + wcsrtombs_l; + wcstombs_l; + wcsxfrm_l; + wctob_l; + wctomb_l; + ___tolower_l; + ___toupper_l; + ___runetype_l; + digittoint_l; + isalnum_l; + isalpha_l; + isblank_l; + iscntrl_l; + isdigit_l; + isgraph_l; + ishexnumber_l; + isideogram_l; + islower_l; + isnumber_l; + isphonogram_l; + isprint_l; + ispunct_l; + isrune_l; + isspace_l; + isspecial_l; + isupper_l; + isxdigit_l; + tolower_l; + toupper_l; + iswalnum_l; + iswalpha_l; + iswblank_l; + iswcntrl_l; + iswdigit_l; + iswgraph_l; + iswhexnumber_l; + iswideogram_l; + iswlower_l; + iswnumber_l; + iswphonogram_l; + iswprint_l; + iswpunct_l; + iswrune_l; + iswspace_l; + iswspecial_l; + iswupper_l; + iswxdigit_l; + towlower_l; + towupper_l; + iswctype_l; + wctype_l; + nextwctype_l; + ___mb_cur_max; + ___mb_cur_max_l; + towctrans_l; + wctrans_l; + wcsftime_l; + wcstod_l; + wcstof_l; + wcstoimax_l; + wcstol_l; + wcstold_l; + wcstoll_l; + wcstoul_l; + wcstoull_l; + wcstoumax_l; + __sbistype_l; + __maskrune_l; + __sbmaskrune_l; + __istype_l; + __runes_for_locale; + _ThreadRuneLocale; + c16rtomb; + c16rtomb_l; + c32rtomb; + c32rtomb_l; + mbrtoc16; + mbrtoc16_l; + mbrtoc32; + mbrtoc32_l; +}; + +FBSDprivate_1.0 { + _PathLocale; + __detect_path_locale; + __collate_load_error; + __collate_range_cmp; +}; diff --git a/lib/libc/locale/ascii.c b/lib/libc/locale/ascii.c new file mode 100644 index 0000000000000..99bf94108c89a --- /dev/null +++ b/lib/libc/locale/ascii.c @@ -0,0 +1,196 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <limits.h> +#include <runetype.h> +#include <stddef.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include "mblocal.h" + +static size_t _ascii_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); +static int _ascii_mbsinit(const mbstate_t *); +static size_t _ascii_mbsnrtowcs(wchar_t * __restrict dst, + const char ** __restrict src, size_t nms, size_t len, + mbstate_t * __restrict ps __unused); +static size_t _ascii_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); +static size_t _ascii_wcsnrtombs(char * __restrict, const wchar_t ** __restrict, + size_t, size_t, mbstate_t * __restrict); + +int +_ascii_init(struct xlocale_ctype *l,_RuneLocale *rl) +{ + + l->__mbrtowc = _ascii_mbrtowc; + l->__mbsinit = _ascii_mbsinit; + l->__mbsnrtowcs = _ascii_mbsnrtowcs; + l->__wcrtomb = _ascii_wcrtomb; + l->__wcsnrtombs = _ascii_wcsnrtombs; + l->runes = rl; + l->__mb_cur_max = 1; + l->__mb_sb_limit = 128; + return(0); +} + +static int +_ascii_mbsinit(const mbstate_t *ps __unused) +{ + + /* + * Encoding is not state dependent - we are always in the + * initial state. + */ + return (1); +} + +static size_t +_ascii_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, size_t n, + mbstate_t * __restrict ps __unused) +{ + + if (s == NULL) + /* Reset to initial shift state (no-op) */ + return (0); + if (n == 0) + /* Incomplete multibyte sequence */ + return ((size_t)-2); + if (*s & 0x80) { + errno = EILSEQ; + return ((size_t)-1); + } + if (pwc != NULL) + *pwc = (unsigned char)*s; + return (*s == '\0' ? 0 : 1); +} + +static size_t +_ascii_wcrtomb(char * __restrict s, wchar_t wc, + mbstate_t * __restrict ps __unused) +{ + + if (s == NULL) + /* Reset to initial shift state (no-op) */ + return (1); + if (wc < 0 || wc > 127) { + errno = EILSEQ; + return ((size_t)-1); + } + *s = (unsigned char)wc; + return (1); +} + +static size_t +_ascii_mbsnrtowcs(wchar_t * __restrict dst, const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps __unused) +{ + const char *s; + size_t nchr; + + if (dst == NULL) { + for (s = *src; nms > 0 && *s != '\0'; s++, nms--) { + if (*s & 0x80) { + errno = EILSEQ; + return ((size_t)-1); + } + } + return (s - *src); + } + + s = *src; + nchr = 0; + while (len-- > 0 && nms-- > 0) { + if (*s & 0x80) { + *src = s; + errno = EILSEQ; + return ((size_t)-1); + } + if ((*dst++ = (unsigned char)*s++) == L'\0') { + *src = NULL; + return (nchr); + } + nchr++; + } + *src = s; + return (nchr); +} + +static size_t +_ascii_wcsnrtombs(char * __restrict dst, const wchar_t ** __restrict src, + size_t nwc, size_t len, mbstate_t * __restrict ps __unused) +{ + const wchar_t *s; + size_t nchr; + + if (dst == NULL) { + for (s = *src; nwc > 0 && *s != L'\0'; s++, nwc--) { + if (*s < 0 || *s > 127) { + errno = EILSEQ; + return ((size_t)-1); + } + } + return (s - *src); + } + + s = *src; + nchr = 0; + while (len-- > 0 && nwc-- > 0) { + if (*s < 0 || *s > 127) { + *src = s; + errno = EILSEQ; + return ((size_t)-1); + } + if ((*dst++ = *s++) == '\0') { + *src = NULL; + return (nchr); + } + nchr++; + } + *src = s; + return (nchr); +} + diff --git a/lib/libc/locale/big5.5 b/lib/libc/locale/big5.5 new file mode 100644 index 0000000000000..13d0c7b64e535 --- /dev/null +++ b/lib/libc/locale/big5.5 @@ -0,0 +1,51 @@ +.\" Copyright (c) 2002, 2003 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd August 7, 2003 +.Dt BIG5 5 +.Os +.Sh NAME +.Nm big5 +.Nd +.Dq "Big Five" +encoding for Traditional Chinese text +.Sh SYNOPSIS +.Nm ENCODING +.Qq BIG5 +.Sh DESCRIPTION +.Dq Big Five +is the de facto standard for encoding Traditional Chinese text. +Each character is represented by either one or two bytes. +Characters from the +.Tn ASCII +character set are represented as single bytes in the range 0x00 - 0x7F. +Traditional Chinese characters are represented by two bytes: +the first in the range 0xA1 - 0xFE, the second in the range +0x40 - 0xFE. +.Sh SEE ALSO +.Xr euc 5 , +.Xr gb18030 5 , +.Xr utf8 5 diff --git a/lib/libc/locale/big5.c b/lib/libc/locale/big5.c new file mode 100644 index 0000000000000..c1f94d39c7dac --- /dev/null +++ b/lib/libc/locale/big5.c @@ -0,0 +1,200 @@ +/*- + * Copyright 2013 Garrett D'Amore <garrett@damore.org> + * Copyright 2010 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "@(#)big5.c 8.1 (Berkeley) 6/4/93"; +#endif /* LIBC_SCCS and not lint */ +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/types.h> +#include <errno.h> +#include <runetype.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include "mblocal.h" + +extern int __mb_sb_limit; + +static size_t _BIG5_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); +static int _BIG5_mbsinit(const mbstate_t *); +static size_t _BIG5_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); +static size_t _BIG5_mbsnrtowcs(wchar_t * __restrict, + const char ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _BIG5_wcsnrtombs(char * __restrict, + const wchar_t ** __restrict, size_t, size_t, + mbstate_t * __restrict); + +typedef struct { + wchar_t ch; +} _BIG5State; + +int +_BIG5_init(struct xlocale_ctype *l, _RuneLocale *rl) +{ + + l->__mbrtowc = _BIG5_mbrtowc; + l->__wcrtomb = _BIG5_wcrtomb; + l->__mbsnrtowcs = _BIG5_mbsnrtowcs; + l->__wcsnrtombs = _BIG5_wcsnrtombs; + l->__mbsinit = _BIG5_mbsinit; + l->runes = rl; + l->__mb_cur_max = 2; + l->__mb_sb_limit = 128; + return (0); +} + +static int +_BIG5_mbsinit(const mbstate_t *ps) +{ + + return (ps == NULL || ((const _BIG5State *)ps)->ch == 0); +} + +static __inline int +_big5_check(u_int c) +{ + + c &= 0xff; + return ((c >= 0xa1 && c <= 0xfe) ? 2 : 1); +} + +static size_t +_BIG5_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, size_t n, + mbstate_t * __restrict ps) +{ + _BIG5State *bs; + wchar_t wc; + size_t len; + + bs = (_BIG5State *)ps; + + if ((bs->ch & ~0xFF) != 0) { + /* Bad conversion state. */ + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) { + s = ""; + n = 1; + pwc = NULL; + } + + if (n == 0) + /* Incomplete multibyte sequence */ + return ((size_t)-2); + + if (bs->ch != 0) { + if (*s == '\0') { + errno = EILSEQ; + return ((size_t)-1); + } + wc = (bs->ch << 8) | (*s & 0xFF); + if (pwc != NULL) + *pwc = wc; + bs->ch = 0; + return (1); + } + + len = (size_t)_big5_check(*s); + wc = *s++ & 0xff; + if (len == 2) { + if (n < 2) { + /* Incomplete multibyte sequence */ + bs->ch = wc; + return ((size_t)-2); + } + if (*s == '\0') { + errno = EILSEQ; + return ((size_t)-1); + } + wc = (wc << 8) | (*s++ & 0xff); + if (pwc != NULL) + *pwc = wc; + return (2); + } else { + if (pwc != NULL) + *pwc = wc; + return (wc == L'\0' ? 0 : 1); + } +} + +static size_t +_BIG5_wcrtomb(char * __restrict s, wchar_t wc, mbstate_t * __restrict ps) +{ + _BIG5State *bs; + + bs = (_BIG5State *)ps; + + if (bs->ch != 0) { + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) + /* Reset to initial shift state (no-op) */ + return (1); + if (wc & 0x8000) { + *s++ = (wc >> 8) & 0xff; + *s = wc & 0xff; + return (2); + } + *s = wc & 0xff; + return (1); +} + +static size_t +_BIG5_mbsnrtowcs(wchar_t * __restrict dst, const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps) +{ + return (__mbsnrtowcs_std(dst, src, nms, len, ps, _BIG5_mbrtowc)); +} + +static size_t +_BIG5_wcsnrtombs(char * __restrict dst, const wchar_t ** __restrict src, + size_t nwc, size_t len, mbstate_t * __restrict ps) +{ + return (__wcsnrtombs_std(dst, src, nwc, len, ps, _BIG5_wcrtomb)); +} diff --git a/lib/libc/locale/btowc.3 b/lib/libc/locale/btowc.3 new file mode 100644 index 0000000000000..2d8cb493351b0 --- /dev/null +++ b/lib/libc/locale/btowc.3 @@ -0,0 +1,88 @@ +.\" Copyright (c) 2002 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd February 13, 2012 +.Dt BTOWC 3 +.Os +.Sh NAME +.Nm btowc , +.Nm wctob +.Nd "convert between wide and single-byte characters" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wchar.h +.Ft wint_t +.Fn btowc "int c" +.Ft int +.Fn wctob "wint_t c" +.In wchar.h +.In xlocale.h +.Ft wint_t +.Fn btowc_l "int c" "locale_t loc" +.Ft int +.Fn wctob_l "wint_t c" "locale_t loc" +.Sh DESCRIPTION +The +.Fn btowc +function converts a single-byte character into a corresponding wide character. +If the character is +.Dv EOF +or not valid in the initial shift state, +.Fn btowc +returns +.Dv WEOF . +.Pp +The +.Fn wctob +function converts a wide character into a corresponding single-byte character. +If the wide character is +.Dv WEOF +or not able to be represented as a single byte in the initial shift state, +.Fn wctob +returns +.Dv EOF . +.Pp +The _l-suffixed versions take an explicit locale argument, while the +non-suffixed versions use the current global or per-thread locale. +.Sh SEE ALSO +.Xr mbrtowc 3 , +.Xr multibyte 3 , +.Xr wcrtomb 3 +.Sh STANDARDS +The +.Fn btowc +and +.Fn wctob +functions conform to +.St -p1003.1-2001 . +.Sh HISTORY +The +.Fn btowc +and +.Fn wctob +functions first appeared in +.Fx 5.0 . diff --git a/lib/libc/locale/btowc.c b/lib/libc/locale/btowc.c new file mode 100644 index 0000000000000..a695da477d581 --- /dev/null +++ b/lib/libc/locale/btowc.c @@ -0,0 +1,66 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002, 2003 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <stdio.h> +#include <wchar.h> +#include "mblocal.h" + +wint_t +btowc_l(int c, locale_t l) +{ + static const mbstate_t initial; + mbstate_t mbs = initial; + char cc; + wchar_t wc; + FIX_LOCALE(l); + + if (c == EOF) + return (WEOF); + /* + * We expect mbrtowc() to return 0 or 1, hence the check for n > 1 + * which detects error return values as well as "impossible" byte + * counts. + */ + cc = (char)c; + if (XLOCALE_CTYPE(l)->__mbrtowc(&wc, &cc, 1, &mbs) > 1) + return (WEOF); + return (wc); +} +wint_t +btowc(int c) +{ + return btowc_l(c, __get_locale()); +} diff --git a/lib/libc/locale/c16rtomb.c b/lib/libc/locale/c16rtomb.c new file mode 100644 index 0000000000000..6f86b00746678 --- /dev/null +++ b/lib/libc/locale/c16rtomb.c @@ -0,0 +1,83 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <uchar.h> +#include "xlocale_private.h" + +typedef struct { + char16_t lead_surrogate; + mbstate_t c32_mbstate; +} _Char16State; + +size_t +c16rtomb_l(char * __restrict s, char16_t c16, mbstate_t * __restrict ps, + locale_t locale) +{ + _Char16State *cs; + char32_t c32; + + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->c16rtomb; + cs = (_Char16State *)ps; + + /* If s is a null pointer, the value of parameter c16 is ignored. */ + if (s == NULL) { + c32 = 0; + } else if (cs->lead_surrogate >= 0xd800 && + cs->lead_surrogate <= 0xdbff) { + /* We should see a trail surrogate now. */ + if (c16 < 0xdc00 || c16 > 0xdfff) { + errno = EILSEQ; + return ((size_t)-1); + } + c32 = 0x10000 + ((cs->lead_surrogate & 0x3ff) << 10 | + (c16 & 0x3ff)); + } else if (c16 >= 0xd800 && c16 <= 0xdbff) { + /* Store lead surrogate for next invocation. */ + cs->lead_surrogate = c16; + return (0); + } else { + /* Regular character. */ + c32 = c16; + } + cs->lead_surrogate = 0; + + return (c32rtomb_l(s, c32, &cs->c32_mbstate, locale)); +} + +size_t +c16rtomb(char * __restrict s, char16_t c16, mbstate_t * __restrict ps) +{ + + return (c16rtomb_l(s, c16, ps, __get_locale())); +} diff --git a/lib/libc/locale/c16rtomb_iconv.c b/lib/libc/locale/c16rtomb_iconv.c new file mode 100644 index 0000000000000..86bd9dab2a52d --- /dev/null +++ b/lib/libc/locale/c16rtomb_iconv.c @@ -0,0 +1,8 @@ +/* $FreeBSD$ */ +#define charXX_t char16_t +#define cXXrtomb c16rtomb +#define cXXrtomb_l c16rtomb_l +#define SRCBUF_LEN 2 +#define UTF_XX_INTERNAL "UTF-16-INTERNAL" + +#include "cXXrtomb_iconv.h" diff --git a/lib/libc/locale/c32rtomb.c b/lib/libc/locale/c32rtomb.c new file mode 100644 index 0000000000000..31d7a280deda1 --- /dev/null +++ b/lib/libc/locale/c32rtomb.c @@ -0,0 +1,61 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <uchar.h> +#include <wchar.h> +#include "xlocale_private.h" + +size_t +c32rtomb_l(char * __restrict s, char32_t c32, mbstate_t * __restrict ps, + locale_t locale) +{ + + /* Unicode Standard 5.0, D90: ill-formed characters. */ + if ((c32 >= 0xd800 && c32 <= 0xdfff) || c32 > 0x10ffff) { + errno = EILSEQ; + return ((size_t)-1); + } + + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->c32rtomb; + + /* Assume wchar_t uses UTF-32. */ + return (wcrtomb_l(s, c32, ps, locale)); +} + +size_t +c32rtomb(char * __restrict s, char32_t c32, mbstate_t * __restrict ps) +{ + + return (c32rtomb_l(s, c32, ps, __get_locale())); +} diff --git a/lib/libc/locale/c32rtomb_iconv.c b/lib/libc/locale/c32rtomb_iconv.c new file mode 100644 index 0000000000000..dabbfd7f7ab42 --- /dev/null +++ b/lib/libc/locale/c32rtomb_iconv.c @@ -0,0 +1,8 @@ +/* $FreeBSD$ */ +#define charXX_t char32_t +#define cXXrtomb c32rtomb +#define cXXrtomb_l c32rtomb_l +#define SRCBUF_LEN 1 +#define UTF_XX_INTERNAL "UTF-32-INTERNAL" + +#include "cXXrtomb_iconv.h" diff --git a/lib/libc/locale/cXXrtomb_iconv.h b/lib/libc/locale/cXXrtomb_iconv.h new file mode 100644 index 0000000000000..d0dadac6c3121 --- /dev/null +++ b/lib/libc/locale/cXXrtomb_iconv.h @@ -0,0 +1,117 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/queue.h> + +#include <assert.h> +#include <errno.h> +#include <langinfo.h> +#include <uchar.h> + +#include "../iconv/citrus_hash.h" +#include "../iconv/citrus_module.h" +#include "../iconv/citrus_iconv.h" +#include "xlocale_private.h" + +typedef struct { + bool initialized; + struct _citrus_iconv iconv; + union { + charXX_t widechar[SRCBUF_LEN]; + char bytes[sizeof(charXX_t) * SRCBUF_LEN]; + } srcbuf; + size_t srcbuf_len; +} _ConversionState; +_Static_assert(sizeof(_ConversionState) <= sizeof(mbstate_t), + "Size of _ConversionState must not exceed mbstate_t's size."); + +size_t +cXXrtomb_l(char * __restrict s, charXX_t c, mbstate_t * __restrict ps, + locale_t locale) +{ + _ConversionState *cs; + struct _citrus_iconv *handle; + char *src, *dst; + size_t srcleft, dstleft, invlen; + int err; + + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->cXXrtomb; + cs = (_ConversionState *)ps; + handle = &cs->iconv; + + /* Reinitialize mbstate_t. */ + if (s == NULL || !cs->initialized) { + if (_citrus_iconv_open(&handle, UTF_XX_INTERNAL, + nl_langinfo_l(CODESET, locale)) != 0) { + cs->initialized = false; + errno = EINVAL; + return (-1); + } + handle->cv_shared->ci_discard_ilseq = true; + handle->cv_shared->ci_hooks = NULL; + cs->srcbuf_len = 0; + cs->initialized = true; + if (s == NULL) + return (1); + } + + assert(cs->srcbuf_len < sizeof(cs->srcbuf.widechar) / sizeof(charXX_t)); + cs->srcbuf.widechar[cs->srcbuf_len++] = c; + + /* Perform conversion. */ + src = cs->srcbuf.bytes; + srcleft = cs->srcbuf_len * sizeof(charXX_t); + dst = s; + dstleft = MB_CUR_MAX_L(locale); + err = _citrus_iconv_convert(handle, &src, &srcleft, &dst, &dstleft, + 0, &invlen); + + /* Character is part of a surrogate pair. We need more input. */ + if (err == EINVAL) + return (0); + cs->srcbuf_len = 0; + + /* Illegal sequence. */ + if (dst == s) { + errno = EILSEQ; + return ((size_t)-1); + } + return (dst - s); +} + +size_t +cXXrtomb(char * __restrict s, charXX_t c, mbstate_t * __restrict ps) +{ + + return (cXXrtomb_l(s, c, ps, __get_locale())); +} diff --git a/lib/libc/locale/collate.c b/lib/libc/locale/collate.c new file mode 100644 index 0000000000000..8d040c194860b --- /dev/null +++ b/lib/libc/locale/collate.c @@ -0,0 +1,710 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright 2014 Garrett D'Amore <garrett@damore.org> + * Copyright 2010 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 1995 Alex Tatmanjants <alex@elvisti.kiev.ua> + * at Electronni Visti IA, Kiev, Ukraine. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * Adapted to xlocale by John Marino <draco@marino.st> + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include "namespace.h" + +#include <sys/types.h> +#include <sys/stat.h> +#include <sys/mman.h> + +#include <assert.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include <errno.h> +#include <unistd.h> +#include <fcntl.h> +#include "un-namespace.h" + +#include "endian.h" +#include "collate.h" +#include "setlocale.h" +#include "ldpart.h" +#include "libc_private.h" + +struct xlocale_collate __xlocale_global_collate = { + {{0}, "C"}, 1, 0, 0, 0 +}; + +struct xlocale_collate __xlocale_C_collate = { + {{0}, "C"}, 1, 0, 0, 0 +}; + +static int +__collate_load_tables_l(const char *encoding, struct xlocale_collate *table); + +static void +destruct_collate(void *t) +{ + struct xlocale_collate *table = t; + if (table->map && (table->maplen > 0)) { + (void) munmap(table->map, table->maplen); + } + free(t); +} + +void * +__collate_load(const char *encoding, __unused locale_t unused) +{ + if (strcmp(encoding, "C") == 0 || strcmp(encoding, "POSIX") == 0) { + return &__xlocale_C_collate; + } + struct xlocale_collate *table = calloc(sizeof(struct xlocale_collate), 1); + table->header.header.destructor = destruct_collate; + // FIXME: Make sure that _LDP_CACHE is never returned. We should be doing + // the caching outside of this section + if (__collate_load_tables_l(encoding, table) != _LDP_LOADED) { + xlocale_release(table); + return NULL; + } + return table; +} + +/** + * Load the collation tables for the specified encoding into the global table. + */ +int +__collate_load_tables(const char *encoding) +{ + + return (__collate_load_tables_l(encoding, &__xlocale_global_collate)); +} + +int +__collate_load_tables_l(const char *encoding, struct xlocale_collate *table) +{ + int i, chains, z; + char *buf; + char *TMP; + char *map; + collate_info_t *info; + struct stat sbuf; + int fd; + + table->__collate_load_error = 1; + + /* 'encoding' must be already checked. */ + if (strcmp(encoding, "C") == 0 || strcmp(encoding, "POSIX") == 0) { + return (_LDP_CACHE); + } + + if (asprintf(&buf, "%s/%s/LC_COLLATE", _PathLocale, encoding) == -1) + return (_LDP_ERROR); + + if ((fd = _open(buf, O_RDONLY)) < 0) { + free(buf); + return (_LDP_ERROR); + } + free(buf); + if (_fstat(fd, &sbuf) < 0) { + (void) _close(fd); + return (_LDP_ERROR); + } + if (sbuf.st_size < (COLLATE_STR_LEN + sizeof (info))) { + (void) _close(fd); + errno = EINVAL; + return (_LDP_ERROR); + } + map = mmap(NULL, sbuf.st_size, PROT_READ, MAP_PRIVATE, fd, 0); + (void) _close(fd); + if ((TMP = map) == NULL) { + return (_LDP_ERROR); + } + + if (strncmp(TMP, COLLATE_VERSION, COLLATE_STR_LEN) != 0) { + (void) munmap(map, sbuf.st_size); + errno = EINVAL; + return (_LDP_ERROR); + } + TMP += COLLATE_STR_LEN; + + info = (void *)TMP; + TMP += sizeof (*info); + + if ((info->directive_count < 1) || + (info->directive_count >= COLL_WEIGHTS_MAX) || + ((chains = BSWAP(info->chain_count)) < 0)) { + (void) munmap(map, sbuf.st_size); + errno = EINVAL; + return (_LDP_ERROR); + } + + i = (sizeof (collate_char_t) * (UCHAR_MAX + 1)) + + (sizeof (collate_chain_t) * chains) + + (sizeof (collate_large_t) * BSWAP(info->large_count)); + for (z = 0; z < info->directive_count; z++) { + i += sizeof (collate_subst_t) * BSWAP(info->subst_count[z]); + } + if (i != (sbuf.st_size - (TMP - map))) { + (void) munmap(map, sbuf.st_size); + errno = EINVAL; + return (_LDP_ERROR); + } + + table->info = info; + table->char_pri_table = (void *)TMP; + TMP += sizeof (collate_char_t) * (UCHAR_MAX + 1); + + for (z = 0; z < info->directive_count; z++) { + if (BSWAP(info->subst_count[z]) > 0) { + table->subst_table[z] = (void *)TMP; + TMP += BSWAP(info->subst_count[z]) * sizeof (collate_subst_t); + } else { + table->subst_table[z] = NULL; + } + } + + if (chains > 0) { + table->chain_pri_table = (void *)TMP; + TMP += chains * sizeof (collate_chain_t); + } else + table->chain_pri_table = NULL; + if (BSWAP(info->large_count) > 0) + table->large_pri_table = (void *)TMP; + else + table->large_pri_table = NULL; + + table->__collate_load_error = 0; + return (_LDP_LOADED); +} + +static const int32_t * +substsearch(struct xlocale_collate *table, const wchar_t key, int pass) +{ + const collate_subst_t *p; + int n = BSWAP(table->info->subst_count[pass]); + + if (n == 0) + return (NULL); + + if (pass >= table->info->directive_count) + return (NULL); + + if (!(key & COLLATE_SUBST_PRIORITY)) + return (NULL); + + p = table->subst_table[pass] + (key & ~COLLATE_SUBST_PRIORITY); + assert(BSWAP(p->key) == key); + + return (p->pri); +} + +static collate_chain_t * +chainsearch(struct xlocale_collate *table, const wchar_t *key, int *len) +{ + int low = 0; + int high = BSWAP(table->info->chain_count) - 1; + int next, compar, l; + collate_chain_t *p; + collate_chain_t *tab = table->chain_pri_table; + + if (high < 0) + return (NULL); + + while (low <= high) { + next = (low + high) / 2; + p = tab + next; + compar = *key - le16toh(*p->str); + if (compar == 0) { + l = wcsnlen(p->str, COLLATE_STR_LEN); + compar = wcsncmp(key, p->str, l); + if (compar == 0) { + *len = l; + return (p); + } + } + if (compar > 0) + low = next + 1; + else + high = next - 1; + } + return (NULL); +} + +static collate_large_t * +largesearch(struct xlocale_collate *table, const wchar_t key) +{ + int low = 0; + int high = BSWAP(table->info->large_count) - 1; + int next, compar; + collate_large_t *p; + collate_large_t *tab = table->large_pri_table; + + if (high < 0) + return (NULL); + + while (low <= high) { + next = (low + high) / 2; + p = tab + next; + compar = key - BSWAP(p->val); + if (compar == 0) + return (p); + if (compar > 0) + low = next + 1; + else + high = next - 1; + } + return (NULL); +} + +void +_collate_lookup(struct xlocale_collate *table, const wchar_t *t, int *len, + int *pri, int which, const int **state) +{ + collate_chain_t *p2; + collate_large_t *match; + int p, l; + const int *sptr; + + /* + * If this is the "last" pass for the UNDEFINED, then + * we just return the priority itself. + */ + if (which >= table->info->directive_count) { + *pri = *t; + *len = 1; + *state = NULL; + return; + } + + /* + * If we have remaining substitution data from a previous + * call, consume it first. + */ + if ((sptr = *state) != NULL) { + *pri = *sptr; + sptr++; + if ((sptr == *state) || (sptr == NULL)) + *state = NULL; + else + *state = sptr; + *len = 0; + return; + } + + /* No active substitutions */ + *len = 1; + + /* + * Check for composites such as diphthongs that collate as a + * single element (aka chains or collating-elements). + */ + if (((p2 = chainsearch(table, t, &l)) != NULL) && + ((p = p2->pri[which]) >= 0)) { + + *len = l; + *pri = p; + + } else if (*t <= UCHAR_MAX) { + + /* + * Character is a small (8-bit) character. + * We just look these up directly for speed. + */ + *pri = BSWAP(table->char_pri_table[*t].pri[which]); + + } else if ((BSWAP(table->info->large_count) > 0) && + ((match = largesearch(table, *t)) != NULL)) { + + /* + * Character was found in the extended table. + */ + *pri = BSWAP(match->pri.pri[which]); + + } else { + /* + * Character lacks a specific definition. + */ + if (table->info->directive[which] & DIRECTIVE_UNDEFINED) { + /* Mask off sign bit to prevent ordering confusion. */ + *pri = (*t & COLLATE_MAX_PRIORITY); + } else { + *pri = BSWAP(table->info->undef_pri[which]); + } + /* No substitutions for undefined characters! */ + return; + } + + /* + * Try substituting (expanding) the character. We are + * currently doing this *after* the chain compression. I + * think it should not matter, but this way might be slightly + * faster. + * + * We do this after the priority search, as this will help us + * to identify a single key value. In order for this to work, + * its important that the priority assigned to a given element + * to be substituted be unique for that level. The localedef + * code ensures this for us. + */ + if ((sptr = substsearch(table, *pri, which)) != NULL) { + if ((*pri = BSWAP(*sptr)) > 0) { + sptr++; + *state = BSWAP(*sptr) ? sptr : NULL; + } + } + +} + +/* + * This is the meaty part of wcsxfrm & strxfrm. Note that it does + * NOT NULL terminate. That is left to the caller. + */ +size_t +_collate_wxfrm(struct xlocale_collate *table, const wchar_t *src, wchar_t *xf, + size_t room) +{ + int pri; + int len; + const wchar_t *t; + wchar_t *tr = NULL; + int direc; + int pass; + const int32_t *state; + size_t want = 0; + size_t need = 0; + int ndir = table->info->directive_count; + + assert(src); + + for (pass = 0; pass <= ndir; pass++) { + + state = NULL; + + if (pass != 0) { + /* insert level separator from the previous pass */ + if (room) { + *xf++ = 1; + room--; + } + want++; + } + + /* special pass for undefined */ + if (pass == ndir) { + direc = DIRECTIVE_FORWARD | DIRECTIVE_UNDEFINED; + } else { + direc = table->info->directive[pass]; + } + + t = src; + + if (direc & DIRECTIVE_BACKWARD) { + wchar_t *bp, *fp, c; + free(tr); + if ((tr = wcsdup(t)) == NULL) { + errno = ENOMEM; + goto fail; + } + bp = tr; + fp = tr + wcslen(tr) - 1; + while (bp < fp) { + c = *bp; + *bp++ = *fp; + *fp-- = c; + } + t = (const wchar_t *)tr; + } + + if (direc & DIRECTIVE_POSITION) { + while (*t || state) { + _collate_lookup(table, t, &len, &pri, pass, &state); + t += len; + if (pri <= 0) { + if (pri < 0) { + errno = EINVAL; + goto fail; + } + state = NULL; + pri = COLLATE_MAX_PRIORITY; + } + if (room) { + *xf++ = pri; + room--; + } + want++; + need = want; + } + } else { + while (*t || state) { + _collate_lookup(table, t, &len, &pri, pass, &state); + t += len; + if (pri <= 0) { + if (pri < 0) { + errno = EINVAL; + goto fail; + } + state = NULL; + continue; + } + if (room) { + *xf++ = pri; + room--; + } + want++; + need = want; + } + } + } + free(tr); + return (need); + +fail: + free(tr); + return ((size_t)(-1)); +} + +/* + * In the non-POSIX case, we transform each character into a string of + * characters representing the character's priority. Since char is usually + * signed, we are limited by 7 bits per byte. To avoid zero, we need to add + * XFRM_OFFSET, so we can't use a full 7 bits. For simplicity, we choose 6 + * bits per byte. + * + * It turns out that we sometimes have real priorities that are + * 31-bits wide. (But: be careful using priorities where the high + * order bit is set -- i.e. the priority is negative. The sort order + * may be surprising!) + * + * TODO: This would be a good area to optimize somewhat. It turns out + * that real prioririties *except for the last UNDEFINED pass* are generally + * very small. We need the localedef code to precalculate the max + * priority for us, and ideally also give us a mask, and then we could + * severely limit what we expand to. + */ +#define XFRM_BYTES 6 +#define XFRM_OFFSET ('0') /* make all printable characters */ +#define XFRM_SHIFT 6 +#define XFRM_MASK ((1 << XFRM_SHIFT) - 1) +#define XFRM_SEP ('.') /* chosen to be less than XFRM_OFFSET */ + +static int +xfrm(struct xlocale_collate *table, unsigned char *p, int pri, int pass) +{ + /* we use unsigned to ensure zero fill on right shift */ + uint32_t val = BSWAP((uint32_t)table->info->pri_count[pass]); + int nc = 0; + + while (val) { + *p = (pri & XFRM_MASK) + XFRM_OFFSET; + pri >>= XFRM_SHIFT; + val >>= XFRM_SHIFT; + p++; + nc++; + } + return (nc); +} + +size_t +_collate_sxfrm(struct xlocale_collate *table, const wchar_t *src, char *xf, + size_t room) +{ + int pri; + int len; + const wchar_t *t; + wchar_t *tr = NULL; + int direc; + int pass; + const int32_t *state; + size_t want = 0; + size_t need = 0; + int b; + uint8_t buf[XFRM_BYTES]; + int ndir = table->info->directive_count; + + assert(src); + + for (pass = 0; pass <= ndir; pass++) { + + state = NULL; + + if (pass != 0) { + /* insert level separator from the previous pass */ + if (room) { + *xf++ = XFRM_SEP; + room--; + } + want++; + } + + /* special pass for undefined */ + if (pass == ndir) { + direc = DIRECTIVE_FORWARD | DIRECTIVE_UNDEFINED; + } else { + direc = table->info->directive[pass]; + } + + t = src; + + if (direc & DIRECTIVE_BACKWARD) { + wchar_t *bp, *fp, c; + free(tr); + if ((tr = wcsdup(t)) == NULL) { + errno = ENOMEM; + goto fail; + } + bp = tr; + fp = tr + wcslen(tr) - 1; + while (bp < fp) { + c = *bp; + *bp++ = *fp; + *fp-- = c; + } + t = (const wchar_t *)tr; + } + + if (direc & DIRECTIVE_POSITION) { + while (*t || state) { + + _collate_lookup(table, t, &len, &pri, pass, &state); + t += len; + if (pri <= 0) { + if (pri < 0) { + errno = EINVAL; + goto fail; + } + state = NULL; + pri = COLLATE_MAX_PRIORITY; + } + + b = xfrm(table, buf, pri, pass); + want += b; + if (room) { + while (b) { + b--; + if (room) { + *xf++ = buf[b]; + room--; + } + } + } + need = want; + } + } else { + while (*t || state) { + _collate_lookup(table, t, &len, &pri, pass, &state); + t += len; + if (pri <= 0) { + if (pri < 0) { + errno = EINVAL; + goto fail; + } + state = NULL; + continue; + } + + b = xfrm(table, buf, pri, pass); + want += b; + if (room) { + + while (b) { + b--; + if (room) { + *xf++ = buf[b]; + room--; + } + } + } + need = want; + } + } + } + free(tr); + return (need); + +fail: + free(tr); + return ((size_t)(-1)); +} + +/* + * __collate_equiv_value returns the primary collation value for the given + * collating symbol specified by str and len. Zero or negative is returned + * if the collating symbol was not found. This function is used by bracket + * code in the TRE regex library. + */ +int +__collate_equiv_value(locale_t locale, const wchar_t *str, size_t len) +{ + int32_t e; + + if (len < 1 || len >= COLLATE_STR_LEN) + return (-1); + + FIX_LOCALE(locale); + struct xlocale_collate *table = + (struct xlocale_collate*)locale->components[XLC_COLLATE]; + + if (table->__collate_load_error) + return ((len == 1 && *str <= UCHAR_MAX) ? *str : -1); + + if (len == 1) { + e = -1; + if (*str <= UCHAR_MAX) + e = table->char_pri_table[*str].pri[0]; + else if (BSWAP(table->info->large_count) > 0) { + collate_large_t *match_large; + match_large = largesearch(table, *str); + if (match_large) + e = match_large->pri.pri[0]; + } + if (e == 0) + return (1); + return (e > 0 ? e : 0); + } + if (BSWAP(table->info->chain_count) > 0) { + wchar_t name[COLLATE_STR_LEN]; + collate_chain_t *match_chain; + int clen; + + wcsncpy (name, str, len); + name[len] = 0; + match_chain = chainsearch(table, name, &clen); + if (match_chain) { + e = match_chain->pri[0]; + if (e == 0) + return (1); + return (e < 0 ? -e : e); + } + } + return (0); +} diff --git a/lib/libc/locale/collate.h b/lib/libc/locale/collate.h new file mode 100644 index 0000000000000..4abb1f936ae25 --- /dev/null +++ b/lib/libc/locale/collate.h @@ -0,0 +1,141 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright 2010 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 1995 Alex Tatmanjants <alex@elvisti.kiev.ua> + * at Electronni Visti IA, Kiev, Ukraine. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _COLLATE_H_ +#define _COLLATE_H_ + +#include <sys/cdefs.h> +#include <sys/types.h> +#include <limits.h> +#include "xlocale_private.h" + +/* + * Work around buildworld bootstrapping from older systems whose limits.h + * sets COLL_WEIGHTS_MAX to 0. + */ +#if COLL_WEIGHTS_MAX == 0 +#undef COLL_WEIGHTS_MAX +#define COLL_WEIGHTS_MAX 10 +#endif + +#define COLLATE_STR_LEN 24 /* should be 64-bit multiple */ +#define COLLATE_VERSION "BSD 1.0\n" + +#define COLLATE_MAX_PRIORITY (0x7fffffff) /* max signed value */ +#define COLLATE_SUBST_PRIORITY (0x40000000) /* bit indicates subst table */ + +#define DIRECTIVE_UNDEF 0x00 +#define DIRECTIVE_FORWARD 0x01 +#define DIRECTIVE_BACKWARD 0x02 +#define DIRECTIVE_POSITION 0x04 +#define DIRECTIVE_UNDEFINED 0x08 /* special last weight for UNDEFINED */ + +#define DIRECTIVE_DIRECTION_MASK (DIRECTIVE_FORWARD | DIRECTIVE_BACKWARD) + +/* + * The collate file format is as follows: + * + * char version[COLLATE_STR_LEN]; // must be COLLATE_VERSION + * collate_info_t info; // see below, includes padding + * collate_char_pri_t char_data[256]; // 8 bit char values + * collate_subst_t subst[*]; // 0 or more substitutions + * collate_chain_pri_t chains[*]; // 0 or more chains + * collate_large_pri_t large[*]; // extended char priorities + * + * Note that all structures must be 32-bit aligned, as each structure + * contains 32-bit member fields. The entire file is mmap'd, so its + * critical that alignment be observed. It is not generally safe to + * use any 64-bit values in the structures. + */ + +typedef struct collate_info { + uint8_t directive_count; + uint8_t directive[COLL_WEIGHTS_MAX]; + int32_t pri_count[COLL_WEIGHTS_MAX]; + int32_t flags; + int32_t chain_count; + int32_t large_count; + int32_t subst_count[COLL_WEIGHTS_MAX]; + int32_t undef_pri[COLL_WEIGHTS_MAX]; +} collate_info_t; + +typedef struct collate_char { + int32_t pri[COLL_WEIGHTS_MAX]; +} collate_char_t; + +typedef struct collate_chain { + wchar_t str[COLLATE_STR_LEN]; + int32_t pri[COLL_WEIGHTS_MAX]; +} collate_chain_t; + +typedef struct collate_large { + int32_t val; + collate_char_t pri; +} collate_large_t; + +typedef struct collate_subst { + int32_t key; + int32_t pri[COLLATE_STR_LEN]; +} collate_subst_t; + +struct xlocale_collate { + struct xlocale_component header; + int __collate_load_error; + char * map; + size_t maplen; + + collate_info_t *info; + collate_char_t *char_pri_table; + collate_large_t *large_pri_table; + collate_chain_t *chain_pri_table; + collate_subst_t *subst_table[COLL_WEIGHTS_MAX]; +}; + +__BEGIN_DECLS +int __collate_load_tables(const char *); +int __collate_equiv_value(locale_t, const wchar_t *, size_t); +void _collate_lookup(struct xlocale_collate *,const wchar_t *, int *, int *, + int, const int **); +int __collate_range_cmp(char, char); +int __wcollate_range_cmp(wchar_t, wchar_t); +size_t _collate_wxfrm(struct xlocale_collate *, const wchar_t *, wchar_t *, + size_t); +size_t _collate_sxfrm(struct xlocale_collate *, const wchar_t *, char *, + size_t); +__END_DECLS + +#endif /* !_COLLATE_H_ */ diff --git a/lib/libc/locale/collcmp.c b/lib/libc/locale/collcmp.c new file mode 100644 index 0000000000000..b444ea0652ac0 --- /dev/null +++ b/lib/libc/locale/collcmp.c @@ -0,0 +1,65 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (C) 1996 by Andrey A. Chernov, Moscow, Russia. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <string.h> +#include <wchar.h> +#include "collate.h" + +/* + * Compare two characters using collate + */ + +int __collate_range_cmp(char c1, char c2) +{ + char s1[2], s2[2]; + + s1[0] = c1; + s1[1] = '\0'; + s2[0] = c2; + s2[1] = '\0'; + return (strcoll(s1, s2)); +} + +int __wcollate_range_cmp(wchar_t c1, wchar_t c2) +{ + wchar_t s1[2], s2[2]; + + s1[0] = c1; + s1[1] = L'\0'; + s2[0] = c2; + s2[1] = L'\0'; + return (wcscoll(s1, s2)); +} diff --git a/lib/libc/locale/ctype.3 b/lib/libc/locale/ctype.3 new file mode 100644 index 0000000000000..5401fdb92b166 --- /dev/null +++ b/lib/libc/locale/ctype.3 @@ -0,0 +1,151 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)ctype.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd March 30, 2004 +.Dt CTYPE 3 +.Os +.Sh NAME +.Nm digittoint , +.Nm isalnum , +.Nm isalpha , +.Nm isascii , +.Nm isblank , +.Nm iscntrl , +.Nm isdigit , +.Nm isgraph , +.Nm ishexnumber , +.Nm isideogram , +.Nm islower , +.Nm isnumber , +.Nm isphonogram , +.Nm isprint , +.Nm ispunct , +.Nm isrune , +.Nm isspace , +.Nm isspecial , +.Nm isupper , +.Nm isxdigit , +.Nm toascii , +.Nm tolower , +.Nm toupper +.Nd character classification functions +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn digittoint "int c" +.Ft int +.Fn isalnum "int c" +.Ft int +.Fn isalpha "int c" +.Ft int +.Fn isascii "int c" +.Ft int +.Fn iscntrl "int c" +.Ft int +.Fn isdigit "int c" +.Ft int +.Fn isgraph "int c" +.Ft int +.Fn ishexnumber "int c" +.Ft int +.Fn isideogram "int c" +.Ft int +.Fn islower "int c" +.Ft int +.Fn isnumber "int c" +.Ft int +.Fn isphonogram "int c" +.Ft int +.Fn isspecial "int c" +.Ft int +.Fn isprint "int c" +.Ft int +.Fn ispunct "int c" +.Ft int +.Fn isrune "int c" +.Ft int +.Fn isspace "int c" +.Ft int +.Fn isupper "int c" +.Ft int +.Fn isxdigit "int c" +.Ft int +.Fn toascii "int c" +.Ft int +.Fn tolower "int c" +.Ft int +.Fn toupper "int c" +.Sh DESCRIPTION +The above functions perform character tests and conversions on the integer +.Fa c . +They are available as macros, defined in the include file +.In ctype.h , +or as true functions in the C library. +See the specific manual pages for more information. +.Sh SEE ALSO +.Xr digittoint 3 , +.Xr isalnum 3 , +.Xr isalpha 3 , +.Xr isascii 3 , +.Xr isblank 3 , +.Xr iscntrl 3 , +.Xr isdigit 3 , +.Xr isgraph 3 , +.Xr isideogram 3 , +.Xr islower 3 , +.Xr isphonogram 3 , +.Xr isprint 3 , +.Xr ispunct 3 , +.Xr isrune 3 , +.Xr isspace 3 , +.Xr isspecial 3 , +.Xr isupper 3 , +.Xr isxdigit 3 , +.Xr toascii 3 , +.Xr tolower 3 , +.Xr toupper 3 , +.Xr wctype 3 , +.Xr ascii 7 +.Sh STANDARDS +These functions, except for +.Fn digittoint , +.Fn isascii , +.Fn ishexnumber , +.Fn isideogram , +.Fn isnumber , +.Fn isphonogram , +.Fn isrune , +.Fn isspecial +and +.Fn toascii , +conform to +.St -isoC . diff --git a/lib/libc/locale/ctype.c b/lib/libc/locale/ctype.c new file mode 100644 index 0000000000000..8a88b8d6512e3 --- /dev/null +++ b/lib/libc/locale/ctype.c @@ -0,0 +1,35 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * + * This software was developed by David Chisnall under sponsorship from + * the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions * are met: + * 1. Redistributions of source code must retain the above copyright notice, + * this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ +#define _XLOCALE_INLINE +#include <ctype.h> +#include <wctype.h> +#include <xlocale.h> diff --git a/lib/libc/locale/ctype_l.3 b/lib/libc/locale/ctype_l.3 new file mode 100644 index 0000000000000..5f427de06394e --- /dev/null +++ b/lib/libc/locale/ctype_l.3 @@ -0,0 +1,151 @@ +.\" Copyright (c) 2011 The FreeBSD Foundation +.\" All rights reserved. +.\" +.\" This documentation was written by David Chisnall under sponsorship from +.\" the FreeBSD Foundation. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd March 6, 2012 +.Dt CTYPE_L 3 +.Os +.Sh NAME +.Nm digittoint_l , +.Nm isalnum_l , +.Nm isalpha_l , +.Nm isascii_l , +.Nm isblank_l , +.Nm iscntrl_l , +.Nm isdigit_l , +.Nm isgraph_l , +.Nm ishexnumber_l , +.Nm isideogram_l , +.Nm islower_l , +.Nm isnumber_l , +.Nm isphonogram_l , +.Nm isprint_l , +.Nm ispunct_l , +.Nm isrune_l , +.Nm isspace_l , +.Nm isspecial_l , +.Nm isupper_l , +.Nm isxdigit_l , +.Nm tolower_l , +.Nm toupper_l +.Nd character classification functions +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn digittoint_l "int c" "locale_t loc" +.Ft int +.Fn isalnum_l "int c" "locale_t loc" +.Ft int +.Fn isalpha_l "int c" "locale_t loc" +.Ft int +.Fn isascii_l "int c" "locale_t loc" +.Ft int +.Fn iscntrl_l "int c" "locale_t loc" +.Ft int +.Fn isdigit_l "int c" "locale_t loc" +.Ft int +.Fn isgraph_l "int c" "locale_t loc" +.Ft int +.Fn ishexnumber_l "int c" "locale_t loc" +.Ft int +.Fn isideogram_l "int c" "locale_t loc" +.Ft int +.Fn islower_l "int c" "locale_t loc" +.Ft int +.Fn isnumber_l "int c" "locale_t loc" +.Ft int +.Fn isphonogram_l "int c" "locale_t loc" +.Ft int +.Fn isspecial_l "int c" "locale_t loc" +.Ft int +.Fn isprint_l "int c" "locale_t loc" +.Ft int +.Fn ispunct_l "int c" "locale_t loc" +.Ft int +.Fn isrune_l "int c" "locale_t loc" +.Ft int +.Fn isspace_l "int c" "locale_t loc" +.Ft int +.Fn isupper_l "int c" "locale_t loc" +.Ft int +.Fn isxdigit_l "int c" "locale_t loc" +.Ft int +.Fn tolower_l "int c" "locale_t loc" +.Ft int +.Fn toupper_l "int c" "locale_t loc" +.Sh DESCRIPTION +The above functions perform character tests and conversions on the integer +.Fa c +in the locale +.Fa loc . +They behave in the same way as the versions without the _l suffix, but use the +specified locale rather than the global or per-thread locale. +.In ctype.h , +or as true functions in the C library. +See the specific manual pages for more information. +.Sh SEE ALSO +.Xr digittoint 3 , +.Xr isalnum 3 , +.Xr isalpha 3 , +.Xr isascii 3 , +.Xr isblank 3 , +.Xr iscntrl 3 , +.Xr isdigit 3 , +.Xr isgraph 3 , +.Xr isideogram 3 , +.Xr islower 3 , +.Xr isphonogram 3 , +.Xr isprint 3 , +.Xr ispunct 3 , +.Xr isrune 3 , +.Xr isspace 3 , +.Xr isspecial 3 , +.Xr isupper 3 , +.Xr isxdigit 3 , +.Xr tolower 3 , +.Xr toupper 3 , +.Xr wctype 3 , +.Xr xlocale 3 +.Sh STANDARDS +These functions conform to +.St -p1003.1-2008 , +except for +.Fn digittoint_l , +.Fn isascii_l , +.Fn ishexnumber_l , +.Fn isideogram_l , +.Fn isnumber_l , +.Fn isphonogram_l , +.Fn isrune_l +and +.Fn isspecial_l +which are +.Fx +extensions. diff --git a/lib/libc/locale/digittoint.3 b/lib/libc/locale/digittoint.3 new file mode 100644 index 0000000000000..5caef66461e49 --- /dev/null +++ b/lib/libc/locale/digittoint.3 @@ -0,0 +1,68 @@ +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)digittoint.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd April 6, 2001 +.Dt DIGITTOINT 3 +.Os +.Sh NAME +.Nm digittoint +.Nd convert a numeric character to its integer value +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn digittoint "int c" +.Ft int +.Fn digittoint_l "int c" "locale_t loc" +.Sh DESCRIPTION +The +.Fn digittoint +function converts a numeric character to its corresponding integer value. +The character can be any decimal digit or hexadecimal digit. +With hexadecimal characters, the case of the values does not matter. +.Pp +The +.Fn digittoint_l +function takes an explicit locale argument, whereas the +.Fn digittoint +function use the current global or per-thread locale. +.Sh RETURN VALUES +The +.Fn digittoint +function always returns an integer from the range of 0 to 15. +If the given character was not a digit as defined by +.Xr isxdigit 3 , +the function will return 0. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr isdigit 3 , +.Xr isxdigit 3 , +.Xr xlocale 3 diff --git a/lib/libc/locale/duplocale.3 b/lib/libc/locale/duplocale.3 new file mode 100644 index 0000000000000..bc0c4bced812e --- /dev/null +++ b/lib/libc/locale/duplocale.3 @@ -0,0 +1,79 @@ +.\" Copyright (c) 2011 The FreeBSD Foundation +.\" All rights reserved. +.\" +.\" This documentation was written by David Chisnall under sponsorship from +.\" the FreeBSD Foundation. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd September 17, 2011 +.Dt DUPLOCALE 3 +.Os +.Sh NAME +.Nm duplocale +.Nd duplicate an locale +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In locale.h +.Ft locale_t +.Fn duplocale "locale_t locale" +.Sh DESCRIPTION +Duplicates an existing +.Fa locale_t +returning a new +.Fa locale_t +that refers to the same locale values but has an independent internal state. +Various functions, such as +.Xr mblen 3 +require a persistent state. +These functions formerly used static variables and calls to them from multiple +threads had undefined behavior. +They now use fields in the +.Fa locale_t +associated with the current thread by +.Xr uselocale 3 . +These calls are therefore only thread safe on threads with a unique per-thread +locale. +The locale returned by this call must be freed with +.Xr freelocale 3 . +.Sh SEE ALSO +.Xr freelocale 3 , +.Xr localeconv 3 , +.Xr newlocale 3 , +.Xr querylocale 3 , +.Xr uselocale 3 , +.Xr xlocale 3 +.Sh STANDARDS +This function conforms to +.St -p1003.1-2008 . +.Sh BUGS +Ideally, +.Xr uselocale 3 +should make a copy of the +.Fa locale_t +implicitly to ensure thread safety, +and a copy of the global locale should be installed lazily on each thread. +The FreeBSD implementation does not do this, +for compatibility with Darwin. diff --git a/lib/libc/locale/endian.h b/lib/libc/locale/endian.h new file mode 100644 index 0000000000000..d3b822788688b --- /dev/null +++ b/lib/libc/locale/endian.h @@ -0,0 +1,52 @@ +/*- + * Copyright (c) 2016 Ruslan Bukin <br@bsdpad.com> + * All rights reserved. + * + * Portions of this software were developed by SRI International and the + * University of Cambridge Computer Laboratory under DARPA/AFRL contract + * FA8750-10-C-0237 ("CTSRD"), as part of the DARPA CRASH research programme. + * + * Portions of this software were developed by the University of Cambridge + * Computer Laboratory as part of the CTSRD Project, with support from the + * UK Higher Education Innovation Fund (HEIF). + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#include <sys/endian.h> + +/* + * We assume locale files were generated on EL machine + * (e.g. during cross build on amd64 host), but used on EB + * machine (e.g. MIPS64EB), so convert it to host endianness. + * + * TODO: detect host endianness on the build machine and use + * correct macros here. + */ + +#if BYTE_ORDER == BIG_ENDIAN && defined(__mips__) +#define BSWAP(x) le32toh(x) +#else +#define BSWAP(x) x +#endif diff --git a/lib/libc/locale/euc.5 b/lib/libc/locale/euc.5 new file mode 100644 index 0000000000000..719ba2bb1ae91 --- /dev/null +++ b/lib/libc/locale/euc.5 @@ -0,0 +1,134 @@ +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" Paul Borman at Krystal Technologies. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)euc.4 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd November 8, 2003 +.Dt EUC 5 +.Os +.Sh NAME +.Nm euc +.Nd EUC encoding of wide characters +.Sh SYNOPSIS +.Nm ENCODING +.Qq EUC +.Pp +.Nm VARIABLE +.Ar len1 +.Ar mask1 +.Ar len2 +.Ar mask2 +.Ar len3 +.Ar mask3 +.Ar len4 +.Ar mask4 +.Ar mask +.Sh DESCRIPTION +.\"The +.\".Nm EUC +.\"encoding is provided for compatibility with +.\".Ux +.\"based systems. +.\"See +.\".Xr mklocale 1 +.\"for a complete description of the +.\".Ev LC_CTYPE +.\"source file format. +.\".Pp +.Nm EUC +implements a system of 4 multibyte codesets. +A multibyte character in the first codeset consists of +.Ar len1 +bytes starting with a byte in the range of 0x00 to 0x7f. +To allow use of +.Tn ASCII , +.Ar len1 +is always 1. +A multibyte character in the second codeset consists of +.Ar len2 +bytes starting with a byte in the range of 0x80-0xff excluding 0x8e and 0x8f. +A multibyte character in the third codeset consists of +.Ar len3 +bytes starting with the byte 0x8e. +A multibyte character in the fourth codeset consists of +.Ar len4 +bytes starting with the byte 0x8f. +.Pp +The +.Vt wchar_t +encoding of +.Nm EUC +multibyte characters is dependent on the +.Ar len +and +.Ar mask +arguments. +First, the bytes are moved into a +.Vt wchar_t +as follows: +.Bd -literal +byte0 << ((\fIlen\fPN-1) * 8) | byte1 << ((\fIlen\fPN-2) * 8) | ... | byte\fIlen\fPN-1 +.Ed +.Pp +The result is then ANDed with +.Ar ~mask +and ORed with +.Ar maskN . +Codesets 2 and 3 are special in that the leading byte (0x8e or 0x8f) is +first removed and the +.Ar lenN +argument is reduced by 1. +.Pp +For example, the +.Li ja_JP.eucJP +locale has the following +.Va VARIABLE +line: +.Bd -literal +VARIABLE 1 0x0000 2 0x8080 2 0x0080 3 0x8000 0x8080 +.Ed +.Pp +Codeset 1 consists of the values 0x0000 - 0x007f. +.Pp +Codeset 2 consists of the values who have the bits 0x8080 set. +.Pp +Codeset 3 consists of the values 0x0080 - 0x00ff. +.Pp +Codeset 4 consists of the values 0x8000 - 0xff7f excluding the values +which have the 0x0080 bit set. +.Pp +Notice that the global +.Ar mask +is set to 0x8080, this implies that from those 2 bits the codeset can +be determined. +.Sh SEE ALSO +.Xr mklocale 1 , +.Xr setlocale 3 diff --git a/lib/libc/locale/euc.c b/lib/libc/locale/euc.c new file mode 100644 index 0000000000000..55055965ab4e1 --- /dev/null +++ b/lib/libc/locale/euc.c @@ -0,0 +1,457 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright 2013 Garrett D'Amore <garrett@damore.org> + * Copyright 2011 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "@(#)euc.c 8.1 (Berkeley) 6/4/93"; +#endif /* LIBC_SCCS and not lint */ +#include <sys/param.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <limits.h> +#include <runetype.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include "mblocal.h" + +extern int __mb_sb_limit; + +static size_t _EUC_mbrtowc_impl(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict, uint8_t, uint8_t, uint8_t, uint8_t); +static size_t _EUC_wcrtomb_impl(char * __restrict, wchar_t, + mbstate_t * __restrict, uint8_t, uint8_t, uint8_t, uint8_t); + +static size_t _EUC_CN_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); +static size_t _EUC_JP_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); +static size_t _EUC_KR_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); +static size_t _EUC_TW_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); + +static size_t _EUC_CN_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); +static size_t _EUC_JP_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); +static size_t _EUC_KR_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); +static size_t _EUC_TW_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); + +static size_t _EUC_CN_mbsnrtowcs(wchar_t * __restrict, + const char ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _EUC_JP_mbsnrtowcs(wchar_t * __restrict, + const char ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _EUC_KR_mbsnrtowcs(wchar_t * __restrict, + const char ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _EUC_TW_mbsnrtowcs(wchar_t * __restrict, + const char ** __restrict, size_t, size_t, + mbstate_t * __restrict); + +static size_t _EUC_CN_wcsnrtombs(char * __restrict, + const wchar_t ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _EUC_JP_wcsnrtombs(char * __restrict, + const wchar_t ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _EUC_KR_wcsnrtombs(char * __restrict, + const wchar_t ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _EUC_TW_wcsnrtombs(char * __restrict, + const wchar_t ** __restrict, size_t, size_t, + mbstate_t * __restrict); + +static int _EUC_mbsinit(const mbstate_t *); + +typedef struct { + wchar_t ch; + int set; + int want; +} _EucState; + +static int +_EUC_mbsinit(const mbstate_t *ps) +{ + + return (ps == NULL || ((const _EucState *)ps)->want == 0); +} + +/* + * EUC-CN uses CS0, CS1 and CS2 (4 bytes). + */ +int +_EUC_CN_init(struct xlocale_ctype *l, _RuneLocale *rl) +{ + l->__mbrtowc = _EUC_CN_mbrtowc; + l->__wcrtomb = _EUC_CN_wcrtomb; + l->__mbsnrtowcs = _EUC_CN_mbsnrtowcs; + l->__wcsnrtombs = _EUC_CN_wcsnrtombs; + l->__mbsinit = _EUC_mbsinit; + + l->runes = rl; + l->__mb_cur_max = 4; + l->__mb_sb_limit = 128; + return (0); +} + +static size_t +_EUC_CN_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, + size_t n, mbstate_t * __restrict ps) +{ + return (_EUC_mbrtowc_impl(pwc, s, n, ps, SS2, 4, 0, 0)); +} + +static size_t +_EUC_CN_mbsnrtowcs(wchar_t * __restrict dst, + const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps) +{ + return (__mbsnrtowcs_std(dst, src, nms, len, ps, _EUC_CN_mbrtowc)); +} + +static size_t +_EUC_CN_wcrtomb(char * __restrict s, wchar_t wc, + mbstate_t * __restrict ps) +{ + return (_EUC_wcrtomb_impl(s, wc, ps, SS2, 4, 0, 0)); +} + +static size_t +_EUC_CN_wcsnrtombs(char * __restrict dst, const wchar_t ** __restrict src, + size_t nwc, size_t len, mbstate_t * __restrict ps) +{ + return (__wcsnrtombs_std(dst, src, nwc, len, ps, _EUC_CN_wcrtomb)); +} + +/* + * EUC-KR uses only CS0 and CS1. + */ +int +_EUC_KR_init(struct xlocale_ctype *l, _RuneLocale *rl) +{ + l->__mbrtowc = _EUC_KR_mbrtowc; + l->__wcrtomb = _EUC_KR_wcrtomb; + l->__mbsnrtowcs = _EUC_KR_mbsnrtowcs; + l->__wcsnrtombs = _EUC_KR_wcsnrtombs; + l->__mbsinit = _EUC_mbsinit; + + l->runes = rl; + l->__mb_cur_max = 2; + l->__mb_sb_limit = 128; + return (0); +} + +static size_t +_EUC_KR_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, + size_t n, mbstate_t * __restrict ps) +{ + return (_EUC_mbrtowc_impl(pwc, s, n, ps, 0, 0, 0, 0)); +} + +static size_t +_EUC_KR_mbsnrtowcs(wchar_t * __restrict dst, + const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps) +{ + return (__mbsnrtowcs_std(dst, src, nms, len, ps, _EUC_KR_mbrtowc)); +} + +static size_t +_EUC_KR_wcrtomb(char * __restrict s, wchar_t wc, + mbstate_t * __restrict ps) +{ + return (_EUC_wcrtomb_impl(s, wc, ps, 0, 0, 0, 0)); +} + +static size_t +_EUC_KR_wcsnrtombs(char * __restrict dst, const wchar_t ** __restrict src, + size_t nwc, size_t len, mbstate_t * __restrict ps) +{ + return (__wcsnrtombs_std(dst, src, nwc, len, ps, _EUC_KR_wcrtomb)); +} + +/* + * EUC-JP uses CS0, CS1, CS2, and CS3. + */ +int +_EUC_JP_init(struct xlocale_ctype *l, _RuneLocale *rl) +{ + l->__mbrtowc = _EUC_JP_mbrtowc; + l->__wcrtomb = _EUC_JP_wcrtomb; + l->__mbsnrtowcs = _EUC_JP_mbsnrtowcs; + l->__wcsnrtombs = _EUC_JP_wcsnrtombs; + l->__mbsinit = _EUC_mbsinit; + + l->runes = rl; + l->__mb_cur_max = 3; + l->__mb_sb_limit = 128; + return (0); +} + +static size_t +_EUC_JP_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, + size_t n, mbstate_t * __restrict ps) +{ + return (_EUC_mbrtowc_impl(pwc, s, n, ps, SS2, 2, SS3, 3)); +} + +static size_t +_EUC_JP_mbsnrtowcs(wchar_t * __restrict dst, + const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps) +{ + return (__mbsnrtowcs_std(dst, src, nms, len, ps, _EUC_JP_mbrtowc)); +} + +static size_t +_EUC_JP_wcrtomb(char * __restrict s, wchar_t wc, + mbstate_t * __restrict ps) +{ + return (_EUC_wcrtomb_impl(s, wc, ps, SS2, 2, SS3, 3)); +} + +static size_t +_EUC_JP_wcsnrtombs(char * __restrict dst, const wchar_t ** __restrict src, + size_t nwc, size_t len, mbstate_t * __restrict ps) +{ + return (__wcsnrtombs_std(dst, src, nwc, len, ps, _EUC_JP_wcrtomb)); +} + +/* + * EUC-TW uses CS0, CS1, and CS2. + */ +int +_EUC_TW_init(struct xlocale_ctype *l, _RuneLocale *rl) +{ + l->__mbrtowc = _EUC_TW_mbrtowc; + l->__wcrtomb = _EUC_TW_wcrtomb; + l->__mbsnrtowcs = _EUC_TW_mbsnrtowcs; + l->__wcsnrtombs = _EUC_TW_wcsnrtombs; + l->__mbsinit = _EUC_mbsinit; + + l->runes = rl; + l->__mb_cur_max = 4; + l->__mb_sb_limit = 128; + return (0); +} + +static size_t +_EUC_TW_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, + size_t n, mbstate_t * __restrict ps) +{ + return (_EUC_mbrtowc_impl(pwc, s, n, ps, SS2, 4, 0, 0)); +} + +static size_t +_EUC_TW_mbsnrtowcs(wchar_t * __restrict dst, + const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps) +{ + return (__mbsnrtowcs_std(dst, src, nms, len, ps, _EUC_TW_mbrtowc)); +} + +static size_t +_EUC_TW_wcrtomb(char * __restrict s, wchar_t wc, + mbstate_t * __restrict ps) +{ + return (_EUC_wcrtomb_impl(s, wc, ps, SS2, 4, 0, 0)); +} + +static size_t +_EUC_TW_wcsnrtombs(char * __restrict dst, const wchar_t ** __restrict src, + size_t nwc, size_t len, mbstate_t * __restrict ps) +{ + return (__wcsnrtombs_std(dst, src, nwc, len, ps, _EUC_TW_wcrtomb)); +} + +/* + * Common EUC code. + */ + +static size_t +_EUC_mbrtowc_impl(wchar_t * __restrict pwc, const char * __restrict s, + size_t n, mbstate_t * __restrict ps, + uint8_t cs2, uint8_t cs2width, uint8_t cs3, uint8_t cs3width) +{ + _EucState *es; + int i, want; + wchar_t wc = 0; + unsigned char ch, chs; + + es = (_EucState *)ps; + + if (es->want < 0 || es->want > MB_CUR_MAX) { + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) { + s = ""; + n = 1; + pwc = NULL; + } + + if (n == 0) + /* Incomplete multibyte sequence */ + return ((size_t)-2); + + if (es->want == 0) { + /* Fast path for plain ASCII (CS0) */ + if (((ch = (unsigned char)*s) & 0x80) == 0) { + if (pwc != NULL) + *pwc = ch; + return (ch != '\0' ? 1 : 0); + } + + if (ch >= 0xa1) { + /* CS1 */ + want = 2; + } else if (ch == cs2) { + want = cs2width; + } else if (ch == cs3) { + want = cs3width; + } else { + errno = EILSEQ; + return ((size_t)-1); + } + + + es->want = want; + es->ch = 0; + } else { + want = es->want; + wc = es->ch; + } + + for (i = 0; i < MIN(want, n); i++) { + wc <<= 8; + chs = *s; + wc |= chs; + s++; + } + if (i < want) { + /* Incomplete multibyte sequence */ + es->want = want - i; + es->ch = wc; + errno = EILSEQ; + return ((size_t)-2); + } + if (pwc != NULL) + *pwc = wc; + es->want = 0; + return (wc == L'\0' ? 0 : want); +} + +static size_t +_EUC_wcrtomb_impl(char * __restrict s, wchar_t wc, + mbstate_t * __restrict ps, + uint8_t cs2, uint8_t cs2width, uint8_t cs3, uint8_t cs3width) +{ + _EucState *es; + int i, len; + wchar_t nm; + + es = (_EucState *)ps; + + if (es->want != 0) { + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) + /* Reset to initial shift state (no-op) */ + return (1); + + if ((wc & ~0x7f) == 0) { + /* Fast path for plain ASCII (CS0) */ + *s = (char)wc; + return (1); + } + + /* Determine the "length" */ + if ((unsigned)wc > 0xffffff) { + len = 4; + } else if ((unsigned)wc > 0xffff) { + len = 3; + } else if ((unsigned)wc > 0xff) { + len = 2; + } else { + len = 1; + } + + if (len > MB_CUR_MAX) { + errno = EILSEQ; + return ((size_t)-1); + } + + /* This first check excludes CS1, which is implicitly valid. */ + if ((wc < 0xa100) || (wc > 0xffff)) { + /* Check for valid CS2 or CS3 */ + nm = (wc >> ((len - 1) * 8)); + if (nm == cs2) { + if (len != cs2width) { + errno = EILSEQ; + return ((size_t)-1); + } + } else if (nm == cs3) { + if (len != cs3width) { + errno = EILSEQ; + return ((size_t)-1); + } + } else { + errno = EILSEQ; + return ((size_t)-1); + } + } + + /* Stash the bytes, least significant last */ + for (i = len - 1; i >= 0; i--) { + s[i] = (wc & 0xff); + wc >>= 8; + } + return (len); +} diff --git a/lib/libc/locale/fix_grouping.c b/lib/libc/locale/fix_grouping.c new file mode 100644 index 0000000000000..d6e74eb80b70e --- /dev/null +++ b/lib/libc/locale/fix_grouping.c @@ -0,0 +1,88 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2001 Alexey Zelkin <phantom@FreeBSD.org> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <ctype.h> +#include <limits.h> +#include <stddef.h> + +static const char nogrouping[] = { CHAR_MAX, '\0' }; + +/* + * Internal helper used to convert grouping sequences from string + * representation into POSIX specified form, i.e. + * + * "3;3;-1" -> "\003\003\177\000" + */ + +const char * +__fix_locale_grouping_str(const char *str) +{ + char *src, *dst; + char n; + + if (str == NULL || *str == '\0') { + return nogrouping; + } + + for (src = (char*)str, dst = (char*)str; *src != '\0'; src++) { + + /* input string examples: "3;3", "3;2;-1" */ + if (*src == ';') + continue; + + if (*src == '-' && *(src+1) == '1') { + *dst++ = CHAR_MAX; + src++; + continue; + } + + if (!isdigit((unsigned char)*src)) { + /* broken grouping string */ + return nogrouping; + } + + /* assume all numbers <= 99 */ + n = *src - '0'; + if (isdigit((unsigned char)*(src+1))) { + src++; + n *= 10; + n += *src - '0'; + } + + *dst = n; + /* NOTE: assume all input started with "0" as 'no grouping' */ + if (*dst == '\0') + return (dst == (char*)str) ? nogrouping : str; + dst++; + } + *dst = '\0'; + return str; +} diff --git a/lib/libc/locale/freelocale.3 b/lib/libc/locale/freelocale.3 new file mode 100644 index 0000000000000..d3e2d4498226a --- /dev/null +++ b/lib/libc/locale/freelocale.3 @@ -0,0 +1,59 @@ +.\" Copyright (c) 2011 The FreeBSD Foundation +.\" All rights reserved. +.\" +.\" This documentation was written by David Chisnall under sponsorship from +.\" the FreeBSD Foundation. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.Dd July 26, 2016 +.Dt FREELOCALE 3 +.Os +.Sh NAME +.Nm freelocale +.Nd Frees a locale created with +.Xr duplocale 3 +or +.Xr newlocale 3 +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In locale.h +.Ft void +.Fn freelocale "locale_t locale" +.Sh DESCRIPTION +Frees a +.Fa locale_t . +This relinquishes any resources held exclusively by this locale. +Note that locales share reference-counted components, +so a call to this function is not guaranteed to free all of the components. +.Sh SEE ALSO +.Xr duplocale 3 , +.Xr localeconv 3 , +.Xr newlocale 3 , +.Xr querylocale 3 , +.Xr uselocale 3 , +.Xr xlocale 3 +.Sh STANDARDS +This function conforms to +.St -p1003.1-2008 . diff --git a/lib/libc/locale/gb18030.5 b/lib/libc/locale/gb18030.5 new file mode 100644 index 0000000000000..3a296c017844d --- /dev/null +++ b/lib/libc/locale/gb18030.5 @@ -0,0 +1,78 @@ +.\" Copyright (c) 2002, 2003 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd August 10, 2003 +.Dt GB18030 5 +.Os +.Sh NAME +.Nm gb18030 +.Nd "GB 18030 encoding method for Chinese text" +.Sh SYNOPSIS +.Nm ENCODING +.Qq GB18030 +.Sh DESCRIPTION +The +.Nm GB18030 +encoding implements GB 18030-2000, a PRC national standard for the encoding of +Chinese characters. +It is a superset of the older GB\ 2312-1980 and GBK encodings, +and incorporates Unicode's Unihan Extension A completely. +It also provides code space for all Unicode 3.0 code points. +.Pp +Multibyte characters in the +.Nm GB18030 +encoding can be one byte, two bytes, or +four bytes long. +There are a total of over 1.5 million code positions. +.Pp +.No GB\ 11383-1981 Pq Tn ASCII +characters are represented by single bytes in the range 0x00 to 0x7F. +.Pp +Chinese characters are represented as either two bytes or four bytes. +Characters that are represented by two bytes begin with a byte in the range +0x81-0xFE and end with a byte either in the range 0x40-0x7E or 0x80-0xFE. +.Pp +Characters that are represented by four bytes begin with a byte in the range +0x81-0xFE, have a second byte in the range 0x30-0x39, a third byte in the range +0x81-0xFE and a fourth byte in the range 0x30-0x39. +.Sh SEE ALSO +.Xr euc 5 , +.Xr gb2312 5 , +.Xr gbk 5 , +.Xr utf8 5 +.Rs +.%T "Chinese National Standard GB 18030-2000: Information Technology -- Chinese ideograms coded character set for information interchange -- Extension for the basic set" +.%D "March 2000" +.Re +.Rs +.%Q "The Unicode Consortium" +.%T "The Unicode Standard, Version 3.0" +.%D "2000" +.Re +.Sh STANDARDS +The +.Nm GB18030 +encoding is believed to be compatible with GB 18030-2000. diff --git a/lib/libc/locale/gb18030.c b/lib/libc/locale/gb18030.c new file mode 100644 index 0000000000000..b4f80678d8401 --- /dev/null +++ b/lib/libc/locale/gb18030.c @@ -0,0 +1,254 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright 2013 Garrett D'Amore <garrett@damore.org> + * Copyright 2010 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 2002-2004 Tim J. Robbins + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* + * PRC National Standard GB 18030-2000 encoding of Chinese text. + * + * See gb18030(5) for details. + */ + +#include <sys/param.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <runetype.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include "mblocal.h" + +static size_t _GB18030_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); +static int _GB18030_mbsinit(const mbstate_t *); +static size_t _GB18030_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); +static size_t _GB18030_mbsnrtowcs(wchar_t * __restrict, + const char ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _GB18030_wcsnrtombs(char * __restrict, + const wchar_t ** __restrict, size_t, size_t, + mbstate_t * __restrict); + + +typedef struct { + int count; + u_char bytes[4]; +} _GB18030State; + +int +_GB18030_init(struct xlocale_ctype *l, _RuneLocale *rl) +{ + + l->__mbrtowc = _GB18030_mbrtowc; + l->__wcrtomb = _GB18030_wcrtomb; + l->__mbsinit = _GB18030_mbsinit; + l->__mbsnrtowcs = _GB18030_mbsnrtowcs; + l->__wcsnrtombs = _GB18030_wcsnrtombs; + l->runes = rl; + l->__mb_cur_max = 4; + l->__mb_sb_limit = 128; + + return (0); +} + +static int +_GB18030_mbsinit(const mbstate_t *ps) +{ + + return (ps == NULL || ((const _GB18030State *)ps)->count == 0); +} + +static size_t +_GB18030_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, + size_t n, mbstate_t * __restrict ps) +{ + _GB18030State *gs; + wchar_t wch; + int ch, len, ocount; + size_t ncopy; + + gs = (_GB18030State *)ps; + + if (gs->count < 0 || gs->count > sizeof(gs->bytes)) { + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) { + s = ""; + n = 1; + pwc = NULL; + } + + ncopy = MIN(MIN(n, MB_CUR_MAX), sizeof(gs->bytes) - gs->count); + memcpy(gs->bytes + gs->count, s, ncopy); + ocount = gs->count; + gs->count += ncopy; + s = (char *)gs->bytes; + n = gs->count; + + if (n == 0) + /* Incomplete multibyte sequence */ + return ((size_t)-2); + + /* + * Single byte: [00-7f] + * Two byte: [81-fe][40-7e,80-fe] + * Four byte: [81-fe][30-39][81-fe][30-39] + */ + ch = (unsigned char)*s++; + if (ch <= 0x7f) { + len = 1; + wch = ch; + } else if (ch >= 0x81 && ch <= 0xfe) { + wch = ch; + if (n < 2) + return ((size_t)-2); + ch = (unsigned char)*s++; + if ((ch >= 0x40 && ch <= 0x7e) || (ch >= 0x80 && ch <= 0xfe)) { + wch = (wch << 8) | ch; + len = 2; + } else if (ch >= 0x30 && ch <= 0x39) { + /* + * Strip high bit off the wide character we will + * eventually output so that it is positive when + * cast to wint_t on 32-bit twos-complement machines. + */ + wch = ((wch & 0x7f) << 8) | ch; + if (n < 3) + return ((size_t)-2); + ch = (unsigned char)*s++; + if (ch < 0x81 || ch > 0xfe) + goto ilseq; + wch = (wch << 8) | ch; + if (n < 4) + return ((size_t)-2); + ch = (unsigned char)*s++; + if (ch < 0x30 || ch > 0x39) + goto ilseq; + wch = (wch << 8) | ch; + len = 4; + } else + goto ilseq; + } else + goto ilseq; + + if (pwc != NULL) + *pwc = wch; + gs->count = 0; + return (wch == L'\0' ? 0 : len - ocount); +ilseq: + errno = EILSEQ; + return ((size_t)-1); +} + +static size_t +_GB18030_wcrtomb(char * __restrict s, wchar_t wc, mbstate_t * __restrict ps) +{ + _GB18030State *gs; + size_t len; + int c; + + gs = (_GB18030State *)ps; + + if (gs->count != 0) { + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) + /* Reset to initial shift state (no-op) */ + return (1); + if ((wc & ~0x7fffffff) != 0) + goto ilseq; + if (wc & 0x7f000000) { + /* Replace high bit that mbrtowc() removed. */ + wc |= 0x80000000; + c = (wc >> 24) & 0xff; + if (c < 0x81 || c > 0xfe) + goto ilseq; + *s++ = c; + c = (wc >> 16) & 0xff; + if (c < 0x30 || c > 0x39) + goto ilseq; + *s++ = c; + c = (wc >> 8) & 0xff; + if (c < 0x81 || c > 0xfe) + goto ilseq; + *s++ = c; + c = wc & 0xff; + if (c < 0x30 || c > 0x39) + goto ilseq; + *s++ = c; + len = 4; + } else if (wc & 0x00ff0000) + goto ilseq; + else if (wc & 0x0000ff00) { + c = (wc >> 8) & 0xff; + if (c < 0x81 || c > 0xfe) + goto ilseq; + *s++ = c; + c = wc & 0xff; + if (c < 0x40 || c == 0x7f || c == 0xff) + goto ilseq; + *s++ = c; + len = 2; + } else if (wc <= 0x7f) { + *s++ = wc; + len = 1; + } else + goto ilseq; + + return (len); +ilseq: + errno = EILSEQ; + return ((size_t)-1); +} + +static size_t +_GB18030_mbsnrtowcs(wchar_t * __restrict dst, + const char ** __restrict src, size_t nms, size_t len, + mbstate_t * __restrict ps) +{ + return (__mbsnrtowcs_std(dst, src, nms, len, ps, _GB18030_mbrtowc)); +} + +static size_t +_GB18030_wcsnrtombs(char * __restrict dst, + const wchar_t ** __restrict src, size_t nwc, size_t len, + mbstate_t * __restrict ps) +{ + return (__wcsnrtombs_std(dst, src, nwc, len, ps, _GB18030_wcrtomb)); +} diff --git a/lib/libc/locale/gb2312.5 b/lib/libc/locale/gb2312.5 new file mode 100644 index 0000000000000..5f1f712097d54 --- /dev/null +++ b/lib/libc/locale/gb2312.5 @@ -0,0 +1,57 @@ +.\" Copyright (c) 2003 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd November 7, 2003 +.Dt GB2312 5 +.Os +.Sh NAME +.Nm gb2312 +.Nd "GB2312 encoding method for Chinese text" +.Sh SYNOPSIS +.Nm ENCODING +.Qq GB2312 +.Sh DESCRIPTION +The +.Nm GB2312 +encoding implements GB\ 2312-1980, a PRC national standard +for the encoding of simplified Chinese characters. +.Pp +Multibyte characters in the GB2312 +encoding can be one byte or two bytes long. +.No GB\ 11383-1981 Pq Tn ASCII +characters are represented by single bytes in the range 0x00 to 0x7F. +Simplified Chinese characters are represented by two bytes, both in +the range 0xA1-0xFE. +.Sh SEE ALSO +.Xr euc 5 , +.Xr gb18030 5 , +.Xr gbk 5 +.Sh STANDARDS +The +.Nm GB2312 +encoding is believed to be compatible with GB\ 2312-1980. +This standard has been superseded by GB\ 18030-2000, but is still +in wide use. diff --git a/lib/libc/locale/gb2312.c b/lib/libc/locale/gb2312.c new file mode 100644 index 0000000000000..07a6c4764bb38 --- /dev/null +++ b/lib/libc/locale/gb2312.c @@ -0,0 +1,189 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright 2013 Garrett D'Amore <garrett@damore.org> + * Copyright 2010 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 2004 Tim J. Robbins. All rights reserved. + * Copyright (c) 2003 David Xu <davidxu@freebsd.org> + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/param.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <runetype.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include "mblocal.h" + +static size_t _GB2312_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); +static int _GB2312_mbsinit(const mbstate_t *); +static size_t _GB2312_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); +static size_t _GB2312_mbsnrtowcs(wchar_t * __restrict, + const char ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _GB2312_wcsnrtombs(char * __restrict, + const wchar_t ** __restrict, size_t, size_t, + mbstate_t * __restrict); + + +typedef struct { + int count; + u_char bytes[2]; +} _GB2312State; + +int +_GB2312_init(struct xlocale_ctype *l, _RuneLocale *rl) +{ + + l->runes = rl; + l->__mbrtowc = _GB2312_mbrtowc; + l->__wcrtomb = _GB2312_wcrtomb; + l->__mbsinit = _GB2312_mbsinit; + l->__mbsnrtowcs = _GB2312_mbsnrtowcs; + l->__wcsnrtombs = _GB2312_wcsnrtombs; + l->__mb_cur_max = 2; + l->__mb_sb_limit = 128; + return (0); +} + +static int +_GB2312_mbsinit(const mbstate_t *ps) +{ + + return (ps == NULL || ((const _GB2312State *)ps)->count == 0); +} + +static int +_GB2312_check(const char *str, size_t n) +{ + const u_char *s = (const u_char *)str; + + if (n == 0) + /* Incomplete multibyte sequence */ + return (-2); + if (s[0] >= 0xa1 && s[0] <= 0xfe) { + if (n < 2) + /* Incomplete multibyte sequence */ + return (-2); + if (s[1] < 0xa1 || s[1] > 0xfe) + /* Invalid multibyte sequence */ + return (-1); + return (2); + } else if (s[0] & 0x80) { + /* Invalid multibyte sequence */ + return (-1); + } + return (1); +} + +static size_t +_GB2312_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, size_t n, + mbstate_t * __restrict ps) +{ + _GB2312State *gs; + wchar_t wc; + int i, len, ocount; + size_t ncopy; + + gs = (_GB2312State *)ps; + + if (gs->count < 0 || gs->count > sizeof(gs->bytes)) { + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) { + s = ""; + n = 1; + pwc = NULL; + } + + ncopy = MIN(MIN(n, MB_CUR_MAX), sizeof(gs->bytes) - gs->count); + memcpy(gs->bytes + gs->count, s, ncopy); + ocount = gs->count; + gs->count += ncopy; + s = (char *)gs->bytes; + n = gs->count; + + if ((len = _GB2312_check(s, n)) < 0) + return ((size_t)len); + wc = 0; + i = len; + while (i-- > 0) + wc = (wc << 8) | (unsigned char)*s++; + if (pwc != NULL) + *pwc = wc; + gs->count = 0; + return (wc == L'\0' ? 0 : len - ocount); +} + +static size_t +_GB2312_wcrtomb(char * __restrict s, wchar_t wc, mbstate_t * __restrict ps) +{ + _GB2312State *gs; + + gs = (_GB2312State *)ps; + + if (gs->count != 0) { + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) + /* Reset to initial shift state (no-op) */ + return (1); + if (wc & 0x8000) { + *s++ = (wc >> 8) & 0xff; + *s = wc & 0xff; + return (2); + } + *s = wc & 0xff; + return (1); +} + +static size_t +_GB2312_mbsnrtowcs(wchar_t * __restrict dst, + const char ** __restrict src, size_t nms, size_t len, + mbstate_t * __restrict ps) +{ + return (__mbsnrtowcs_std(dst, src, nms, len, ps, _GB2312_mbrtowc)); +} + +static size_t +_GB2312_wcsnrtombs(char * __restrict dst, + const wchar_t ** __restrict src, size_t nwc, size_t len, + mbstate_t * __restrict ps) +{ + return (__wcsnrtombs_std(dst, src, nwc, len, ps, _GB2312_wcrtomb)); +} diff --git a/lib/libc/locale/gbk.5 b/lib/libc/locale/gbk.5 new file mode 100644 index 0000000000000..cec22c6395408 --- /dev/null +++ b/lib/libc/locale/gbk.5 @@ -0,0 +1,63 @@ +.\" Copyright (c) 2003 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd August 10, 2003 +.Dt GBK 5 +.Os +.Sh NAME +.Nm gbk +.Nd "Guojia biaozhun kuozhan (GBK) encoding method for Chinese text" +.Sh SYNOPSIS +.Nm ENCODING +.Qq GBK +.Sh DESCRIPTION +GBK is a backwards-compatible extension of the GB\ 2312-1980 encoding +method for Chinese text, which adds the characters defined in the +Unified Han portion of the Unicode 2.1 standard. +.Pp +Multibyte characters in the GBK +encoding can be one byte or two bytes long. +.No GB\ 11383-1981 Pq Tn ASCII +characters are represented by single bytes in the range 0x00 to 0x7F. +Chinese characters are represented by two bytes, beginning with a byte in +the range 0x80-0xFE and ending with a byte in the range 0x40-0xFE. +.Sh SEE ALSO +.Xr euc 5 , +.Xr gb18030 5 , +.Xr gb2312 5 , +.Xr utf8 5 +.Rs +.%Q "The Unicode Consortium" +.%T "The Unicode Standard, Version 2.1" +.%D "1999" +.Re +.Rs +.%T "Chinese National Standard GB 18030-2000: Information Technology -- Chinese ideograms coded character set for information interchange -- Extension for the basic set" +.%D "March 2000" +.Re +.Sh STANDARDS +GBK is not a standard, but has been superseded by +GB\ 18030-2000. diff --git a/lib/libc/locale/gbk.c b/lib/libc/locale/gbk.c new file mode 100644 index 0000000000000..78ff623be3799 --- /dev/null +++ b/lib/libc/locale/gbk.c @@ -0,0 +1,199 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright 2013 Garrett D'Amore <garrett@damore.org> + * Copyright 2010 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/types.h> +#include <errno.h> +#include <runetype.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include "mblocal.h" + +extern int __mb_sb_limit; + +static size_t _GBK_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); +static int _GBK_mbsinit(const mbstate_t *); +static size_t _GBK_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); +static size_t _GBK_mbsnrtowcs(wchar_t * __restrict, + const char ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _GBK_wcsnrtombs(char * __restrict, + const wchar_t ** __restrict, size_t, size_t, + mbstate_t * __restrict); + +typedef struct { + wchar_t ch; +} _GBKState; + +int +_GBK_init(struct xlocale_ctype *l, _RuneLocale *rl) +{ + + l->__mbrtowc = _GBK_mbrtowc; + l->__wcrtomb = _GBK_wcrtomb; + l->__mbsinit = _GBK_mbsinit; + l->__mbsnrtowcs = _GBK_mbsnrtowcs; + l->__wcsnrtombs = _GBK_wcsnrtombs; + l->runes = rl; + l->__mb_cur_max = 2; + l->__mb_sb_limit = 128; + return (0); +} + +static int +_GBK_mbsinit(const mbstate_t *ps) +{ + + return (ps == NULL || ((const _GBKState *)ps)->ch == 0); +} + +static int +_gbk_check(u_int c) +{ + + c &= 0xff; + return ((c >= 0x81 && c <= 0xfe) ? 2 : 1); +} + +static size_t +_GBK_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, size_t n, + mbstate_t * __restrict ps) +{ + _GBKState *gs; + wchar_t wc; + size_t len; + + gs = (_GBKState *)ps; + + if ((gs->ch & ~0xFF) != 0) { + /* Bad conversion state. */ + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) { + s = ""; + n = 1; + pwc = NULL; + } + + if (n == 0) + /* Incomplete multibyte sequence */ + return ((size_t)-2); + + if (gs->ch != 0) { + if (*s == '\0') { + errno = EILSEQ; + return ((size_t)-1); + } + wc = (gs->ch << 8) | (*s & 0xFF); + if (pwc != NULL) + *pwc = wc; + gs->ch = 0; + return (1); + } + + len = (size_t)_gbk_check(*s); + wc = *s++ & 0xff; + if (len == 2) { + if (n < 2) { + /* Incomplete multibyte sequence */ + gs->ch = wc; + return ((size_t)-2); + } + if (*s == '\0') { + errno = EILSEQ; + return ((size_t)-1); + } + wc = (wc << 8) | (*s++ & 0xff); + if (pwc != NULL) + *pwc = wc; + return (2); + } else { + if (pwc != NULL) + *pwc = wc; + return (wc == L'\0' ? 0 : 1); + } +} + +static size_t +_GBK_wcrtomb(char * __restrict s, wchar_t wc, mbstate_t * __restrict ps) +{ + _GBKState *gs; + + gs = (_GBKState *)ps; + + if (gs->ch != 0) { + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) + /* Reset to initial shift state (no-op) */ + return (1); + if (wc & 0x8000) { + *s++ = (wc >> 8) & 0xff; + *s = wc & 0xff; + return (2); + } + *s = wc & 0xff; + return (1); +} + +static size_t +_GBK_mbsnrtowcs(wchar_t * __restrict dst, const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps) +{ + return (__mbsnrtowcs_std(dst, src, nms, len, ps, _GBK_mbrtowc)); +} + +static size_t +_GBK_wcsnrtombs(char * __restrict dst, const wchar_t ** __restrict src, + size_t nwc, size_t len, mbstate_t * __restrict ps) +{ + return (__wcsnrtombs_std(dst, src, nwc, len, ps, _GBK_wcrtomb)); +} diff --git a/lib/libc/locale/isalnum.3 b/lib/libc/locale/isalnum.3 new file mode 100644 index 0000000000000..5dcc4cc1f12b7 --- /dev/null +++ b/lib/libc/locale/isalnum.3 @@ -0,0 +1,115 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)isalnum.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 17, 2005 +.Dt ISALNUM 3 +.Os +.Sh NAME +.Nm isalnum +.Nd alphanumeric character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isalnum "int c" +.Ft int +.Fn isalnum_l "int c" "locale_t loc" +.Sh DESCRIPTION +The +.Fn isalnum +function tests for any character for which +.Xr isalpha 3 +or +.Xr isdigit 3 +is true. +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Pp +In the ASCII character set, this includes the following characters +(with their numeric values shown in octal): +.Bl -column \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ +.It "\&060\ ``0''" Ta "061\ ``1''" Ta "062\ ``2''" Ta "063\ ``3''" Ta "064\ ``4''" +.It "\&065\ ``5''" Ta "066\ ``6''" Ta "067\ ``7''" Ta "070\ ``8''" Ta "071\ ``9''" +.It "\&101\ ``A''" Ta "102\ ``B''" Ta "103\ ``C''" Ta "104\ ``D''" Ta "105\ ``E''" +.It "\&106\ ``F''" Ta "107\ ``G''" Ta "110\ ``H''" Ta "111\ ``I''" Ta "112\ ``J''" +.It "\&113\ ``K''" Ta "114\ ``L''" Ta "115\ ``M''" Ta "116\ ``N''" Ta "117\ ``O''" +.It "\&120\ ``P''" Ta "121\ ``Q''" Ta "122\ ``R''" Ta "123\ ``S''" Ta "124\ ``T''" +.It "\&125\ ``U''" Ta "126\ ``V''" Ta "127\ ``W''" Ta "130\ ``X''" Ta "131\ ``Y''" +.It "\&132\ ``Z''" Ta "141\ ``a''" Ta "142\ ``b''" Ta "143\ ``c''" Ta "144\ ``d''" +.It "\&145\ ``e''" Ta "146\ ``f''" Ta "147\ ``g''" Ta "150\ ``h''" Ta "151\ ``i''" +.It "\&152\ ``j''" Ta "153\ ``k''" Ta "154\ ``l''" Ta "155\ ``m''" Ta "156\ ``n''" +.It "\&157\ ``o''" Ta "160\ ``p''" Ta "161\ ``q''" Ta "162\ ``r''" Ta "163\ ``s''" +.It "\&164\ ``t''" Ta "165\ ``u''" Ta "166\ ``v''" Ta "167\ ``w''" Ta "170\ ``x''" +.It "\&171\ ``y''" Ta "172\ ``z''" Ta \& Ta \& Ta \& +.El +.Pp +The +.Fn isalnum_l +function takes an explicit locale argument, whereas the +.Fn isalnum +function uses the current global or per-thread locale. +.Sh RETURN VALUES +The +.Fn isalnum +function returns zero if the character tests false and +returns non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswalnum +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr isalpha 3 , +.Xr isdigit 3 , +.Xr iswalnum 3 , +.Xr xlocale 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn isalnum +function conforms to +.St -isoC . +The +.Fn isalnum_l +function conforms to +.St -p1003.1-2008 . diff --git a/lib/libc/locale/isalpha.3 b/lib/libc/locale/isalpha.3 new file mode 100644 index 0000000000000..1a7330f3a0d19 --- /dev/null +++ b/lib/libc/locale/isalpha.3 @@ -0,0 +1,113 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)isalpha.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 17, 2005 +.Dt ISALPHA 3 +.Os +.Sh NAME +.Nm isalpha +.Nd alphabetic character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isalpha "int c" +.Ft int +.Fn isalpha_l "int c" "locale_t loc" +.Sh DESCRIPTION +The +.Fn isalpha +function tests for any character for which +.Xr isupper 3 +or +.Xr islower 3 +is true. +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Pp +In the ASCII character set, this includes the following characters +(with their numeric values shown in octal): +.Bl -column \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ +.It "\&101\ ``A''" Ta "102\ ``B''" Ta "103\ ``C''" Ta "104\ ``D''" Ta "105\ ``E''" +.It "\&106\ ``F''" Ta "107\ ``G''" Ta "110\ ``H''" Ta "111\ ``I''" Ta "112\ ``J''" +.It "\&113\ ``K''" Ta "114\ ``L''" Ta "115\ ``M''" Ta "116\ ``N''" Ta "117\ ``O''" +.It "\&120\ ``P''" Ta "121\ ``Q''" Ta "122\ ``R''" Ta "123\ ``S''" Ta "124\ ``T''" +.It "\&125\ ``U''" Ta "126\ ``V''" Ta "127\ ``W''" Ta "130\ ``X''" Ta "131\ ``Y''" +.It "\&132\ ``Z''" Ta "141\ ``a''" Ta "142\ ``b''" Ta "143\ ``c''" Ta "144\ ``d''" +.It "\&145\ ``e''" Ta "146\ ``f''" Ta "147\ ``g''" Ta "150\ ``h''" Ta "151\ ``i''" +.It "\&152\ ``j''" Ta "153\ ``k''" Ta "154\ ``l''" Ta "155\ ``m''" Ta "156\ ``n''" +.It "\&157\ ``o''" Ta "160\ ``p''" Ta "161\ ``q''" Ta "162\ ``r''" Ta "163\ ``s''" +.It "\&164\ ``t''" Ta "165\ ``u''" Ta "166\ ``v''" Ta "167\ ``w''" Ta "170\ ``x''" +.It "\&171\ ``y''" Ta "172\ ``z''" Ta \& Ta \& Ta \& +.El +.Pp +The +.Fn isalpha_l +function takes an explicit locale argument, whereas the +.Fn isalpha +function uses the current global or per-thread locale. +.Sh RETURN VALUES +The +.Fn isalpha +function returns zero if the character tests false and +returns non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswalpha +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr islower 3 , +.Xr isupper 3 , +.Xr iswalpha 3 , +.Xr xlocale 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn isalpha +function conforms to +.St -isoC . +The +.Fn isalpha_l +function conforms to +.St -p1003.1-2008 . diff --git a/lib/libc/locale/isascii.3 b/lib/libc/locale/isascii.3 new file mode 100644 index 0000000000000..933293ca88954 --- /dev/null +++ b/lib/libc/locale/isascii.3 @@ -0,0 +1,53 @@ +.\" Copyright (c) 1989, 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)isascii.3 8.2 (Berkeley) 12/11/93 +.\" $FreeBSD$ +.\" +.Dd October 6, 2002 +.Dt ISASCII 3 +.Os +.Sh NAME +.Nm isascii +.Nd test for ASCII character +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isascii "int c" +.Sh DESCRIPTION +The +.Fn isascii +function tests for an +.Tn ASCII +character, which is any character +between 0 and octal 0177 inclusive. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswascii 3 , +.Xr ascii 7 diff --git a/lib/libc/locale/isblank.3 b/lib/libc/locale/isblank.3 new file mode 100644 index 0000000000000..b3e805ae0085f --- /dev/null +++ b/lib/libc/locale/isblank.3 @@ -0,0 +1,96 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)isblank.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 17, 2005 +.Dt ISBLANK 3 +.Os +.Sh NAME +.Nm isblank +.Nd space or tab character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isblank "int c" +.Ft int +.Fn isblank_l "int c" "locale_t loc" +.Sh DESCRIPTION +The +.Fn isblank +function tests for a space or tab character. +For any locale, this includes the following standard characters: +.Bl -column XXXX +.It Do \et Dc Ta Dq " " +.El +.Pp +In the "C" locale, a successful +.Fn isblank +test is limited to these characters only. +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Pp +The +.Fn isblank_l +function takes an explicit locale argument, whereas the +.Fn isblank +function uses the current global or per-thread locale. +.Sh RETURN VALUES +The +.Fn isblank +function returns zero if the character tests false and +returns non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswblank +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswblank 3 , +.Xr xlocale 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn isblank +function +conforms to +.St -isoC-99 . +The +.Fn isblank_l +function conforms to +.St -p1003.1-2008 . diff --git a/lib/libc/locale/iscntrl.3 b/lib/libc/locale/iscntrl.3 new file mode 100644 index 0000000000000..717e1e5682397 --- /dev/null +++ b/lib/libc/locale/iscntrl.3 @@ -0,0 +1,103 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)iscntrl.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 17, 2005 +.Dt ISCNTRL 3 +.Os +.Sh NAME +.Nm iscntrl +.Nd control character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn iscntrl "int c" +.Ft int +.Fn iscntrl_l "int c" "locale_t loc" +.Sh DESCRIPTION +The +.Fn iscntrl +function tests for any control character. +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Pp +In the ASCII character set, this includes the following characters +(with their numeric values shown in octal): +.Bl -column \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ +.It "\&000\ NUL" Ta "001\ SOH" Ta "002\ STX" Ta "003\ ETX" Ta "004\ EOT" +.It "\&005\ ENQ" Ta "006\ ACK" Ta "007\ BEL" Ta "010\ BS" Ta "011\ HT" +.It "\&012\ NL" Ta "013\ VT" Ta "014\ NP" Ta "015\ CR" Ta "016\ SO" +.It "\&017\ SI" Ta "020\ DLE" Ta "021\ DC1" Ta "022\ DC2" Ta "023\ DC3" +.It "\&024\ DC4" Ta "025\ NAK" Ta "026\ SYN" Ta "027\ ETB" Ta "030\ CAN" +.It "\&031\ EM" Ta "032\ SUB" Ta "033\ ESC" Ta "034\ FS" Ta "035\ GS" +.It "\&036\ RS" Ta "037\ US" Ta "177\ DEL" Ta \& Ta \& +.El +.Pp +The +.Fn iscntrl_l +function takes an explicit locale argument, whereas the +.Fn iscntrl +function uses the current global or per-thread locale. +.Sh RETURN VALUES +The +.Fn iscntrl +function returns zero if the character tests false and +returns non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswcntrl +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswcntrl 3 , +.Xr xlocale 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn iscntrl +function conforms to +.St -isoC . +The +.Fn iscntrl_l +function conforms to +.St -p1003.1-2008 . diff --git a/lib/libc/locale/isctype.c b/lib/libc/locale/isctype.c new file mode 100644 index 0000000000000..3731508ba15a8 --- /dev/null +++ b/lib/libc/locale/isctype.c @@ -0,0 +1,208 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1989, 1993 + * The Regents of the University of California. All rights reserved. + * (c) UNIX System Laboratories, Inc. + * All or some portions of this file are derived from material licensed + * to the University of California by American Telephone and Telegraph + * Co. or Unix System Laboratories, Inc. and are reproduced herein with + * the permission of UNIX System Laboratories, Inc. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "@(#)isctype.c 8.3 (Berkeley) 2/24/94"; +#endif /* LIBC_SCCS and not lint */ +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <ctype.h> + +#undef digittoint +int +digittoint(int c) +{ + return (__sbmaskrune(c, 0xFF)); +} + +#undef isalnum +int +isalnum(int c) +{ + return (__sbistype(c, _CTYPE_A|_CTYPE_N)); +} + +#undef isalpha +int +isalpha(int c) +{ + return (__sbistype(c, _CTYPE_A)); +} + +#undef isascii +int +isascii(int c) +{ + return ((c & ~0x7F) == 0); +} + +#undef isblank +int +isblank(int c) +{ + return (__sbistype(c, _CTYPE_B)); +} + +#undef iscntrl +int +iscntrl(int c) +{ + return (__sbistype(c, _CTYPE_C)); +} + +#undef isdigit +int +isdigit(int c) +{ + return (__isctype(c, _CTYPE_D)); +} + +#undef isgraph +int +isgraph(int c) +{ + return (__sbistype(c, _CTYPE_G)); +} + +#undef ishexnumber +int +ishexnumber(int c) +{ + return (__sbistype(c, _CTYPE_X)); +} + +#undef isideogram +int +isideogram(int c) +{ + return (__sbistype(c, _CTYPE_I)); +} + +#undef islower +int +islower(int c) +{ + return (__sbistype(c, _CTYPE_L)); +} + +#undef isnumber +int +isnumber(int c) +{ + return (__sbistype(c, _CTYPE_N)); +} + +#undef isphonogram +int +isphonogram(int c) +{ + return (__sbistype(c, _CTYPE_Q)); +} + +#undef isprint +int +isprint(int c) +{ + return (__sbistype(c, _CTYPE_R)); +} + +#undef ispunct +int +ispunct(int c) +{ + return (__sbistype(c, _CTYPE_P)); +} + +#undef isrune +int +isrune(int c) +{ + return (__sbistype(c, 0xFFFFFF00L)); +} + +#undef isspace +int +isspace(int c) +{ + return (__sbistype(c, _CTYPE_S)); +} + +#undef isspecial +int +isspecial(int c) +{ + return (__sbistype(c, _CTYPE_T)); +} + +#undef isupper +int +isupper(int c) +{ + return (__sbistype(c, _CTYPE_U)); +} + +#undef isxdigit +int +isxdigit(int c) +{ + return (__isctype(c, _CTYPE_X)); +} + +#undef toascii +int +toascii(int c) +{ + return (c & 0x7F); +} + +#undef tolower +int +tolower(int c) +{ + return (__sbtolower(c)); +} + +#undef toupper +int +toupper(int c) +{ + return (__sbtoupper(c)); +} + diff --git a/lib/libc/locale/isdigit.3 b/lib/libc/locale/isdigit.3 new file mode 100644 index 0000000000000..d1a75465e277b --- /dev/null +++ b/lib/libc/locale/isdigit.3 @@ -0,0 +1,114 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)isdigit.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd May 4, 2007 +.Dt ISDIGIT 3 +.Os +.Sh NAME +.Nm isdigit , +.Nm isnumber +.Nd decimal-digit character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isdigit "int c" +.Ft int +.Fn isnumber "int c" +.Ft int +.Fn isdigit_l "int c" "locale_t loc" +.Ft int +.Fn isnumber_l "int c" "locale_t loc" +.Sh DESCRIPTION +The +.Fn isdigit +function tests for a decimal digit character. +Regardless of locale, this includes the following characters only: +.Bl -column \&``0''______ \&``0''______ \&``0''______ \&``0''______ \&``0''______ +.It "\&``0''" Ta "``1''" Ta "``2''" Ta "``3''" Ta "``4''" +.It "\&``5''" Ta "``6''" Ta "``7''" Ta "``8''" Ta "``9''" +.El +.Pp +The +.Fn isnumber +function behaves similarly to +.Fn isdigit , +but may recognize additional characters, depending on the current locale +setting. +.Pp +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Pp +The _l-suffixed versions take an explicit locale argument, whereas the +non-suffixed versions use the current global or per-thread locale. +.Sh RETURN VALUES +The +.Fn isdigit +and +.Fn isnumber +functions return zero if the character tests false and +return non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswdigit +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswdigit 3 , +.Xr multibyte 3 , +.Xr xlocale 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn isdigit +function conforms to +.St -isoC . +The +.Fn isdigit_l +function conforms to +.St -p1003.1-2008 . +.Sh HISTORY +The +.Fn isnumber +function appeared in +.Bx 4.4 . diff --git a/lib/libc/locale/isgraph.3 b/lib/libc/locale/isgraph.3 new file mode 100644 index 0000000000000..b3c078cea0796 --- /dev/null +++ b/lib/libc/locale/isgraph.3 @@ -0,0 +1,119 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)isgraph.3 8.2 (Berkeley) 12/11/93 +.\" $FreeBSD$ +.\" +.Dd July 30, 2012 +.Dt ISGRAPH 3 +.Os +.Sh NAME +.Nm isgraph +.Nd printing character test (space character exclusive) +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isgraph "int c" +.Ft int +.Fn isgraph_l "int c" "locale_t loc" +.Sh DESCRIPTION +The +.Fn isgraph +function tests for any printing character except space +.Pq Ql "\~" +and other +locale-specific space-like characters. +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Pp +In the ASCII character set, this includes the following characters +(with their numeric values shown in octal): +.Bl -column \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ +.It "\&041\ ``!''" Ta "042\ ``""''" Ta "043\ ``#''" Ta "044\ ``$''" Ta "045\ ``%''" +.It "\&046\ ``&''" Ta "047\ ``'''" Ta "050\ ``(''" Ta "051\ ``)''" Ta "052\ ``*''" +.It "\&053\ ``+''" Ta "054\ ``,''" Ta "055\ ``-''" Ta "056\ ``.''" Ta "057\ ``/''" +.It "\&060\ ``0''" Ta "061\ ``1''" Ta "062\ ``2''" Ta "063\ ``3''" Ta "064\ ``4''" +.It "\&065\ ``5''" Ta "066\ ``6''" Ta "067\ ``7''" Ta "070\ ``8''" Ta "071\ ``9''" +.It "\&072\ ``:''" Ta "073\ ``;''" Ta "074\ ``<''" Ta "075\ ``=''" Ta "076\ ``>''" +.It "\&077\ ``?''" Ta "100\ ``@''" Ta "101\ ``A''" Ta "102\ ``B''" Ta "103\ ``C''" +.It "\&104\ ``D''" Ta "105\ ``E''" Ta "106\ ``F''" Ta "107\ ``G''" Ta "110\ ``H''" +.It "\&111\ ``I''" Ta "112\ ``J''" Ta "113\ ``K''" Ta "114\ ``L''" Ta "115\ ``M''" +.It "\&116\ ``N''" Ta "117\ ``O''" Ta "120\ ``P''" Ta "121\ ``Q''" Ta "122\ ``R''" +.It "\&123\ ``S''" Ta "124\ ``T''" Ta "125\ ``U''" Ta "126\ ``V''" Ta "127\ ``W''" +.It "\&130\ ``X''" Ta "131\ ``Y''" Ta "132\ ``Z''" Ta "133\ ``[''" Ta "134\ ``\e\|''" +.It "\&135\ ``]''" Ta "136\ ``^''" Ta "137\ ``_''" Ta "140\ ```''" Ta "141\ ``a''" +.It "\&142\ ``b''" Ta "143\ ``c''" Ta "144\ ``d''" Ta "145\ ``e''" Ta "146\ ``f''" +.It "\&147\ ``g''" Ta "150\ ``h''" Ta "151\ ``i''" Ta "152\ ``j''" Ta "153\ ``k''" +.It "\&154\ ``l''" Ta "155\ ``m''" Ta "156\ ``n''" Ta "157\ ``o''" Ta "160\ ``p''" +.It "\&161\ ``q''" Ta "162\ ``r''" Ta "163\ ``s''" Ta "164\ ``t''" Ta "165\ ``u''" +.It "\&166\ ``v''" Ta "167\ ``w''" Ta "170\ ``x''" Ta "171\ ``y''" Ta "172\ ``z''" +.It "\&173\ ``{''" Ta "174\ ``|''" Ta "175\ ``}''" Ta "176\ ``~''" Ta \& +.El +.Pp +The +.Fn isgraph_l +function takes an explicit locale argument, whereas the +.Fn isgraph +function uses the current global or per-thread locale. +.Sh RETURN VALUES +The +.Fn isgraph +and +.Fn isgraph_l +functions return zero if the character tests false and +return non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswgraph +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswgraph 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn isgraph +function conforms to +.St -isoC . +The +.Fn isgraph_l +function conforms to +.St -p1003.1-2008 . diff --git a/lib/libc/locale/isideogram.3 b/lib/libc/locale/isideogram.3 new file mode 100644 index 0000000000000..cbaa625e91dff --- /dev/null +++ b/lib/libc/locale/isideogram.3 @@ -0,0 +1,57 @@ +.\" +.\" Copyright (c) 2004 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd March 30, 2004 +.Dt ISIDEOGRAM 3 +.Os +.Sh NAME +.Nm isideogram +.Nd ideographic character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isideogram "int c" +.Sh DESCRIPTION +The +.Fn isideogram +function tests for an ideographic character. +.Sh RETURN VALUES +The +.Fn isideogram +function returns zero if the character tests false and +returns non-zero if the character tests true. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr isphonogram 3 , +.Xr iswideogram 3 +.Sh HISTORY +The +.Fn isideogram +function appeared in +.Bx 4.4 . diff --git a/lib/libc/locale/islower.3 b/lib/libc/locale/islower.3 new file mode 100644 index 0000000000000..7f31ad9c7c5ac --- /dev/null +++ b/lib/libc/locale/islower.3 @@ -0,0 +1,103 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)islower.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 30, 2012 +.Dt ISLOWER 3 +.Os +.Sh NAME +.Nm islower +.Nd lower-case character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn islower "int c" +.Ft int +.Fn islower_l "int c" "locale_t loc" +.Sh DESCRIPTION +The +.Fn islower +function tests for any lower-case letters. +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Pp +In the ASCII character set, this includes the following characters +(with their numeric values shown in octal): +.Bl -column \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ +.It "\&141\ ``a''" Ta "142\ ``b''" Ta "143\ ``c''" Ta "144\ ``d''" Ta "145\ ``e''" +.It "\&146\ ``f''" Ta "147\ ``g''" Ta "150\ ``h''" Ta "151\ ``i''" Ta "152\ ``j''" +.It "\&153\ ``k''" Ta "154\ ``l''" Ta "155\ ``m''" Ta "156\ ``n''" Ta "157\ ``o''" +.It "\&160\ ``p''" Ta "161\ ``q''" Ta "162\ ``r''" Ta "163\ ``s''" Ta "164\ ``t''" +.It "\&165\ ``u''" Ta "166\ ``v''" Ta "167\ ``w''" Ta "170\ ``x''" Ta "171\ ``y''" +.It "\&172\ ``z''" Ta \& Ta \& Ta \& Ta \& +.El +The +.Fn islower_l +function takes an explicit locale argument, whereas the +.Fn islower +function uses the current global or per-thread locale. +.Sh RETURN VALUES +The +.Fn islower +and +.Fn islower_l +functions return zero if the character tests false and +return non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswlower +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswlower 3 , +.Xr tolower 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn islower +function conforms to +.St -isoC . +The +.Fn islower_l +function conforms to +.St -p1003.1-2008 . diff --git a/lib/libc/locale/isphonogram.3 b/lib/libc/locale/isphonogram.3 new file mode 100644 index 0000000000000..b0d82c428cba3 --- /dev/null +++ b/lib/libc/locale/isphonogram.3 @@ -0,0 +1,57 @@ +.\" +.\" Copyright (c) 2004 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd March 30, 2004 +.Dt ISPHONOGRAM 3 +.Os +.Sh NAME +.Nm isphonogram +.Nd phonographic character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isphonogram "int c" +.Sh DESCRIPTION +The +.Fn isphonogram +function tests for a phonographic character. +.Sh RETURN VALUES +The +.Fn isphonogram +function returns zero if the character tests false and +returns non-zero if the character tests true. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr isideogram 3 , +.Xr iswphonogram 3 +.Sh HISTORY +The +.Fn isphonogram +function appeared in +.Bx 4.4 . diff --git a/lib/libc/locale/isprint.3 b/lib/libc/locale/isprint.3 new file mode 100644 index 0000000000000..4a374304dad2f --- /dev/null +++ b/lib/libc/locale/isprint.3 @@ -0,0 +1,103 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)isprint.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 17, 2005 +.Dt ISPRINT 3 +.Os +.Sh NAME +.Nm isprint +.Nd printing character test (space character inclusive) +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isprint "int c" +.Sh DESCRIPTION +The +.Fn isprint +function tests for any printing character, including space +.Pq Ql "\ " . +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Pp +In the ASCII character set, this includes the following characters +(with their numeric values shown in octal): +.Bl -column \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ +.It "\&040\ sp" Ta "041\ ``!''" Ta "042\ ``""''" Ta "043\ ``#''" Ta "044\ ``$''" +.It "\&045\ ``%''" Ta "046\ ``&''" Ta "047\ ``'''" Ta "050\ ``(''" Ta "051\ ``)''" +.It "\&052\ ``*''" Ta "053\ ``+''" Ta "054\ ``,''" Ta "055\ ``-''" Ta "056\ ``.''" +.It "\&057\ ``/''" Ta "060\ ``0''" Ta "061\ ``1''" Ta "062\ ``2''" Ta "063\ ``3''" +.It "\&064\ ``4''" Ta "065\ ``5''" Ta "066\ ``6''" Ta "067\ ``7''" Ta "070\ ``8''" +.It "\&071\ ``9''" Ta "072\ ``:''" Ta "073\ ``;''" Ta "074\ ``<''" Ta "075\ ``=''" +.It "\&076\ ``>''" Ta "077\ ``?''" Ta "100\ ``@''" Ta "101\ ``A''" Ta "102\ ``B''" +.It "\&103\ ``C''" Ta "104\ ``D''" Ta "105\ ``E''" Ta "106\ ``F''" Ta "107\ ``G''" +.It "\&110\ ``H''" Ta "111\ ``I''" Ta "112\ ``J''" Ta "113\ ``K''" Ta "114\ ``L''" +.It "\&115\ ``M''" Ta "116\ ``N''" Ta "117\ ``O''" Ta "120\ ``P''" Ta "121\ ``Q''" +.It "\&122\ ``R''" Ta "123\ ``S''" Ta "124\ ``T''" Ta "125\ ``U''" Ta "126\ ``V''" +.It "\&127\ ``W''" Ta "130\ ``X''" Ta "131\ ``Y''" Ta "132\ ``Z''" Ta "133\ ``[''" +.It "\&134\ ``\e\|''" Ta "135\ ``]''" Ta "136\ ``^''" Ta "137\ ``_''" Ta "140\ ```''" +.It "\&141\ ``a''" Ta "142\ ``b''" Ta "143\ ``c''" Ta "144\ ``d''" Ta "145\ ``e''" +.It "\&146\ ``f''" Ta "147\ ``g''" Ta "150\ ``h''" Ta "151\ ``i''" Ta "152\ ``j''" +.It "\&153\ ``k''" Ta "154\ ``l''" Ta "155\ ``m''" Ta "156\ ``n''" Ta "157\ ``o''" +.It "\&160\ ``p''" Ta "161\ ``q''" Ta "162\ ``r''" Ta "163\ ``s''" Ta "164\ ``t''" +.It "\&165\ ``u''" Ta "166\ ``v''" Ta "167\ ``w''" Ta "170\ ``x''" Ta "171\ ``y''" +.It "\&172\ ``z''" Ta "173\ ``{''" Ta "174\ ``|''" Ta "175\ ``}''" Ta "176\ ``~''" +.El +.Sh RETURN VALUES +The +.Fn isprint +function returns zero if the character tests false and +returns non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswprint +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswprint 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn isprint +function conforms to +.St -isoC . diff --git a/lib/libc/locale/ispunct.3 b/lib/libc/locale/ispunct.3 new file mode 100644 index 0000000000000..7ebfc06c4c87d --- /dev/null +++ b/lib/libc/locale/ispunct.3 @@ -0,0 +1,109 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)ispunct.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 30, 2012 +.Dt ISPUNCT 3 +.Os +.Sh NAME +.Nm ispunct +.Nd punctuation character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn ispunct "int c" +.Ft int +.Fn ispunct_l "int c" "locale_t loc" +.Sh DESCRIPTION +The +.Fn ispunct +function tests for any printing character except for space +.Pq Ql "\ " +or a +character for which +.Xr isalnum 3 +is true. +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Pp +In the ASCII character set, this includes the following characters +(with their numeric values shown in octal): +.Bl -column \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ +.It "\&041\ ``!''" Ta "042\ ``""''" Ta "043\ ``#''" Ta "044\ ``$''" Ta "045\ ``%''" +.It "\&046\ ``&''" Ta "047\ ``'''" Ta "050\ ``(''" Ta "051\ ``)''" Ta "052\ ``*''" +.It "\&053\ ``+''" Ta "054\ ``,''" Ta "055\ ``-''" Ta "056\ ``.''" Ta "057\ ``/''" +.It "\&072\ ``:''" Ta "073\ ``;''" Ta "074\ ``<''" Ta "075\ ``=''" Ta "076\ ``>''" +.It "\&077\ ``?''" Ta "100\ ``@''" Ta "133\ ``[''" Ta "134\ ``\e\|''" Ta "135\ ``]''" +.It "\&136\ ``^''" Ta "137\ ``_''" Ta "140\ ```''" Ta "173\ ``{''" Ta "174\ ``|''" +.It "\&175\ ``}''" Ta "176\ ``~''" Ta \& Ta \& Ta \& +.El +.Pp +The +.Fn ispunct_l +function takes an explicit locale argument, whereas the +.Fn ispunct +function uses the current global or per-thread locale. +.Sh RETURN VALUES +The +.Fn ispunct +and +.Fn ispunct_l +functions return zero if the character tests false and +return non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswpunct +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswpunct 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn ispunct +function conforms to +.St -isoC . +The +.Fn ispunct_l +function conforms to +.St -p1003.1-2008 . diff --git a/lib/libc/locale/isrune.3 b/lib/libc/locale/isrune.3 new file mode 100644 index 0000000000000..424c367d20144 --- /dev/null +++ b/lib/libc/locale/isrune.3 @@ -0,0 +1,63 @@ +.\" +.\" Copyright (c) 2004 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd March 30, 2004 +.Dt ISRUNE 3 +.Os +.Sh NAME +.Nm isrune +.Nd valid character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isrune "int c" +.Sh DESCRIPTION +The +.Fn isrune +function tests for any character that is valid in the current +character set. +In the +.Tn ASCII +character set, this is equivalent to +.Fn isascii . +.Sh RETURN VALUES +The +.Fn isrune +function returns zero if the character tests false and +returns non-zero if the character tests true. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr isascii 3 , +.Xr iswrune 3 , +.Xr ascii 7 +.Sh HISTORY +The +.Fn isrune +function appeared in +.Bx 4.4 . diff --git a/lib/libc/locale/isspace.3 b/lib/libc/locale/isspace.3 new file mode 100644 index 0000000000000..9c120c41917ac --- /dev/null +++ b/lib/libc/locale/isspace.3 @@ -0,0 +1,101 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)isspace.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 30, 2012 +.Dt ISSPACE 3 +.Os +.Sh NAME +.Nm isspace +.Nd white-space character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isspace "int c" +.Ft int +.Fn isspace_l "int c" "locale_t loc" +.Sh DESCRIPTION +The +.Fn isspace +function tests for white-space characters. +For any locale, this includes the following standard characters: +.Bl -column \&`\et''___ \&``\et''___ \&``\et''___ \&``\et''___ \&``\et''___ \&``\et''___ +.It "\&``\et''" Ta "``\en''" Ta "``\ev''" Ta "``\ef''" Ta "``\er''" Ta "`` ''" +.El +.Pp +In the "C" locale, +.Fn isspace +returns non-zero for these characters only. +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Pp +The +.Fn isspace_l +function takes an explicit locale argument, whereas the +.Fn isspace +function uses the current global or per-thread locale. +.Sh RETURN VALUES +The +.Fn isspace +and +.Fn isspace_l +functions return zero if the character tests false and +return non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswspace +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswspace 3 , +.Xr multibyte 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn isspace +function conforms to +.St -isoC . +The +.Fn isspace_l +function conforms to +.St -p1003.1-2008 . diff --git a/lib/libc/locale/isspecial.3 b/lib/libc/locale/isspecial.3 new file mode 100644 index 0000000000000..de361d28f96ba --- /dev/null +++ b/lib/libc/locale/isspecial.3 @@ -0,0 +1,56 @@ +.\" +.\" Copyright (c) 2004 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd March 30, 2004 +.Dt ISSPECIAL 3 +.Os +.Sh NAME +.Nm isspecial +.Nd special character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isspecial "int c" +.Sh DESCRIPTION +The +.Fn isspecial +function tests for a special character. +.Sh RETURN VALUES +The +.Fn isspecial +function returns zero if the character tests false and +returns non-zero if the character tests true. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswspecial 3 +.Sh HISTORY +The +.Fn isspecial +function appeared in +.Bx 4.4 . diff --git a/lib/libc/locale/isupper.3 b/lib/libc/locale/isupper.3 new file mode 100644 index 0000000000000..891ba2cffa406 --- /dev/null +++ b/lib/libc/locale/isupper.3 @@ -0,0 +1,90 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)isupper.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 17, 2005 +.Dt ISUPPER 3 +.Os +.Sh NAME +.Nm isupper +.Nd upper-case character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isupper "int c" +.Sh DESCRIPTION +The +.Fn isupper +function tests for any upper-case letter. +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Pp +In the ASCII character set, this includes the following characters +(with their numeric values shown in octal): +.Bl -column \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ \&000_``0''__ +.It "\&101\ ``A''" Ta "102\ ``B''" Ta "103\ ``C''" Ta "104\ ``D''" Ta "105\ ``E''" +.It "\&106\ ``F''" Ta "107\ ``G''" Ta "110\ ``H''" Ta "111\ ``I''" Ta "112\ ``J''" +.It "\&113\ ``K''" Ta "114\ ``L''" Ta "115\ ``M''" Ta "116\ ``N''" Ta "117\ ``O''" +.It "\&120\ ``P''" Ta "121\ ``Q''" Ta "122\ ``R''" Ta "123\ ``S''" Ta "124\ ``T''" +.It "\&125\ ``U''" Ta "126\ ``V''" Ta "127\ ``W''" Ta "130\ ``X''" Ta "131\ ``Y''" +.It "\&132\ ``Z''" Ta \& Ta \& Ta \& Ta \& +.El +.Sh RETURN VALUES +The +.Fn isupper +function returns zero if the character tests false and +returns non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswupper +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswupper 3 , +.Xr toupper 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn isupper +function conforms to +.St -isoC . diff --git a/lib/libc/locale/iswalnum.3 b/lib/libc/locale/iswalnum.3 new file mode 100644 index 0000000000000..9004b34ea65eb --- /dev/null +++ b/lib/libc/locale/iswalnum.3 @@ -0,0 +1,158 @@ +.\" $NetBSD: iswalnum.3,v 1.5 2002/07/10 14:46:10 yamt Exp $ +.\" +.\" Copyright (c) 1991 The Regents of the University of California. +.\" All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)isalnum.3 5.2 (Berkeley) 6/29/91 +.\" $FreeBSD$ +.\" +.Dd October 3, 2002 +.Dt ISWALNUM 3 +.Os +.Sh NAME +.Nm iswalnum , +.Nm iswalpha , +.Nm iswascii , +.Nm iswblank , +.Nm iswcntrl , +.Nm iswdigit , +.Nm iswgraph , +.Nm iswhexnumber , +.Nm iswideogram , +.Nm iswlower , +.Nm iswnumber , +.Nm iswphonogram , +.Nm iswprint , +.Nm iswpunct , +.Nm iswrune , +.Nm iswspace , +.Nm iswspecial , +.Nm iswupper , +.Nm iswxdigit +.Nd wide character classification utilities +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wctype.h +.Ft int +.Fn iswalnum "wint_t wc" +.Ft int +.Fn iswalpha "wint_t wc" +.Ft int +.Fn iswascii "wint_t wc" +.Ft int +.Fn iswblank "wint_t wc" +.Ft int +.Fn iswcntrl "wint_t wc" +.Ft int +.Fn iswdigit "wint_t wc" +.Ft int +.Fn iswgraph "wint_t wc" +.Ft int +.Fn iswhexnumber "wint_t wc" +.Ft int +.Fn iswideogram "wint_t wc" +.Ft int +.Fn iswlower "wint_t wc" +.Ft int +.Fn iswnumber "wint_t wc" +.Ft int +.Fn iswphonogram "wint_t wc" +.Ft int +.Fn iswprint "wint_t wc" +.Ft int +.Fn iswpunct "wint_t wc" +.Ft int +.Fn iswrune "wint_t wc" +.Ft int +.Fn iswspace "wint_t wc" +.Ft int +.Fn iswspecial "wint_t wc" +.Ft int +.Fn iswupper "wint_t wc" +.Ft int +.Fn iswxdigit "wint_t wc" +.Sh DESCRIPTION +The above functions are character classification utility functions, +for use with wide characters +.Vt ( wchar_t +or +.Vt wint_t ) . +See the description for the similarly-named single byte classification +functions (like +.Xr isalnum 3 ) , +for details. +.Sh RETURN VALUES +The functions return zero if the character tests false and +return non-zero if the character tests true. +.Sh SEE ALSO +.Xr isalnum 3 , +.Xr isalpha 3 , +.Xr isascii 3 , +.Xr isblank 3 , +.Xr iscntrl 3 , +.Xr isdigit 3 , +.Xr isgraph 3 , +.Xr ishexnumber 3 , +.Xr isideogram 3 , +.Xr islower 3 , +.Xr isnumber 3 , +.Xr isphonogram 3 , +.Xr isprint 3 , +.Xr ispunct 3 , +.Xr isrune 3 , +.Xr isspace 3 , +.Xr isspecial 3 , +.Xr isupper 3 , +.Xr isxdigit 3 , +.Xr wctype 3 +.Sh STANDARDS +These functions conform to +.St -p1003.1-2001 , +except +.Fn iswascii , +.Fn iswhexnumber , +.Fn iswideogram , +.Fn iswnumber , +.Fn iswphonogram , +.Fn iswrune +and +.Fn iswspecial , +which are +.Fx +extensions. +.Sh CAVEATS +The result of these functions is undefined unless +the argument is +.Dv WEOF +or a valid +.Vt wchar_t +value for the current locale. diff --git a/lib/libc/locale/iswalnum_l.3 b/lib/libc/locale/iswalnum_l.3 new file mode 100644 index 0000000000000..21ee48fa7b5b1 --- /dev/null +++ b/lib/libc/locale/iswalnum_l.3 @@ -0,0 +1,168 @@ +.\" Copyright (c) 2012 Isabell Long <issyl0@FreeBSD.org> +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd July 25, 2012 +.Dt ISWALNUM_L 3 +.Os +.Sh NAME +.Nm iswalnum_l , +.Nm iswalpha_l , +.Nm iswcntrl_l , +.Nm iswctype_l , +.Nm iswdigit_l , +.Nm iswgraph_l , +.Nm iswlower_l , +.Nm iswprint_l , +.Nm iswpunct_l , +.Nm iswspace_l , +.Nm iswupper_l , +.Nm iswxdigit_l , +.Nm towlower_l , +.Nm towupper_l , +.Nm wctype_l , +.Nm iswblank_l , +.Nm iswhexnumber_l , +.Nm iswideogram_l , +.Nm iswnumber_l , +.Nm iswphonogram_l , +.Nm iswrune_l , +.Nm iswspecial_l , +.Nm nextwctype_l , +.Nm towctrans_l , +.Nm wctrans_l +.Nd wide character classification utilities +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wctype.h +.Ft int +.Fn iswalnum_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswalpha_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswcntrl_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswctype_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswdigit_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswgraph_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswlower_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswprint_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswpunct_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswspace_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswupper_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswxdigit_l "wint_t wc" "locale_t loc" +.Ft wint_t +.Fn towlower_l "wint_t wc" "locale_t loc" +.Ft wint_t +.Fn towupper_l "wint_t wc" "locale_t loc" +.Ft wctype_t +.Fn wctype_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswblank_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswhexnumber_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswideogram_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswnumber_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswphonogram_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswrune_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswspecial_l "wint_t wc" "locale_t loc" +.Ft wint_t +.Fn nextwctype_l "wint_t wc" "locale_t loc" +.Ft wint_t +.Fn towctrans_l "wint_t wc" "wctrans_t" "locale_t loc" +.Ft wctrans_t +.Fn wctrans_l "const char *" "locale_t loc" +.Sh DESCRIPTION +The above functions are character classification utility functions, +for use with wide characters +.Vt ( wchar_t +or +.Vt wint_t ) +in the locale +.Fa loc . +They behave in the same way as the versions without the _l suffix, but use +the specified locale rather than the global or per-thread locale. +These functions may be implemented as inline functions in +.In wctype.h +and as functions in the C library. +See the specific manual pages for more information. +.Sh RETURN VALUES +These functions return the same things as their non-locale versions. +If the locale is invalid, their behaviors are undefined. +.Sh SEE ALSO +.Xr iswalnum 3 , +.Xr iswalpha 3 , +.Xr iswblank 3 , +.Xr iswcntrl 3 , +.Xr iswctype 3 , +.Xr iswdigit 3 , +.Xr iswgraph 3 , +.Xr iswhexnumber 3 , +.Xr iswideogram 3 , +.Xr iswlower 3 , +.Xr iswnumber 3 , +.Xr iswphonogram 3 , +.Xr iswprint 3 , +.Xr iswpunct 3 , +.Xr iswrune 3 , +.Xr iswspace 3 , +.Xr iswspecial 3 , +.Xr iswupper 3 , +.Xr iswxdigit 3 , +.Xr nextwctype 3 , +.Xr towctrans 3 , +.Xr towlower 3 , +.Xr towupper 3 , +.Xr wctrans 3 , +.Xr wctype 3 +.Sh STANDARDS +These functions conform to +.St -p1003.1-2008 , +except for +.Fn iswascii_l , +.Fn iswhexnumber_l , +.Fn iswideogram_l , +.Fn iswphonogram_l , +.Fn iswrune_l , +.Fn iswspecial_l +and +.Fn nextwctype_l +which are +.Fx +extensions. diff --git a/lib/libc/locale/iswctype.c b/lib/libc/locale/iswctype.c new file mode 100644 index 0000000000000..251e98c2b9c76 --- /dev/null +++ b/lib/libc/locale/iswctype.c @@ -0,0 +1,191 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1989, 1993 + * The Regents of the University of California. All rights reserved. + * (c) UNIX System Laboratories, Inc. + * All or some portions of this file are derived from material licensed + * to the University of California by American Telephone and Telegraph + * Co. or Unix System Laboratories, Inc. and are reproduced herein with + * the permission of UNIX System Laboratories, Inc. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <wctype.h> + +#undef iswalnum +int +iswalnum(wint_t wc) +{ + return (__istype(wc, _CTYPE_A|_CTYPE_N)); +} + +#undef iswalpha +int +iswalpha(wint_t wc) +{ + return (__istype(wc, _CTYPE_A)); +} + +#undef iswascii +int +iswascii(wint_t wc) +{ + return ((wc & ~0x7F) == 0); +} + +#undef iswblank +int +iswblank(wint_t wc) +{ + return (__istype(wc, _CTYPE_B)); +} + +#undef iswcntrl +int +iswcntrl(wint_t wc) +{ + return (__istype(wc, _CTYPE_C)); +} + +#undef iswdigit +int +iswdigit(wint_t wc) +{ + return (__istype(wc, _CTYPE_D)); +} + +#undef iswgraph +int +iswgraph(wint_t wc) +{ + return (__istype(wc, _CTYPE_G)); +} + +#undef iswhexnumber +int +iswhexnumber(wint_t wc) +{ + return (__istype(wc, _CTYPE_X)); +} + +#undef iswideogram +int +iswideogram(wint_t wc) +{ + return (__istype(wc, _CTYPE_I)); +} + +#undef iswlower +int +iswlower(wint_t wc) +{ + return (__istype(wc, _CTYPE_L)); +} + +#undef iswnumber +int +iswnumber(wint_t wc) +{ + return (__istype(wc, _CTYPE_N)); +} + +#undef iswphonogram +int +iswphonogram(wint_t wc) +{ + return (__istype(wc, _CTYPE_Q)); +} + +#undef iswprint +int +iswprint(wint_t wc) +{ + return (__istype(wc, _CTYPE_R)); +} + +#undef iswpunct +int +iswpunct(wint_t wc) +{ + return (__istype(wc, _CTYPE_P)); +} + +#undef iswrune +int +iswrune(wint_t wc) +{ + return (__istype(wc, 0xFFFFFF00L)); +} + +#undef iswspace +int +iswspace(wint_t wc) +{ + return (__istype(wc, _CTYPE_S)); +} + +#undef iswspecial +int +iswspecial(wint_t wc) +{ + return (__istype(wc, _CTYPE_T)); +} + +#undef iswupper +int +iswupper(wint_t wc) +{ + return (__istype(wc, _CTYPE_U)); +} + +#undef iswxdigit +int +iswxdigit(wint_t wc) +{ + return (__istype(wc, _CTYPE_X)); +} + +#undef towlower +wint_t +towlower(wint_t wc) +{ + return (__tolower(wc)); +} + +#undef towupper +wint_t +towupper(wint_t wc) +{ + return (__toupper(wc)); +} + diff --git a/lib/libc/locale/isxdigit.3 b/lib/libc/locale/isxdigit.3 new file mode 100644 index 0000000000000..7e065a4e255a5 --- /dev/null +++ b/lib/libc/locale/isxdigit.3 @@ -0,0 +1,102 @@ +.\" Copyright (c) 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)isxdigit.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 17, 2005 +.Dt ISXDIGIT 3 +.Os +.Sh NAME +.Nm isxdigit , +.Nm ishexnumber +.Nd hexadecimal-digit character test +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn isxdigit "int c" +.Ft int +.Fn ishexnumber "int c" +.Sh DESCRIPTION +The +.Fn isxdigit +function tests for any hexadecimal-digit character. +Regardless of locale, this includes the following characters only: +.Bl -column \&``0''______ \&``0''______ \&``0''______ \&``0''______ \&``0''______ +.It "\&``0''" Ta "``1''" Ta "``2''" Ta "``3''" Ta "``4''" +.It "\&``5''" Ta "``6''" Ta "``7''" Ta "``8''" Ta "``9''" +.It "\&``A''" Ta "``B''" Ta "``C''" Ta "``D''" Ta "``E''" +.It "\&``F''" Ta "``a''" Ta "``b''" Ta "``c''" Ta "``d''" +.It "\&``e''" Ta "``f''" Ta \& Ta \& Ta \& +.El +.Pp +The +.Fn ishexnumber +function behaves similarly to +.Fn isxdigit , +but may recognize additional characters, +depending on the current locale setting. +.Pp +The value of the argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Sh RETURN VALUES +The +.Fn isxdigit +function returns zero if the character tests false and +returns non-zero if the character tests true. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn iswxdigit +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr iswxdigit 3 , +.Xr ascii 7 +.Sh STANDARDS +The +.Fn isxdigit +function conforms to +.St -isoC . +.Sh HISTORY +The +.Fn ishexnumber +function appeared in +.Bx 4.4 . diff --git a/lib/libc/locale/ldpart.c b/lib/libc/locale/ldpart.c new file mode 100644 index 0000000000000..ed794337e1157 --- /dev/null +++ b/lib/libc/locale/ldpart.c @@ -0,0 +1,168 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2000, 2001 Alexey Zelkin <phantom@FreeBSD.org> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include "namespace.h" +#include <sys/types.h> +#include <sys/stat.h> + +#include <errno.h> +#include <fcntl.h> +#include <limits.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include "un-namespace.h" + +#include "ldpart.h" +#include "setlocale.h" + +static int split_lines(char *, const char *); + +int +__part_load_locale(const char *name, + int *using_locale, + char **locale_buf, + const char *category_filename, + int locale_buf_size_max, + int locale_buf_size_min, + const char **dst_localebuf) +{ + int saverr, fd, i, num_lines; + char *lbuf, *p; + const char *plim; + char filename[PATH_MAX]; + struct stat st; + size_t namesize, bufsize; + + /* 'name' must be already checked. */ + if (strcmp(name, "C") == 0 || strcmp(name, "POSIX") == 0) { + *using_locale = 0; + return (_LDP_CACHE); + } + + /* + * If the locale name is the same as our cache, use the cache. + */ + if (*locale_buf != NULL && strcmp(name, *locale_buf) == 0) { + *using_locale = 1; + return (_LDP_CACHE); + } + + /* + * Slurp the locale file into the cache. + */ + namesize = strlen(name) + 1; + + /* 'PathLocale' must be already set & checked. */ + + /* Range checking not needed, 'name' size is limited */ + strcpy(filename, _PathLocale); + strcat(filename, "/"); + strcat(filename, name); + strcat(filename, "/"); + strcat(filename, category_filename); + if ((fd = _open(filename, O_RDONLY | O_CLOEXEC)) < 0) + return (_LDP_ERROR); + if (_fstat(fd, &st) != 0) + goto bad_locale; + if (st.st_size <= 0) { + errno = EFTYPE; + goto bad_locale; + } + bufsize = namesize + st.st_size; + if ((lbuf = malloc(bufsize)) == NULL) { + errno = ENOMEM; + goto bad_locale; + } + (void)strcpy(lbuf, name); + p = lbuf + namesize; + plim = p + st.st_size; + if (_read(fd, p, (size_t) st.st_size) != st.st_size) + goto bad_lbuf; + /* + * Parse the locale file into localebuf. + */ + if (plim[-1] != '\n') { + errno = EFTYPE; + goto bad_lbuf; + } + num_lines = split_lines(p, plim); + if (num_lines >= locale_buf_size_max) + num_lines = locale_buf_size_max; + else if (num_lines >= locale_buf_size_min) + num_lines = locale_buf_size_min; + else { + errno = EFTYPE; + goto bad_lbuf; + } + (void)_close(fd); + /* + * Record the successful parse in the cache. + */ + if (*locale_buf != NULL) + free(*locale_buf); + *locale_buf = lbuf; + for (p = *locale_buf, i = 0; i < num_lines; i++) + dst_localebuf[i] = (p += strlen(p) + 1); + for (i = num_lines; i < locale_buf_size_max; i++) + dst_localebuf[i] = NULL; + *using_locale = 1; + + return (_LDP_LOADED); + +bad_lbuf: + saverr = errno; + free(lbuf); + errno = saverr; +bad_locale: + saverr = errno; + (void)_close(fd); + errno = saverr; + + return (_LDP_ERROR); +} + +static int +split_lines(char *p, const char *plim) +{ + int i; + + i = 0; + while (p < plim) { + if (*p == '\n') { + *p = '\0'; + i++; + } + p++; + } + return (i); +} + diff --git a/lib/libc/locale/ldpart.h b/lib/libc/locale/ldpart.h new file mode 100644 index 0000000000000..8985b324ae031 --- /dev/null +++ b/lib/libc/locale/ldpart.h @@ -0,0 +1,41 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2000, 2001 Alexey Zelkin <phantom@FreeBSD.org> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _LDPART_H_ +#define _LDPART_H_ + +#define _LDP_LOADED 0 +#define _LDP_ERROR (-1) +#define _LDP_CACHE 1 + +int __part_load_locale(const char *, int*, char **, const char *, + int, int, const char **); + +#endif /* !_LDPART_H_ */ diff --git a/lib/libc/locale/lmessages.c b/lib/libc/locale/lmessages.c new file mode 100644 index 0000000000000..9c442dc3d7a11 --- /dev/null +++ b/lib/libc/locale/lmessages.c @@ -0,0 +1,128 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2001 Alexey Zelkin <phantom@FreeBSD.org> + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <stddef.h> + +#include "ldpart.h" +#include "lmessages.h" + +#define LCMESSAGES_SIZE_FULL (sizeof(struct lc_messages_T) / sizeof(char *)) +#define LCMESSAGES_SIZE_MIN \ + (offsetof(struct lc_messages_T, yesstr) / sizeof(char *)) + +struct xlocale_messages { + struct xlocale_component header; + char *buffer; + struct lc_messages_T locale; +}; + +struct xlocale_messages __xlocale_global_messages; + +static char empty[] = ""; + +static const struct lc_messages_T _C_messages_locale = { + "^[yY]" , /* yesexpr */ + "^[nN]" , /* noexpr */ + "yes" , /* yesstr */ + "no" /* nostr */ +}; + +static void destruct_messages(void *v) +{ + struct xlocale_messages *l = v; + if (l->buffer) + free(l->buffer); + free(l); +} + +static int +messages_load_locale(struct xlocale_messages *loc, int *using_locale, const char *name) +{ + int ret; + struct lc_messages_T *l = &loc->locale; + + ret = __part_load_locale(name, using_locale, + &loc->buffer, "LC_MESSAGES", + LCMESSAGES_SIZE_FULL, LCMESSAGES_SIZE_MIN, + (const char **)l); + if (ret == _LDP_LOADED) { + if (l->yesstr == NULL) + l->yesstr = empty; + if (l->nostr == NULL) + l->nostr = empty; + } + return (ret); +} +int +__messages_load_locale(const char *name) +{ + return messages_load_locale(&__xlocale_global_messages, + &__xlocale_global_locale.using_messages_locale, name); +} +void * +__messages_load(const char *name, locale_t l) +{ + struct xlocale_messages *new = calloc(sizeof(struct xlocale_messages), 1); + new->header.header.destructor = destruct_messages; + if (messages_load_locale(new, &l->using_messages_locale, name) == _LDP_ERROR) { + xlocale_release(new); + return NULL; + } + return new; +} + +struct lc_messages_T * +__get_current_messages_locale(locale_t loc) +{ + return (loc->using_messages_locale + ? &((struct xlocale_messages *)loc->components[XLC_MESSAGES])->locale + : (struct lc_messages_T *)&_C_messages_locale); +} + +#ifdef LOCALE_DEBUG +void +msgdebug() { +printf( "yesexpr = %s\n" + "noexpr = %s\n" + "yesstr = %s\n" + "nostr = %s\n", + _messages_locale.yesexpr, + _messages_locale.noexpr, + _messages_locale.yesstr, + _messages_locale.nostr +); +} +#endif /* LOCALE_DEBUG */ diff --git a/lib/libc/locale/lmessages.h b/lib/libc/locale/lmessages.h new file mode 100644 index 0000000000000..0e8278f4a6501 --- /dev/null +++ b/lib/libc/locale/lmessages.h @@ -0,0 +1,51 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2000, 2001 Alexey Zelkin <phantom@FreeBSD.org> + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _LMESSAGES_H_ +#define _LMESSAGES_H_ + +#include "xlocale_private.h" + +struct lc_messages_T { + const char *yesexpr; + const char *noexpr; + const char *yesstr; + const char *nostr; +}; + +struct lc_messages_T *__get_current_messages_locale(locale_t); +int __messages_load_locale(const char *); + +#endif /* !_LMESSAGES_H_ */ diff --git a/lib/libc/locale/lmonetary.c b/lib/libc/locale/lmonetary.c new file mode 100644 index 0000000000000..99800ae699226 --- /dev/null +++ b/lib/libc/locale/lmonetary.c @@ -0,0 +1,227 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2000, 2001 Alexey Zelkin <phantom@FreeBSD.org> + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <limits.h> +#include <stddef.h> +#include <stdlib.h> + +#include "ldpart.h" +#include "lmonetary.h" + +extern const char * __fix_locale_grouping_str(const char *); + +#define LCMONETARY_SIZE_FULL (sizeof(struct lc_monetary_T) / sizeof(char *)) +#define LCMONETARY_SIZE_MIN \ + (offsetof(struct lc_monetary_T, int_p_cs_precedes) / \ + sizeof(char *)) + +static char empty[] = ""; +static char numempty[] = { CHAR_MAX, '\0'}; + +static const struct lc_monetary_T _C_monetary_locale = { + empty, /* int_curr_symbol */ + empty, /* currency_symbol */ + empty, /* mon_decimal_point */ + empty, /* mon_thousands_sep */ + numempty, /* mon_grouping */ + empty, /* positive_sign */ + empty, /* negative_sign */ + numempty, /* int_frac_digits */ + numempty, /* frac_digits */ + numempty, /* p_cs_precedes */ + numempty, /* p_sep_by_space */ + numempty, /* n_cs_precedes */ + numempty, /* n_sep_by_space */ + numempty, /* p_sign_posn */ + numempty, /* n_sign_posn */ + numempty, /* int_p_cs_precedes */ + numempty, /* int_n_cs_precedes */ + numempty, /* int_p_sep_by_space */ + numempty, /* int_n_sep_by_space */ + numempty, /* int_p_sign_posn */ + numempty /* int_n_sign_posn */ +}; + +struct xlocale_monetary __xlocale_global_monetary; + +static char +cnv(const char *str) +{ + int i = strtol(str, NULL, 10); + + if (i == -1) + i = CHAR_MAX; + return ((char)i); +} + +static void +destruct_monetary(void *v) +{ + struct xlocale_monetary *l = v; + if (l->buffer) + free(l->buffer); + free(l); +} + +static int +monetary_load_locale_l(struct xlocale_monetary *loc, int *using_locale, + int *changed, const char *name) +{ + int ret; + struct lc_monetary_T *l = &loc->locale; + + ret = __part_load_locale(name, using_locale, + &loc->buffer, "LC_MONETARY", + LCMONETARY_SIZE_FULL, LCMONETARY_SIZE_MIN, + (const char **)l); + if (ret != _LDP_ERROR) + *changed = 1; + if (ret == _LDP_LOADED) { + l->mon_grouping = + __fix_locale_grouping_str(l->mon_grouping); + +#define M_ASSIGN_CHAR(NAME) (((char *)l->NAME)[0] = \ + cnv(l->NAME)) + + M_ASSIGN_CHAR(int_frac_digits); + M_ASSIGN_CHAR(frac_digits); + M_ASSIGN_CHAR(p_cs_precedes); + M_ASSIGN_CHAR(p_sep_by_space); + M_ASSIGN_CHAR(n_cs_precedes); + M_ASSIGN_CHAR(n_sep_by_space); + M_ASSIGN_CHAR(p_sign_posn); + M_ASSIGN_CHAR(n_sign_posn); + + /* + * The six additional C99 international monetary formatting + * parameters default to the national parameters when + * reading FreeBSD LC_MONETARY data files. + */ +#define M_ASSIGN_ICHAR(NAME) \ + do { \ + if (l->int_##NAME == NULL) \ + l->int_##NAME = \ + l->NAME; \ + else \ + M_ASSIGN_CHAR(int_##NAME); \ + } while (0) + + M_ASSIGN_ICHAR(p_cs_precedes); + M_ASSIGN_ICHAR(n_cs_precedes); + M_ASSIGN_ICHAR(p_sep_by_space); + M_ASSIGN_ICHAR(n_sep_by_space); + M_ASSIGN_ICHAR(p_sign_posn); + M_ASSIGN_ICHAR(n_sign_posn); + } + return (ret); +} +int +__monetary_load_locale(const char *name) +{ + return monetary_load_locale_l(&__xlocale_global_monetary, + &__xlocale_global_locale.using_monetary_locale, + &__xlocale_global_locale.monetary_locale_changed, name); +} +void* __monetary_load(const char *name, locale_t l) +{ + struct xlocale_monetary *new = calloc(sizeof(struct xlocale_monetary), 1); + new->header.header.destructor = destruct_monetary; + if (monetary_load_locale_l(new, &l->using_monetary_locale, + &l->monetary_locale_changed, name) == _LDP_ERROR) + { + xlocale_release(new); + return NULL; + } + return new; +} + + +struct lc_monetary_T * +__get_current_monetary_locale(locale_t loc) +{ + return (loc->using_monetary_locale + ? &((struct xlocale_monetary*)loc->components[XLC_MONETARY])->locale + : (struct lc_monetary_T *)&_C_monetary_locale); +} + +#ifdef LOCALE_DEBUG +void +monetdebug() { +printf( "int_curr_symbol = %s\n" + "currency_symbol = %s\n" + "mon_decimal_point = %s\n" + "mon_thousands_sep = %s\n" + "mon_grouping = %s\n" + "positive_sign = %s\n" + "negative_sign = %s\n" + "int_frac_digits = %d\n" + "frac_digits = %d\n" + "p_cs_precedes = %d\n" + "p_sep_by_space = %d\n" + "n_cs_precedes = %d\n" + "n_sep_by_space = %d\n" + "p_sign_posn = %d\n" + "n_sign_posn = %d\n" + "int_p_cs_precedes = %d\n" + "int_p_sep_by_space = %d\n" + "int_n_cs_precedes = %d\n" + "int_n_sep_by_space = %d\n" + "int_p_sign_posn = %d\n" + "int_n_sign_posn = %d\n", + _monetary_locale.int_curr_symbol, + _monetary_locale.currency_symbol, + _monetary_locale.mon_decimal_point, + _monetary_locale.mon_thousands_sep, + _monetary_locale.mon_grouping, + _monetary_locale.positive_sign, + _monetary_locale.negative_sign, + _monetary_locale.int_frac_digits[0], + _monetary_locale.frac_digits[0], + _monetary_locale.p_cs_precedes[0], + _monetary_locale.p_sep_by_space[0], + _monetary_locale.n_cs_precedes[0], + _monetary_locale.n_sep_by_space[0], + _monetary_locale.p_sign_posn[0], + _monetary_locale.n_sign_posn[0], + _monetary_locale.int_p_cs_precedes[0], + _monetary_locale.int_p_sep_by_space[0], + _monetary_locale.int_n_cs_precedes[0], + _monetary_locale.int_n_sep_by_space[0], + _monetary_locale.int_p_sign_posn[0], + _monetary_locale.int_n_sign_posn[0] +); +} +#endif /* LOCALE_DEBUG */ diff --git a/lib/libc/locale/lmonetary.h b/lib/libc/locale/lmonetary.h new file mode 100644 index 0000000000000..69facf9f88dd0 --- /dev/null +++ b/lib/libc/locale/lmonetary.h @@ -0,0 +1,72 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2000, 2001 Alexey Zelkin <phantom@FreeBSD.org> + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _LMONETARY_H_ +#define _LMONETARY_H_ +#include "xlocale_private.h" + +struct lc_monetary_T { + const char *int_curr_symbol; + const char *currency_symbol; + const char *mon_decimal_point; + const char *mon_thousands_sep; + const char *mon_grouping; + const char *positive_sign; + const char *negative_sign; + const char *int_frac_digits; + const char *frac_digits; + const char *p_cs_precedes; + const char *p_sep_by_space; + const char *n_cs_precedes; + const char *n_sep_by_space; + const char *p_sign_posn; + const char *n_sign_posn; + const char *int_p_cs_precedes; + const char *int_n_cs_precedes; + const char *int_p_sep_by_space; + const char *int_n_sep_by_space; + const char *int_p_sign_posn; + const char *int_n_sign_posn; +}; +struct xlocale_monetary { + struct xlocale_component header; + char *buffer; + struct lc_monetary_T locale; +}; + +struct lc_monetary_T *__get_current_monetary_locale(locale_t loc); +int __monetary_load_locale(const char *); + +#endif /* !_LMONETARY_H_ */ diff --git a/lib/libc/locale/lnumeric.c b/lib/libc/locale/lnumeric.c new file mode 100644 index 0000000000000..046d1f1817dc9 --- /dev/null +++ b/lib/libc/locale/lnumeric.c @@ -0,0 +1,129 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2000, 2001 Alexey Zelkin <phantom@FreeBSD.org> + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <limits.h> + +#include "ldpart.h" +#include "lnumeric.h" + +extern const char *__fix_locale_grouping_str(const char *); + +#define LCNUMERIC_SIZE (sizeof(struct lc_numeric_T) / sizeof(char *)) + +static char numempty[] = { CHAR_MAX, '\0' }; + +static const struct lc_numeric_T _C_numeric_locale = { + ".", /* decimal_point */ + "", /* thousands_sep */ + numempty /* grouping */ +}; + +static void +destruct_numeric(void *v) +{ + struct xlocale_numeric *l = v; + if (l->buffer) + free(l->buffer); + free(l); +} + +struct xlocale_numeric __xlocale_global_numeric; + +static int +numeric_load_locale(struct xlocale_numeric *loc, int *using_locale, int *changed, + const char *name) +{ + int ret; + struct lc_numeric_T *l = &loc->locale; + + ret = __part_load_locale(name, using_locale, + &loc->buffer, "LC_NUMERIC", + LCNUMERIC_SIZE, LCNUMERIC_SIZE, + (const char**)l); + if (ret != _LDP_ERROR) + *changed= 1; + if (ret == _LDP_LOADED) { + /* Can't be empty according to C99 */ + if (*l->decimal_point == '\0') + l->decimal_point = + _C_numeric_locale.decimal_point; + l->grouping = + __fix_locale_grouping_str(l->grouping); + } + return (ret); +} + +int +__numeric_load_locale(const char *name) +{ + return numeric_load_locale(&__xlocale_global_numeric, + &__xlocale_global_locale.using_numeric_locale, + &__xlocale_global_locale.numeric_locale_changed, name); +} +void * +__numeric_load(const char *name, locale_t l) +{ + struct xlocale_numeric *new = calloc(sizeof(struct xlocale_numeric), 1); + new->header.header.destructor = destruct_numeric; + if (numeric_load_locale(new, &l->using_numeric_locale, + &l->numeric_locale_changed, name) == _LDP_ERROR) + { + xlocale_release(new); + return NULL; + } + return new; +} + +struct lc_numeric_T * +__get_current_numeric_locale(locale_t loc) +{ + return (loc->using_numeric_locale + ? &((struct xlocale_numeric *)loc->components[XLC_NUMERIC])->locale + : (struct lc_numeric_T *)&_C_numeric_locale); +} + +#ifdef LOCALE_DEBUG +void +numericdebug(void) { +printf( "decimal_point = %s\n" + "thousands_sep = %s\n" + "grouping = %s\n", + _numeric_locale.decimal_point, + _numeric_locale.thousands_sep, + _numeric_locale.grouping +); +} +#endif /* LOCALE_DEBUG */ diff --git a/lib/libc/locale/lnumeric.h b/lib/libc/locale/lnumeric.h new file mode 100644 index 0000000000000..43ff161278629 --- /dev/null +++ b/lib/libc/locale/lnumeric.h @@ -0,0 +1,54 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2000, 2001 Alexey Zelkin <phantom@FreeBSD.org> + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _LNUMERIC_H_ +#define _LNUMERIC_H_ +#include "xlocale_private.h" + +struct lc_numeric_T { + const char *decimal_point; + const char *thousands_sep; + const char *grouping; +}; +struct xlocale_numeric { + struct xlocale_component header; + char *buffer; + struct lc_numeric_T locale; +}; + +struct lc_numeric_T *__get_current_numeric_locale(locale_t loc); +int __numeric_load_locale(const char *); + +#endif /* !_LNUMERIC_H_ */ diff --git a/lib/libc/locale/localeconv.3 b/lib/libc/locale/localeconv.3 new file mode 100644 index 0000000000000..6262bf8863466 --- /dev/null +++ b/lib/libc/locale/localeconv.3 @@ -0,0 +1,238 @@ +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" Donn Seeley at BSDI. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" From @(#)setlocale.3 8.1 (Berkeley) 6/9/93 +.\" From FreeBSD: src/lib/libc/locale/setlocale.3,v 1.28 2003/11/15 02:26:04 tjr Exp +.\" $FreeBSD$ +.\" +.Dd November 21, 2003 +.Dt LOCALECONV 3 +.Os +.Sh NAME +.Nm localeconv +.Nd natural language formatting for C +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In locale.h +.Ft struct lconv * +.Fn localeconv "void" +.In xlocale.h +.Ft struct lconv * +.Fn localeconv_l "locale_t locale" +.Sh DESCRIPTION +The +.Fn localeconv +function returns a pointer to a structure +which provides parameters for formatting numbers, +especially currency values: +.Bd -literal -offset indent +struct lconv { + char *decimal_point; + char *thousands_sep; + char *grouping; + char *int_curr_symbol; + char *currency_symbol; + char *mon_decimal_point; + char *mon_thousands_sep; + char *mon_grouping; + char *positive_sign; + char *negative_sign; + char int_frac_digits; + char frac_digits; + char p_cs_precedes; + char p_sep_by_space; + char n_cs_precedes; + char n_sep_by_space; + char p_sign_posn; + char n_sign_posn; + char int_p_cs_precedes; + char int_n_cs_precedes; + char int_p_sep_by_space; + char int_n_sep_by_space; + char int_p_sign_posn; + char int_n_sign_posn; +}; +.Ed +.Pp +The individual fields have the following meanings: +.Bl -tag -width mon_decimal_point +.It Va decimal_point +The decimal point character, except for currency values, +cannot be an empty string. +.It Va thousands_sep +The separator between groups of digits +before the decimal point, except for currency values. +.It Va grouping +The sizes of the groups of digits, except for currency values. +This is a pointer to a vector of integers, each of size +.Vt char , +representing group size from low order digit groups +to high order (right to left). +The list may be terminated with 0 or +.Dv CHAR_MAX . +If the list is terminated with 0, +the last group size before the 0 is repeated to account for all the digits. +If the list is terminated with +.Dv CHAR_MAX , +no more grouping is performed. +.It Va int_curr_symbol +The standardized international currency symbol. +.It Va currency_symbol +The local currency symbol. +.It Va mon_decimal_point +The decimal point character for currency values. +.It Va mon_thousands_sep +The separator for digit groups in currency values. +.It Va mon_grouping +Like +.Va grouping +but for currency values. +.It Va positive_sign +The character used to denote nonnegative currency values, +usually the empty string. +.It Va negative_sign +The character used to denote negative currency values, +usually a minus sign. +.It Va int_frac_digits +The number of digits after the decimal point +in an international-style currency value. +.It Va frac_digits +The number of digits after the decimal point +in the local style for currency values. +.It Va p_cs_precedes +1 if the currency symbol precedes the currency value +for nonnegative values, 0 if it follows. +.It Va p_sep_by_space +1 if a space is inserted between the currency symbol +and the currency value for nonnegative values, 0 otherwise. +.It Va n_cs_precedes +Like +.Va p_cs_precedes +but for negative values. +.It Va n_sep_by_space +Like +.Va p_sep_by_space +but for negative values. +.It Va p_sign_posn +The location of the +.Va positive_sign +with respect to a nonnegative quantity and the +.Va currency_symbol , +coded as follows: +.Pp +.Bl -tag -width 3n -compact +.It Li 0 +Parentheses around the entire string. +.It Li 1 +Before the string. +.It Li 2 +After the string. +.It Li 3 +Just before +.Va currency_symbol . +.It Li 4 +Just after +.Va currency_symbol . +.El +.It Va n_sign_posn +Like +.Va p_sign_posn +but for negative currency values. +.It Va int_p_cs_precedes +Same as +.Va p_cs_precedes , +but for internationally formatted monetary quantities. +.It Va int_n_cs_precedes +Same as +.Va n_cs_precedes , +but for internationally formatted monetary quantities. +.It Va int_p_sep_by_space +Same as +.Va p_sep_by_space , +but for internationally formatted monetary quantities. +.It Va int_n_sep_by_space +Same as +.Va n_sep_by_space , +but for internationally formatted monetary quantities. +.It Va int_p_sign_posn +Same as +.Va p_sign_posn , +but for internationally formatted monetary quantities. +.It Va int_n_sign_posn +Same as +.Va n_sign_posn , +but for internationally formatted monetary quantities. +.El +.Pp +Unless mentioned above, +an empty string as a value for a field +indicates a zero length result or +a value that is not in the current locale. +A +.Dv CHAR_MAX +result similarly denotes an unavailable value. +.Pp +The +.Fn localeconv_l +function takes an explicit locale parameter. +For more information, see +.Xr xlocale 3 . +.Sh RETURN VALUES +The +.Fn localeconv +function returns a pointer to a static object +which may be altered by later calls to +.Xr setlocale 3 +or +.Fn localeconv . +The return value for +.Fn localeconv_l +is stored with the locale. +It will remain valid until a subsequent call to +.Xr freelocale 3 . +If a thread-local locale is in effect then the return value from +.Fn localeconv +will remain valid until the locale is destroyed. +.Sh ERRORS +No errors are defined. +.Sh SEE ALSO +.Xr setlocale 3 , +.Xr strfmon 3 +.Sh STANDARDS +The +.Fn localeconv +function conforms to +.St -isoC-99 . +.Sh HISTORY +The +.Fn localeconv +function first appeared in +.Bx 4.4 . diff --git a/lib/libc/locale/localeconv.c b/lib/libc/locale/localeconv.c new file mode 100644 index 0000000000000..641773944e328 --- /dev/null +++ b/lib/libc/locale/localeconv.c @@ -0,0 +1,119 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 2001 Alexey Zelkin <phantom@FreeBSD.org> + * Copyright (c) 1991, 1993 + * The Regents of the University of California. All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "@(#)localeconv.c 8.1 (Berkeley) 6/4/93"; +#endif /* LIBC_SCCS and not lint */ +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <locale.h> + +#include "lmonetary.h" +#include "lnumeric.h" + +/* + * The localeconv() function constructs a struct lconv from the current + * monetary and numeric locales. + * + * Because localeconv() may be called many times (especially by library + * routines like printf() & strtod()), the approprate members of the + * lconv structure are computed only when the monetary or numeric + * locale has been changed. + */ + +/* + * Return the current locale conversion. + */ +struct lconv * +localeconv_l(locale_t loc) +{ + FIX_LOCALE(loc); + struct lconv *ret = &loc->lconv; + + if (loc->monetary_locale_changed) { + /* LC_MONETARY part */ + struct lc_monetary_T * mptr; + +#define M_ASSIGN_STR(NAME) (ret->NAME = (char*)mptr->NAME) +#define M_ASSIGN_CHAR(NAME) (ret->NAME = mptr->NAME[0]) + + mptr = __get_current_monetary_locale(loc); + M_ASSIGN_STR(int_curr_symbol); + M_ASSIGN_STR(currency_symbol); + M_ASSIGN_STR(mon_decimal_point); + M_ASSIGN_STR(mon_thousands_sep); + M_ASSIGN_STR(mon_grouping); + M_ASSIGN_STR(positive_sign); + M_ASSIGN_STR(negative_sign); + M_ASSIGN_CHAR(int_frac_digits); + M_ASSIGN_CHAR(frac_digits); + M_ASSIGN_CHAR(p_cs_precedes); + M_ASSIGN_CHAR(p_sep_by_space); + M_ASSIGN_CHAR(n_cs_precedes); + M_ASSIGN_CHAR(n_sep_by_space); + M_ASSIGN_CHAR(p_sign_posn); + M_ASSIGN_CHAR(n_sign_posn); + M_ASSIGN_CHAR(int_p_cs_precedes); + M_ASSIGN_CHAR(int_n_cs_precedes); + M_ASSIGN_CHAR(int_p_sep_by_space); + M_ASSIGN_CHAR(int_n_sep_by_space); + M_ASSIGN_CHAR(int_p_sign_posn); + M_ASSIGN_CHAR(int_n_sign_posn); + loc->monetary_locale_changed = 0; + } + + if (loc->numeric_locale_changed) { + /* LC_NUMERIC part */ + struct lc_numeric_T * nptr; + +#define N_ASSIGN_STR(NAME) (ret->NAME = (char*)nptr->NAME) + + nptr = __get_current_numeric_locale(loc); + N_ASSIGN_STR(decimal_point); + N_ASSIGN_STR(thousands_sep); + N_ASSIGN_STR(grouping); + loc->numeric_locale_changed = 0; + } + + return ret; +} +struct lconv * +localeconv(void) +{ + return localeconv_l(__get_locale()); +} diff --git a/lib/libc/locale/mblen.3 b/lib/libc/locale/mblen.3 new file mode 100644 index 0000000000000..a491096cbbab2 --- /dev/null +++ b/lib/libc/locale/mblen.3 @@ -0,0 +1,106 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" Donn Seeley of BSDI. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" From @(#)multibyte.3 8.1 (Berkeley) 6/4/93 +.\" From FreeBSD: src/lib/libc/locale/multibyte.3,v 1.22 2003/11/08 03:23:11 tjr Exp +.\" $FreeBSD$ +.\" +.Dd April 11, 2004 +.Dt MBLEN 3 +.Os +.Sh NAME +.Nm mblen +.Nd get number of bytes in a character +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In stdlib.h +.Ft int +.Fn mblen "const char *mbchar" "size_t nbytes" +.Sh DESCRIPTION +The +.Fn mblen +function computes the length in bytes +of a multibyte character +.Fa mbchar +according to the current conversion state. +Up to +.Fa nbytes +bytes are examined. +.Pp +A call with a null +.Fa mbchar +pointer returns nonzero if the current locale requires shift states, +zero otherwise; +if shift states are required, the shift state is reset to the initial state. +.Sh RETURN VALUES +If +.Fa mbchar +is +.Dv NULL , +the +.Fn mblen +function returns nonzero if shift states are supported, +zero otherwise. +.Pp +Otherwise, if +.Fa mbchar +is not a null pointer, +.Fn mblen +either returns 0 if +.Fa mbchar +represents the null wide character, or returns +the number of bytes processed in +.Fa mbchar , +or returns \-1 if no multibyte character +could be recognized or converted. +In this case, +.Fn mblen Ns 's +internal conversion state is undefined. +.Sh ERRORS +The +.Fn mblen +function will fail if: +.Bl -tag -width Er +.It Bq Er EILSEQ +An invalid multibyte sequence was detected. +.It Bq Er EINVAL +The internal conversion state is not valid. +.El +.Sh SEE ALSO +.Xr mbrlen 3 , +.Xr mbtowc 3 , +.Xr multibyte 3 +.Sh STANDARDS +The +.Fn mblen +function conforms to +.St -isoC-99 . diff --git a/lib/libc/locale/mblen.c b/lib/libc/locale/mblen.c new file mode 100644 index 0000000000000..e972298f5b404 --- /dev/null +++ b/lib/libc/locale/mblen.c @@ -0,0 +1,63 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <stdlib.h> +#include <wchar.h> +#include "mblocal.h" + +int +mblen_l(const char *s, size_t n, locale_t locale) +{ + static const mbstate_t initial; + size_t rval; + FIX_LOCALE(locale); + + if (s == NULL) { + /* No support for state dependent encodings. */ + locale->mblen = initial; + return (0); + } + rval = XLOCALE_CTYPE(locale)->__mbrtowc(NULL, s, n, &locale->mblen); + if (rval == (size_t)-1 || rval == (size_t)-2) + return (-1); + return ((int)rval); +} + +int +mblen(const char *s, size_t n) +{ + return mblen_l(s, n, __get_locale()); +} diff --git a/lib/libc/locale/mblocal.h b/lib/libc/locale/mblocal.h new file mode 100644 index 0000000000000..cffe3ba92bc8e --- /dev/null +++ b/lib/libc/locale/mblocal.h @@ -0,0 +1,92 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright 2013 Garrett D'Amore <garrett@damore.org> + * Copyright 2011 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _MBLOCAL_H_ +#define _MBLOCAL_H_ + +#include <runetype.h> +#include "xlocale_private.h" + +#define SS2 0x008e +#define SS3 0x008f + +/* + * Conversion function pointers for current encoding. + */ +struct xlocale_ctype { + struct xlocale_component header; + _RuneLocale *runes; + size_t (*__mbrtowc)(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); + int (*__mbsinit)(const mbstate_t *); + size_t (*__mbsnrtowcs)(wchar_t * __restrict, const char ** __restrict, + size_t, size_t, mbstate_t * __restrict); + size_t (*__wcrtomb)(char * __restrict, wchar_t, mbstate_t * __restrict); + size_t (*__wcsnrtombs)(char * __restrict, const wchar_t ** __restrict, + size_t, size_t, mbstate_t * __restrict); + int __mb_cur_max; + int __mb_sb_limit; +}; +#define XLOCALE_CTYPE(x) ((struct xlocale_ctype*)(x)->components[XLC_CTYPE]) +extern struct xlocale_ctype __xlocale_global_ctype; + +/* + * Rune initialization function prototypes. + */ +int _none_init(struct xlocale_ctype *, _RuneLocale *) __hidden; +int _UTF8_init(struct xlocale_ctype *, _RuneLocale *) __hidden; +int _EUC_CN_init(struct xlocale_ctype *, _RuneLocale *) __hidden; +int _EUC_JP_init(struct xlocale_ctype *, _RuneLocale *) __hidden; +int _EUC_KR_init(struct xlocale_ctype *, _RuneLocale *) __hidden; +int _EUC_TW_init(struct xlocale_ctype *, _RuneLocale *) __hidden; +int _GB18030_init(struct xlocale_ctype *, _RuneLocale *) __hidden; +int _GB2312_init(struct xlocale_ctype *, _RuneLocale *) __hidden; +int _GBK_init(struct xlocale_ctype *, _RuneLocale *) __hidden; +int _BIG5_init(struct xlocale_ctype *, _RuneLocale *) __hidden; +int _MSKanji_init(struct xlocale_ctype *, _RuneLocale *) __hidden; +int _ascii_init(struct xlocale_ctype *, _RuneLocale *) __hidden; + +typedef size_t (*mbrtowc_pfn_t)(wchar_t * __restrict, + const char * __restrict, size_t, mbstate_t * __restrict); +typedef size_t (*wcrtomb_pfn_t)(char * __restrict, wchar_t, + mbstate_t * __restrict); +size_t __mbsnrtowcs_std(wchar_t * __restrict, const char ** __restrict, + size_t, size_t, mbstate_t * __restrict, mbrtowc_pfn_t); +size_t __wcsnrtombs_std(char * __restrict, const wchar_t ** __restrict, + size_t, size_t, mbstate_t * __restrict, wcrtomb_pfn_t); + +#endif /* _MBLOCAL_H_ */ diff --git a/lib/libc/locale/mbrlen.3 b/lib/libc/locale/mbrlen.3 new file mode 100644 index 0000000000000..524d2c7b4016f --- /dev/null +++ b/lib/libc/locale/mbrlen.3 @@ -0,0 +1,145 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd April 7, 2004 +.Dt MBRLEN 3 +.Os +.Sh NAME +.Nm mbrlen +.Nd "get number of bytes in a character (restartable)" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wchar.h +.Ft size_t +.Fn mbrlen "const char * restrict s" "size_t n" "mbstate_t * restrict ps" +.Sh DESCRIPTION +The +.Fn mbrlen +function inspects at most +.Fa n +bytes pointed to by +.Fa s +to determine the number of bytes needed to complete the next +multibyte character. +.Pp +The +.Vt mbstate_t +argument, +.Fa ps , +is used to keep track of the shift state. +If it is +.Dv NULL , +.Fn mbrlen +uses an internal, static +.Vt mbstate_t +object, which is initialized to the initial conversion state +at program startup. +.Pp +It is equivalent to: +.Pp +.Dl "mbrtowc(NULL, s, n, ps);" +.Pp +Except that when +.Fa ps +is a +.Dv NULL +pointer, +.Fn mbrlen +uses its own static, internal +.Vt mbstate_t +object to keep track of the shift state. +.Sh RETURN VALUES +The +.Fn mbrlen +functions returns: +.Bl -tag -width indent +.It 0 +The next +.Fa n +or fewer bytes +represent the null wide character +.Pq Li "L'\e0'" . +.It >0 +The next +.Fa n +or fewer bytes +represent a valid character, +.Fn mbrlen +returns the number of bytes used to complete the multibyte character. +.It Po Vt size_t Pc Ns \-2 +The next +.Fa n +contribute to, but do not complete, a valid multibyte character sequence, +and all +.Fa n +bytes have been processed. +.It Po Vt size_t Pc Ns \-1 +An encoding error has occurred. +The next +.Fa n +or fewer bytes do not contribute to a valid multibyte character. +.El +.Sh EXAMPLES +A function that calculates the number of characters in a multibyte +character string: +.Bd -literal -offset indent +size_t +nchars(const char *s) +{ + size_t charlen, chars; + mbstate_t mbs; + + chars = 0; + memset(&mbs, 0, sizeof(mbs)); + while ((charlen = mbrlen(s, MB_CUR_MAX, &mbs)) != 0 && + charlen != (size_t)-1 && charlen != (size_t)-2) { + s += charlen; + chars++; + } + + return (chars); +} +.Ed +.Sh ERRORS +The +.Fn mbrlen +function will fail if: +.Bl -tag -width Er +.It Bq Er EILSEQ +An invalid multibyte sequence was detected. +.It Bq Er EINVAL +The conversion state is invalid. +.El +.Sh SEE ALSO +.Xr mblen 3 , +.Xr mbrtowc 3 , +.Xr multibyte 3 +.Sh STANDARDS +The +.Fn mbrlen +function conforms to +.St -isoC-99 . diff --git a/lib/libc/locale/mbrlen.c b/lib/libc/locale/mbrlen.c new file mode 100644 index 0000000000000..f84fce7b61b0a --- /dev/null +++ b/lib/libc/locale/mbrlen.c @@ -0,0 +1,53 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <wchar.h> +#include "mblocal.h" + +size_t +mbrlen_l(const char * __restrict s, size_t n, mbstate_t * __restrict ps, locale_t locale) +{ + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->mbrlen; + return (XLOCALE_CTYPE(locale)->__mbrtowc(NULL, s, n, ps)); +} + +size_t +mbrlen(const char * __restrict s, size_t n, mbstate_t * __restrict ps) +{ + return mbrlen_l(s, n, ps, __get_locale()); +} diff --git a/lib/libc/locale/mbrtoc16.c b/lib/libc/locale/mbrtoc16.c new file mode 100644 index 0000000000000..497f1cd028d01 --- /dev/null +++ b/lib/libc/locale/mbrtoc16.c @@ -0,0 +1,91 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <uchar.h> +#include "xlocale_private.h" + +typedef struct { + char16_t trail_surrogate; + mbstate_t c32_mbstate; +} _Char16State; + +size_t +mbrtoc16_l(char16_t * __restrict pc16, const char * __restrict s, size_t n, + mbstate_t * __restrict ps, locale_t locale) +{ + _Char16State *cs; + char32_t c32; + ssize_t len; + + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->mbrtoc16; + cs = (_Char16State *)ps; + + /* + * Call straight into mbrtoc32_l() if we don't need to return a + * character value. According to the spec, if s is a null + * pointer, the value of parameter pc16 is also ignored. + */ + if (pc16 == NULL || s == NULL) { + cs->trail_surrogate = 0; + return (mbrtoc32_l(NULL, s, n, &cs->c32_mbstate, locale)); + } + + /* Return the trail surrogate from the previous invocation. */ + if (cs->trail_surrogate >= 0xdc00 && cs->trail_surrogate <= 0xdfff) { + *pc16 = cs->trail_surrogate; + cs->trail_surrogate = 0; + return ((size_t)-3); + } + + len = mbrtoc32_l(&c32, s, n, &cs->c32_mbstate, locale); + if (len >= 0) { + if (c32 < 0x10000) { + /* Fits in one UTF-16 character. */ + *pc16 = c32; + } else { + /* Split up in a surrogate pair. */ + c32 -= 0x10000; + *pc16 = 0xd800 | (c32 >> 10); + cs->trail_surrogate = 0xdc00 | (c32 & 0x3ff); + } + } + return (len); +} + +size_t +mbrtoc16(char16_t * __restrict pc16, const char * __restrict s, size_t n, + mbstate_t * __restrict ps) +{ + + return (mbrtoc16_l(pc16, s, n, ps, __get_locale())); +} diff --git a/lib/libc/locale/mbrtoc16_iconv.c b/lib/libc/locale/mbrtoc16_iconv.c new file mode 100644 index 0000000000000..f1eaf1925496d --- /dev/null +++ b/lib/libc/locale/mbrtoc16_iconv.c @@ -0,0 +1,8 @@ +/* $FreeBSD$ */ +#define charXX_t char16_t +#define mbrtocXX mbrtoc16 +#define mbrtocXX_l mbrtoc16_l +#define DSTBUF_LEN 2 +#define UTF_XX_INTERNAL "UTF-16-INTERNAL" + +#include "mbrtocXX_iconv.h" diff --git a/lib/libc/locale/mbrtoc32.c b/lib/libc/locale/mbrtoc32.c new file mode 100644 index 0000000000000..d1d8102e16f99 --- /dev/null +++ b/lib/libc/locale/mbrtoc32.c @@ -0,0 +1,55 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <uchar.h> +#include <wchar.h> +#include "xlocale_private.h" + +size_t +mbrtoc32_l(char32_t * __restrict pc32, const char * __restrict s, size_t n, + mbstate_t * __restrict ps, locale_t locale) +{ + + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->mbrtoc32; + + /* Assume wchar_t uses UTF-32. */ + return (mbrtowc_l(pc32, s, n, ps, locale)); +} + +size_t +mbrtoc32(char32_t * __restrict pc32, const char * __restrict s, size_t n, + mbstate_t * __restrict ps) +{ + + return (mbrtoc32_l(pc32, s, n, ps, __get_locale())); +} diff --git a/lib/libc/locale/mbrtoc32_iconv.c b/lib/libc/locale/mbrtoc32_iconv.c new file mode 100644 index 0000000000000..ec2c0145d9d66 --- /dev/null +++ b/lib/libc/locale/mbrtoc32_iconv.c @@ -0,0 +1,8 @@ +/* $FreeBSD$ */ +#define charXX_t char32_t +#define mbrtocXX mbrtoc32 +#define mbrtocXX_l mbrtoc32_l +#define DSTBUF_LEN 1 +#define UTF_XX_INTERNAL "UTF-32-INTERNAL" + +#include "mbrtocXX_iconv.h" diff --git a/lib/libc/locale/mbrtocXX_iconv.h b/lib/libc/locale/mbrtocXX_iconv.h new file mode 100644 index 0000000000000..262818ee79d56 --- /dev/null +++ b/lib/libc/locale/mbrtocXX_iconv.h @@ -0,0 +1,160 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/queue.h> + +#include <assert.h> +#include <errno.h> +#include <langinfo.h> +#include <limits.h> +#include <string.h> +#include <uchar.h> + +#include "../iconv/citrus_hash.h" +#include "../iconv/citrus_module.h" +#include "../iconv/citrus_iconv.h" +#include "xlocale_private.h" + +typedef struct { + bool initialized; + struct _citrus_iconv iconv; + char srcbuf[MB_LEN_MAX]; + size_t srcbuf_len; + union { + charXX_t widechar[DSTBUF_LEN]; + char bytes[sizeof(charXX_t) * DSTBUF_LEN]; + } dstbuf; + size_t dstbuf_len; +} _ConversionState; +_Static_assert(sizeof(_ConversionState) <= sizeof(mbstate_t), + "Size of _ConversionState must not exceed mbstate_t's size."); + +size_t +mbrtocXX_l(charXX_t * __restrict pc, const char * __restrict s, size_t n, + mbstate_t * __restrict ps, locale_t locale) +{ + _ConversionState *cs; + struct _citrus_iconv *handle; + size_t i, retval; + charXX_t retchar; + + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->mbrtocXX; + cs = (_ConversionState *)ps; + handle = &cs->iconv; + + /* Reinitialize mbstate_t. */ + if (s == NULL || !cs->initialized) { + if (_citrus_iconv_open(&handle, + nl_langinfo_l(CODESET, locale), UTF_XX_INTERNAL) != 0) { + cs->initialized = false; + errno = EINVAL; + return (-1); + } + handle->cv_shared->ci_discard_ilseq = true; + handle->cv_shared->ci_hooks = NULL; + cs->srcbuf_len = cs->dstbuf_len = 0; + cs->initialized = true; + if (s == NULL) + return (0); + } + + /* See if we still have characters left from the previous invocation. */ + if (cs->dstbuf_len > 0) { + retval = (size_t)-3; + goto return_char; + } + + /* Fill up the read buffer as far as possible. */ + if (n > sizeof(cs->srcbuf) - cs->srcbuf_len) + n = sizeof(cs->srcbuf) - cs->srcbuf_len; + memcpy(cs->srcbuf + cs->srcbuf_len, s, n); + + /* Convert as few characters to the dst buffer as possible. */ + for (i = 0; ; i++) { + char *src, *dst; + size_t srcleft, dstleft, invlen; + int err; + + src = cs->srcbuf; + srcleft = cs->srcbuf_len + n; + dst = cs->dstbuf.bytes; + dstleft = i * sizeof(charXX_t); + assert(srcleft <= sizeof(cs->srcbuf) && + dstleft <= sizeof(cs->dstbuf.bytes)); + err = _citrus_iconv_convert(handle, &src, &srcleft, + &dst, &dstleft, 0, &invlen); + cs->dstbuf_len = (dst - cs->dstbuf.bytes) / sizeof(charXX_t); + + /* Got new character(s). Return the first. */ + if (cs->dstbuf_len > 0) { + assert(src - cs->srcbuf > cs->srcbuf_len); + retval = src - cs->srcbuf - cs->srcbuf_len; + cs->srcbuf_len = 0; + goto return_char; + } + + /* Increase dst buffer size, to obtain the surrogate pair. */ + if (err == E2BIG) + continue; + + /* Illegal sequence. */ + if (invlen > 0) { + cs->srcbuf_len = 0; + errno = EILSEQ; + return ((size_t)-1); + } + + /* Save unprocessed remainder for the next invocation. */ + memmove(cs->srcbuf, src, srcleft); + cs->srcbuf_len = srcleft; + return ((size_t)-2); + } + +return_char: + retchar = cs->dstbuf.widechar[0]; + memmove(&cs->dstbuf.widechar[0], &cs->dstbuf.widechar[1], + --cs->dstbuf_len * sizeof(charXX_t)); + if (pc != NULL) + *pc = retchar; + if (retchar == 0) + return (0); + return (retval); +} + +size_t +mbrtocXX(charXX_t * __restrict pc, const char * __restrict s, size_t n, + mbstate_t * __restrict ps) +{ + + return (mbrtocXX_l(pc, s, n, ps, __get_locale())); +} diff --git a/lib/libc/locale/mbrtowc.3 b/lib/libc/locale/mbrtowc.3 new file mode 100644 index 0000000000000..22b26cd2a2447 --- /dev/null +++ b/lib/libc/locale/mbrtowc.3 @@ -0,0 +1,179 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd May 21, 2013 +.Dt MBRTOWC 3 +.Os +.Sh NAME +.Nm mbrtowc , +.Nm mbrtoc16 , +.Nm mbrtoc32 +.Nd "convert a character to a wide-character code (restartable)" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wchar.h +.Ft size_t +.Fo mbrtowc +.Fa "wchar_t * restrict pc" "const char * restrict s" "size_t n" +.Fa "mbstate_t * restrict ps" +.Fc +.In uchar.h +.Ft size_t +.Fo mbrtoc16 +.Fa "char16_t * restrict pc" "const char * restrict s" "size_t n" +.Fa "mbstate_t * restrict ps" +.Fc +.Ft size_t +.Fo mbrtoc32 +.Fa "char32_t * restrict pc" "const char * restrict s" "size_t n" +.Fa "mbstate_t * restrict ps" +.Fc +.Sh DESCRIPTION +The +.Fn mbrtowc , +.Fn mbrtoc16 +and +.Fn mbrtoc32 +functions inspect at most +.Fa n +bytes pointed to by +.Fa s +to determine the number of bytes needed to complete the next multibyte +character. +If a character can be completed, and +.Fa pc +is not +.Dv NULL , +the wide character which is represented by +.Fa s +is stored in the +.Vt wchar_t , +.Vt char16_t +or +.Vt char32_t +it points to. +.Pp +If +.Fa s +is +.Dv NULL , +these functions behave as if +.Fa pc +was +.Dv NULL , +.Fa s +was an empty string +.Pq Qq +and +.Fa n +was 1. +.Pp +The +.Vt mbstate_t +argument, +.Fa ps , +is used to keep track of the shift state. +If it is +.Dv NULL , +these functions use an internal, static +.Vt mbstate_t +object, which is initialized to the initial conversion state +at program startup. +.Pp +As a single +.Vt char16_t +is not large enough to represent certain multibyte characters, the +.Fn mbrtoc16 +function may need to be invoked multiple times to convert a single +multibyte character sequence. +.Sh RETURN VALUES +The +.Fn mbrtowc , +.Fn mbrtoc16 +and +.Fn mbrtoc32 +functions return: +.Bl -tag -width indent +.It 0 +The next +.Fa n +or fewer bytes +represent the null wide character +.Pq Li "L'\e0'" . +.It >0 +The next +.Fa n +or fewer bytes represent a valid character, these functions +return the number of bytes used to complete the multibyte character. +.It Po Vt size_t Pc Ns \-1 +An encoding error has occurred. +The next +.Fa n +or fewer bytes do not contribute to a valid multibyte character. +.It Po Vt size_t Pc Ns \-2 +The next +.Fa n +contribute to, but do not complete, a valid multibyte character sequence, +and all +.Fa n +bytes have been processed. +.El +.Pp +The +.Fn mbrtoc16 +function also returns: +.Bl -tag -width indent +.It Po Vt size_t Pc Ns \-3 +The next character resulting from a previous call has been stored. +No bytes from the input have been consumed. +.El +.Sh ERRORS +The +.Fn mbrtowc , +.Fn mbrtoc16 +and +.Fn mbrtoc32 +functions will fail if: +.Bl -tag -width Er +.It Bq Er EILSEQ +An invalid multibyte sequence was detected. +.It Bq Er EINVAL +The conversion state is invalid. +.El +.Sh SEE ALSO +.Xr mbtowc 3 , +.Xr multibyte 3 , +.Xr setlocale 3 , +.Xr wcrtomb 3 +.Sh STANDARDS +The +.Fn mbrtowc , +.Fn mbrtoc16 +and +.Fn mbrtoc32 +functions conform to +.St -isoC-2011 . diff --git a/lib/libc/locale/mbrtowc.c b/lib/libc/locale/mbrtowc.c new file mode 100644 index 0000000000000..4171886c8efa0 --- /dev/null +++ b/lib/libc/locale/mbrtowc.c @@ -0,0 +1,55 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <wchar.h> +#include "mblocal.h" + +size_t +mbrtowc_l(wchar_t * __restrict pwc, const char * __restrict s, + size_t n, mbstate_t * __restrict ps, locale_t locale) +{ + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->mbrtowc; + return (XLOCALE_CTYPE(locale)->__mbrtowc(pwc, s, n, ps)); +} + +size_t +mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, + size_t n, mbstate_t * __restrict ps) +{ + return mbrtowc_l(pwc, s, n, ps, __get_locale()); +} diff --git a/lib/libc/locale/mbsinit.3 b/lib/libc/locale/mbsinit.3 new file mode 100644 index 0000000000000..afc2500c3a771 --- /dev/null +++ b/lib/libc/locale/mbsinit.3 @@ -0,0 +1,67 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd April 8, 2004 +.Dt MBSINIT 3 +.Os +.Sh NAME +.Nm mbsinit +.Nd "determine conversion object status" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wchar.h +.Ft int +.Fn mbsinit "const mbstate_t *ps" +.Sh DESCRIPTION +The +.Fn mbsinit +function determines whether the +.Vt mbstate_t +object pointed to by +.Fa ps +describes an initial conversion state. +.Sh RETURN VALUES +The +.Fn mbsinit +function returns non-zero if +.Fa ps +is +.Dv NULL +or describes an initial conversion state, +otherwise it returns zero. +.Sh SEE ALSO +.Xr mbrlen 3 , +.Xr mbrtowc 3 , +.Xr mbsrtowcs 3 , +.Xr multibyte 3 , +.Xr wcrtomb 3 , +.Xr wcsrtombs 3 +.Sh STANDARDS +The +.Fn mbsinit +function conforms to +.St -isoC-99 . diff --git a/lib/libc/locale/mbsinit.c b/lib/libc/locale/mbsinit.c new file mode 100644 index 0000000000000..a7df81911b267 --- /dev/null +++ b/lib/libc/locale/mbsinit.c @@ -0,0 +1,50 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <wchar.h> +#include "mblocal.h" + +int +mbsinit_l(const mbstate_t *ps, locale_t locale) +{ + FIX_LOCALE(locale); + return (XLOCALE_CTYPE(locale)->__mbsinit(ps)); +} +int +mbsinit(const mbstate_t *ps) +{ + return mbsinit_l(ps, __get_locale()); +} diff --git a/lib/libc/locale/mbsnrtowcs.c b/lib/libc/locale/mbsnrtowcs.c new file mode 100644 index 0000000000000..59574386b0afb --- /dev/null +++ b/lib/libc/locale/mbsnrtowcs.c @@ -0,0 +1,106 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright 2013 Garrett D'Amore <garrett@damore.org> + * Copyright 2010 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 2002-2004 Tim J. Robbins. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <limits.h> +#include <stdlib.h> +#include <wchar.h> +#include "mblocal.h" + +size_t +mbsnrtowcs_l(wchar_t * __restrict dst, const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps, locale_t locale) +{ + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->mbsnrtowcs; + return (XLOCALE_CTYPE(locale)->__mbsnrtowcs(dst, src, nms, len, ps)); +} +size_t +mbsnrtowcs(wchar_t * __restrict dst, const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps) +{ + return mbsnrtowcs_l(dst, src, nms, len, ps, __get_locale()); +} + +size_t +__mbsnrtowcs_std(wchar_t * __restrict dst, const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps, + mbrtowc_pfn_t pmbrtowc) +{ + const char *s; + size_t nchr; + wchar_t wc; + size_t nb; + + s = *src; + nchr = 0; + + if (dst == NULL) { + for (;;) { + if ((nb = pmbrtowc(&wc, s, nms, ps)) == (size_t)-1) + /* Invalid sequence - mbrtowc() sets errno. */ + return ((size_t)-1); + else if (nb == 0 || nb == (size_t)-2) + return (nchr); + s += nb; + nms -= nb; + nchr++; + } + /*NOTREACHED*/ + } + + while (len-- > 0) { + if ((nb = pmbrtowc(dst, s, nms, ps)) == (size_t)-1) { + *src = s; + return ((size_t)-1); + } else if (nb == (size_t)-2) { + *src = s + nms; + return (nchr); + } else if (nb == 0) { + *src = NULL; + return (nchr); + } + s += nb; + nms -= nb; + nchr++; + dst++; + } + *src = s; + return (nchr); +} diff --git a/lib/libc/locale/mbsrtowcs.3 b/lib/libc/locale/mbsrtowcs.3 new file mode 100644 index 0000000000000..eaf9f9093056f --- /dev/null +++ b/lib/libc/locale/mbsrtowcs.3 @@ -0,0 +1,135 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.Dd July 21, 2004 +.Dt MBSRTOWCS 3 +.Os +.Sh NAME +.Nm mbsrtowcs , +.Nm mbsnrtowcs +.Nd "convert a character string to a wide-character string (restartable)" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wchar.h +.Ft size_t +.Fo mbsrtowcs +.Fa "wchar_t * restrict dst" "const char ** restrict src" "size_t len" +.Fa "mbstate_t * restrict ps" +.Fc +.Ft size_t +.Fo mbsnrtowcs +.Fa "wchar_t * restrict dst" "const char ** restrict src" "size_t nms" +.Fa "size_t len" "mbstate_t * restrict ps" +.Fc +.Sh DESCRIPTION +The +.Fn mbsrtowcs +function converts a sequence of multibyte characters pointed to indirectly by +.Fa src +into a sequence of corresponding wide characters and stores at most +.Fa len +of them in the +.Vt wchar_t +array pointed to by +.Fa dst , +until it encounters a terminating null character +.Pq Li '\e0' . +.Pp +If +.Fa dst +is +.Dv NULL , +no characters are stored. +.Pp +If +.Fa dst +is not +.Dv NULL , +the pointer pointed to by +.Fa src +is updated to point to the character after the one that conversion stopped at. +If conversion stops because a null character is encountered, +.Fa *src +is set to +.Dv NULL . +.Pp +The +.Vt mbstate_t +argument, +.Fa ps , +is used to keep track of the shift state. +If it is +.Dv NULL , +.Fn mbsrtowcs +uses an internal, static +.Vt mbstate_t +object, which is initialized to the initial conversion state +at program startup. +.Pp +The +.Fn mbsnrtowcs +function behaves identically to +.Fn mbsrtowcs , +except that conversion stops after reading at most +.Fa nms +bytes from the buffer pointed to by +.Fa src . +.Sh RETURN VALUES +The +.Fn mbsrtowcs +and +.Fn mbsnrtowcs +functions return the number of wide characters stored in +the array pointed to by +.Fa dst +if successful, otherwise it returns +.Po Vt size_t Pc Ns \-1 . +.Sh ERRORS +The +.Fn mbsrtowcs +and +.Fn mbsnrtowcs +functions will fail if: +.Bl -tag -width Er +.It Bq Er EILSEQ +An invalid multibyte character sequence was encountered. +.It Bq Er EINVAL +The conversion state is invalid. +.El +.Sh SEE ALSO +.Xr mbrtowc 3 , +.Xr mbstowcs 3 , +.Xr multibyte 3 , +.Xr wcsrtombs 3 +.Sh STANDARDS +The +.Fn mbsrtowcs +function conforms to +.St -isoC-99 . +.Pp +The +.Fn mbsnrtowcs +function is an extension to the standard. diff --git a/lib/libc/locale/mbsrtowcs.c b/lib/libc/locale/mbsrtowcs.c new file mode 100644 index 0000000000000..aefbee1c2d09d --- /dev/null +++ b/lib/libc/locale/mbsrtowcs.c @@ -0,0 +1,57 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <limits.h> +#include <stdlib.h> +#include <wchar.h> +#include "mblocal.h" + +size_t +mbsrtowcs_l(wchar_t * __restrict dst, const char ** __restrict src, size_t len, + mbstate_t * __restrict ps, locale_t locale) +{ + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->mbsrtowcs; + return (XLOCALE_CTYPE(locale)->__mbsnrtowcs(dst, src, SIZE_T_MAX, len, ps)); +} +size_t +mbsrtowcs(wchar_t * __restrict dst, const char ** __restrict src, size_t len, + mbstate_t * __restrict ps) +{ + return mbsrtowcs_l(dst, src, len, ps, __get_locale()); +} diff --git a/lib/libc/locale/mbstowcs.3 b/lib/libc/locale/mbstowcs.3 new file mode 100644 index 0000000000000..d74d6a5df7480 --- /dev/null +++ b/lib/libc/locale/mbstowcs.3 @@ -0,0 +1,87 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" Donn Seeley of BSDI. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" From @(#)multibyte.3 8.1 (Berkeley) 6/4/93 +.\" From FreeBSD: src/lib/libc/locale/multibyte.3,v 1.22 2003/11/08 03:23:11 tjr Exp +.\" $FreeBSD$ +.\" +.Dd April 8, 2004 +.Dt MBSTOWCS 3 +.Os +.Sh NAME +.Nm mbstowcs +.Nd convert a character string to a wide-character string +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In stdlib.h +.Ft size_t +.Fo mbstowcs +.Fa "wchar_t * restrict wcstring" "const char * restrict mbstring" +.Fa "size_t nwchars" +.Fc +.Sh DESCRIPTION +The +.Fn mbstowcs +function converts a multibyte character string +.Fa mbstring +beginning in the initial conversion state +into a wide character string +.Fa wcstring . +No more than +.Fa nwchars +wide characters are stored. +A terminating null wide character is appended if there is room. +.Sh RETURN VALUES +The +.Fn mbstowcs +function returns the number of wide characters converted, +not counting any terminating null wide character, or \-1 +if an invalid multibyte character was encountered. +.Sh ERRORS +The +.Fn mbstowcs +function will fail if: +.Bl -tag -width Er +.It Bq Er EILSEQ +An invalid multibyte sequence was detected. +.It Bq Er EINVAL +The conversion state is invalid. +.El +.Sh SEE ALSO +.Xr mbsrtowcs 3 , +.Xr mbtowc 3 , +.Xr multibyte 3 +.Sh STANDARDS +The +.Fn mbstowcs +function conforms to +.St -isoC-99 . diff --git a/lib/libc/locale/mbstowcs.c b/lib/libc/locale/mbstowcs.c new file mode 100644 index 0000000000000..3226d22ea254a --- /dev/null +++ b/lib/libc/locale/mbstowcs.c @@ -0,0 +1,58 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <limits.h> +#include <stdlib.h> +#include <wchar.h> +#include "mblocal.h" + +size_t +mbstowcs_l(wchar_t * __restrict pwcs, const char * __restrict s, size_t n, locale_t locale) +{ + static const mbstate_t initial; + mbstate_t mbs; + const char *sp; + FIX_LOCALE(locale); + + mbs = initial; + sp = s; + return (XLOCALE_CTYPE(locale)->__mbsnrtowcs(pwcs, &sp, SIZE_T_MAX, n, &mbs)); +} +size_t +mbstowcs(wchar_t * __restrict pwcs, const char * __restrict s, size_t n) +{ + return mbstowcs_l(pwcs, s, n, __get_locale()); +} diff --git a/lib/libc/locale/mbtowc.3 b/lib/libc/locale/mbtowc.3 new file mode 100644 index 0000000000000..859cdf819c408 --- /dev/null +++ b/lib/libc/locale/mbtowc.3 @@ -0,0 +1,114 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" Donn Seeley of BSDI. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" From @(#)multibyte.3 8.1 (Berkeley) 6/4/93 +.\" From FreeBSD: src/lib/libc/locale/multibyte.3,v 1.22 2003/11/08 03:23:11 tjr Exp +.\" $FreeBSD$ +.\" +.Dd April 11, 2004 +.Dt MBTOWC 3 +.Os +.Sh NAME +.Nm mbtowc +.Nd convert a character to a wide-character code +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In stdlib.h +.Ft int +.Fo mbtowc +.Fa "wchar_t * restrict wcharp" "const char * restrict mbchar" +.Fa "size_t nbytes" +.Fc +.Sh DESCRIPTION +The +.Fn mbtowc +function converts a multibyte character +.Fa mbchar +into a wide character according to the current conversion state, +and stores the result +in the object pointed to by +.Fa wcharp . +Up to +.Fa nbytes +bytes are examined. +.Pp +A call with a null +.Fa mbchar +pointer returns nonzero if the current encoding requires shift states, +zero otherwise; +if shift states are required, the shift state is reset to the initial state. +.Sh RETURN VALUES +If +.Fa mbchar +is +.Dv NULL , +the +.Fn mbtowc +function returns nonzero if shift states are supported, +zero otherwise. +.Pp +Otherwise, if +.Fa mbchar +is not a null pointer, +.Fn mbtowc +either returns 0 if +.Fa mbchar +represents the null wide character, or returns +the number of bytes processed in +.Fa mbchar , +or returns \-1 if no multibyte character +could be recognized or converted. +In this case, +.Fn mbtowc Ns 's +internal conversion state is undefined. +.Sh ERRORS +The +.Fn mbtowc +function will fail if: +.Bl -tag -width Er +.It Bq Er EILSEQ +An invalid multibyte sequence was detected. +.It Bq Er EINVAL +The internal conversion state is invalid. +.El +.Sh SEE ALSO +.Xr btowc 3 , +.Xr mblen 3 , +.Xr mbrtowc 3 , +.Xr mbstowcs 3 , +.Xr multibyte 3 , +.Xr wctomb 3 +.Sh STANDARDS +The +.Fn mbtowc +function conforms to +.St -isoC-99 . diff --git a/lib/libc/locale/mbtowc.c b/lib/libc/locale/mbtowc.c new file mode 100644 index 0000000000000..df1b204187e92 --- /dev/null +++ b/lib/libc/locale/mbtowc.c @@ -0,0 +1,69 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <stdlib.h> +#include <wchar.h> +#include "mblocal.h" + +int +mbtowc_l(wchar_t * __restrict pwc, const char * __restrict s, size_t n, locale_t locale) +{ + static const mbstate_t initial; + size_t rval; + FIX_LOCALE(locale); + + if (s == NULL) { + /* No support for state dependent encodings. */ + locale->mbtowc = initial; + return (0); + } + rval = XLOCALE_CTYPE(locale)->__mbrtowc(pwc, s, n, &locale->mbtowc); + switch (rval) { + case (size_t)-2: + errno = EILSEQ; + /* FALLTHROUGH */ + case (size_t)-1: + return (-1); + default: + return ((int)rval); + } +} +int +mbtowc(wchar_t * __restrict pwc, const char * __restrict s, size_t n) +{ + return mbtowc_l(pwc, s, n, __get_locale()); +} diff --git a/lib/libc/locale/mskanji.5 b/lib/libc/locale/mskanji.5 new file mode 100644 index 0000000000000..8ebaccdd7af2b --- /dev/null +++ b/lib/libc/locale/mskanji.5 @@ -0,0 +1,70 @@ +.\" Copyright (c) 2002, 2003 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd August 7, 2003 +.Dt MSKANJI 5 +.Os +.Sh NAME +.Nm mskanji +.Nd "Shift-JIS (MS Kanji) encoding for Japanese text" +.Sh SYNOPSIS +.Nm ENCODING +.Qq MSKanji +.Sh DESCRIPTION +Shift-JIS, also known as MS Kanji or SJIS, is an encoding system for +Japanese characters, developed by Microsoft Corporation. +It encodes the characters from the +.Tn JIS +X 0201 (ASCII/JIS-Roman) and +.Tn JIS +X 0208 (Japanese) character sets as sequences of either one or two bytes. +.Pp +Characters from the +.Tn ASCII Ns +/JIS-Roman character set are encoded as single bytes between 0x00 and 0x7F +(ASCII) or 0xA1 and 0xDF (Half-width katakana). +.Pp +Characters from the +.Tn JIS +X 0208 character set are encoded as two bytes. +The first ranges from +0x81 - 0x9F, 0xE0 - 0xEA, 0xED - 0xEE (not +.Tn JIS : +.Tn NEC Ns - Ns +selected +.Tn IBM +extended characters), +0xF0 - 0xF9 (not +.Tn JIS : +user defined), +or 0xFA - 0xFC (not +.Tn JIS : +.Tn IBM +extended characters). +The second byte ranges from 0x40 - 0xFC, excluding 0x7F (delete). +.Sh SEE ALSO +.Xr euc 5 , +.Xr utf8 5 diff --git a/lib/libc/locale/mskanji.c b/lib/libc/locale/mskanji.c new file mode 100644 index 0000000000000..1585886cea311 --- /dev/null +++ b/lib/libc/locale/mskanji.c @@ -0,0 +1,193 @@ +/*- + * SPDX-License-Identifier: BSD-4-Clause + * + * Copyright 2013 Garrett D'Amore <garrett@damore.org> + * Copyright 2010 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. + * + * ja_JP.SJIS locale table for BSD4.4/rune + * version 1.0 + * (C) Sin'ichiro MIYATANI / Phase One, Inc + * May 12, 1995 + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. All advertising materials mentioning features or use of this software + * must display the following acknowledgement: + * This product includes software developed by Phase One, Inc. + * 4. The name of Phase One, Inc. may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "@(#)mskanji.c 1.0 (Phase One) 5/5/95"; +#endif /* LIBC_SCCS and not lint */ +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/types.h> +#include <errno.h> +#include <runetype.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include "mblocal.h" + +extern int __mb_sb_limit; + +static size_t _MSKanji_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); +static int _MSKanji_mbsinit(const mbstate_t *); +static size_t _MSKanji_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); +static size_t _MSKanji_mbsnrtowcs(wchar_t * __restrict, + const char ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _MSKanji_wcsnrtombs(char * __restrict, + const wchar_t ** __restrict, size_t, size_t, + mbstate_t * __restrict); + +typedef struct { + wchar_t ch; +} _MSKanjiState; + +int +_MSKanji_init(struct xlocale_ctype *l, _RuneLocale *rl) +{ + + l->__mbrtowc = _MSKanji_mbrtowc; + l->__wcrtomb = _MSKanji_wcrtomb; + l->__mbsnrtowcs = _MSKanji_mbsnrtowcs; + l->__wcsnrtombs = _MSKanji_wcsnrtombs; + l->__mbsinit = _MSKanji_mbsinit; + l->runes = rl; + l->__mb_cur_max = 2; + l->__mb_sb_limit = 224; + return (0); +} + +static int +_MSKanji_mbsinit(const mbstate_t *ps) +{ + + return (ps == NULL || ((const _MSKanjiState *)ps)->ch == 0); +} + +static size_t +_MSKanji_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, size_t n, + mbstate_t * __restrict ps) +{ + _MSKanjiState *ms; + wchar_t wc; + + ms = (_MSKanjiState *)ps; + + if ((ms->ch & ~0xFF) != 0) { + /* Bad conversion state. */ + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) { + s = ""; + n = 1; + pwc = NULL; + } + + if (n == 0) + /* Incomplete multibyte sequence */ + return ((size_t)-2); + + if (ms->ch != 0) { + if (*s == '\0') { + errno = EILSEQ; + return ((size_t)-1); + } + wc = (ms->ch << 8) | (*s & 0xFF); + if (pwc != NULL) + *pwc = wc; + ms->ch = 0; + return (1); + } + wc = *s++ & 0xff; + if ((wc > 0x80 && wc < 0xa0) || (wc >= 0xe0 && wc < 0xfd)) { + if (n < 2) { + /* Incomplete multibyte sequence */ + ms->ch = wc; + return ((size_t)-2); + } + if (*s == '\0') { + errno = EILSEQ; + return ((size_t)-1); + } + wc = (wc << 8) | (*s++ & 0xff); + if (pwc != NULL) + *pwc = wc; + return (2); + } else { + if (pwc != NULL) + *pwc = wc; + return (wc == L'\0' ? 0 : 1); + } +} + +static size_t +_MSKanji_wcrtomb(char * __restrict s, wchar_t wc, mbstate_t * __restrict ps) +{ + _MSKanjiState *ms; + int len, i; + + ms = (_MSKanjiState *)ps; + + if (ms->ch != 0) { + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) + /* Reset to initial shift state (no-op) */ + return (1); + len = (wc > 0x100) ? 2 : 1; + for (i = len; i-- > 0; ) + *s++ = wc >> (i << 3); + return (len); +} + +static size_t +_MSKanji_mbsnrtowcs(wchar_t * __restrict dst, + const char ** __restrict src, size_t nms, + size_t len, mbstate_t * __restrict ps) +{ + return (__mbsnrtowcs_std(dst, src, nms, len, ps, _MSKanji_mbrtowc)); +} + +static size_t +_MSKanji_wcsnrtombs(char * __restrict dst, + const wchar_t ** __restrict src, size_t nwc, + size_t len, mbstate_t * __restrict ps) +{ + return (__wcsnrtombs_std(dst, src, nwc, len, ps, _MSKanji_wcrtomb)); +} diff --git a/lib/libc/locale/multibyte.3 b/lib/libc/locale/multibyte.3 new file mode 100644 index 0000000000000..49abda8cc02ca --- /dev/null +++ b/lib/libc/locale/multibyte.3 @@ -0,0 +1,142 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" Donn Seeley of BSDI. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)multibyte.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd April 8, 2004 +.Dt MULTIBYTE 3 +.Os +.Sh NAME +.Nm multibyte +.Nd multibyte and wide character manipulation functions +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In limits.h +.In stdlib.h +.In wchar.h +.Sh DESCRIPTION +The basic elements of some written natural languages, such as Chinese, +cannot be represented uniquely with single C +.Vt char Ns s . +The C standard supports two different ways of dealing with +extended natural language encodings: +wide characters and +multibyte characters. +Wide characters are an internal representation +which allows each basic element to map +to a single object of type +.Vt wchar_t . +Multibyte characters are used for input and output +and code each basic element as a sequence of C +.Vt char Ns s . +Individual basic elements may map into one or more +(up to +.Dv MB_LEN_MAX ) +bytes in a multibyte character. +.Pp +The current locale +.Pq Xr setlocale 3 +governs the interpretation of wide and multibyte characters. +The locale category +.Dv LC_CTYPE +specifically controls this interpretation. +The +.Vt wchar_t +type is wide enough to hold the largest value +in the wide character representations for all locales. +.Pp +Multibyte strings may contain +.Sq shift +indicators to switch to and from +particular modes within the given representation. +If explicit bytes are used to signal shifting, +these are not recognized as separate characters +but are lumped with a neighboring character. +There is always a distinguished +.Sq initial +shift state. +Some functions (e.g., +.Xr mblen 3 , +.Xr mbtowc 3 +and +.Xr wctomb 3 ) +maintain static shift state internally, whereas +others store it in an +.Vt mbstate_t +object passed by the caller. +Shift states are undefined after a call to +.Xr setlocale 3 +with the +.Dv LC_CTYPE +or +.Dv LC_ALL +categories. +.Pp +For convenience in processing, +the wide character with value 0 +(the null wide character) +is recognized as the wide character string terminator, +and the character with value 0 +(the null byte) +is recognized as the multibyte character string terminator. +Null bytes are not permitted within multibyte characters. +.Pp +The C library provides the following functions for dealing with +multibyte characters: +.Bl -column "Description" +.It Sy "Function Description" +.It Xr mblen 3 Ta "get number of bytes in a character" +.It Xr mbrlen 3 Ta "get number of bytes in a character (restartable)" +.It Xr mbrtowc 3 Ta "convert a character to a wide-character code (restartable)" +.It Xr mbsrtowcs 3 Ta "convert a character string to a wide-character string (restartable)" +.It Xr mbstowcs 3 Ta "convert a character string to a wide-character string" +.It Xr mbtowc 3 Ta "convert a character to a wide-character code" +.It Xr wcrtomb 3 Ta "convert a wide-character code to a character (restartable)" +.It Xr wcstombs 3 Ta "convert a wide-character string to a character string" +.It Xr wcsrtombs 3 Ta "convert a wide-character string to a character string (restartable)" +.It Xr wctomb 3 Ta "convert a wide-character code to a character" +.El +.Sh SEE ALSO +.Xr mklocale 1 , +.Xr setlocale 3 , +.Xr stdio 3 , +.Xr big5 5 , +.Xr euc 5 , +.Xr gb18030 5 , +.Xr gb2312 5 , +.Xr gbk 5 , +.Xr mskanji 5 , +.Xr utf8 5 +.Sh STANDARDS +These functions conform to +.St -isoC-99 . diff --git a/lib/libc/locale/newlocale.3 b/lib/libc/locale/newlocale.3 new file mode 100644 index 0000000000000..c7414be73d3a8 --- /dev/null +++ b/lib/libc/locale/newlocale.3 @@ -0,0 +1,112 @@ +.\" Copyright (c) 2011 The FreeBSD Foundation +.\" All rights reserved. +.\" +.\" This documentation was written by David Chisnall under sponsorship from +.\" the FreeBSD Foundation. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.Dd September 17, 2011 +.Dt NEWLOCALE 3 +.Os +.Sh NAME +.Nm newlocale +.Nd Creates a new locale +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In locale.h +.Ft locale_t +.Fn newlocale "int mask" "const char * locale" "locale_t base" +.Sh DESCRIPTION +Creates a new locale, inheriting some properties from an existing locale. +The +.Fa mask +defines the components that the new locale will have set to the locale with the +name specified in the +.Fa locale +parameter. +Any other components will be inherited from +.Fa base . +The +.Fa mask +is either +.Fa LC_ALL_MASK , +indicating all possible locale components, +or the logical OR of some combination of the following: +.Bl -tag -width "LC_MESSAGES_MASK" -offset indent +.It LC_COLLATE_MASK +The locale for string collation routines. +This controls alphabetic ordering in +.Xr strcoll 3 +and +.Xr strxfrm 3 . +.It LC_CTYPE_MASK +The locale for the +.Xr ctype 3 +and +.Xr multibyte 3 +functions. +This controls recognition of upper and lower case, alphabetic or +non-alphabetic characters, and so on. +.It LC_MESSAGES_MASK +Set a locale for message catalogs, see +.Xr catopen 3 +function. +.It LC_MONETARY_MASK +Set a locale for formatting monetary values; this affects +the +.Xr localeconv 3 +function. +.It LC_NUMERIC_MASK +Set a locale for formatting numbers. +This controls the formatting of decimal points in input and output of floating +point numbers in functions such as +.Xr printf 3 +and +.Xr scanf 3 , +as well as values returned by +.Xr localeconv 3 . +.It LC_TIME_MASK +Set a locale for formatting dates and times using the +.Xr strftime 3 +function. +.El +This function uses the same rules for loading locale components as +.Xr setlocale 3 . +.Sh RETURN VALUES +Returns a new, valid, +.Fa locale_t +or NULL if an error occurs. +You must free the returned locale with +.Xr freelocale 3 . +.Sh SEE ALSO +.Xr duplocale 3 , +.Xr freelocale 3 , +.Xr localeconv 3 , +.Xr querylocale 3 , +.Xr uselocale 3 , +.Xr xlocale 3 +.Sh STANDARDS +This function conforms to +.St -p1003.1-2008 . diff --git a/lib/libc/locale/nextwctype.3 b/lib/libc/locale/nextwctype.3 new file mode 100644 index 0000000000000..0f6f88079f642 --- /dev/null +++ b/lib/libc/locale/nextwctype.3 @@ -0,0 +1,67 @@ +.\" +.\" Copyright (c) 2004 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd July 21, 2005 +.Dt NEXTWCTYPE 3 +.Os +.Sh NAME +.Nm nextwctype +.Nd "iterate through character classes" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wctype.h +.Ft wint_t +.Fn nextwctype "wint_t ch" "wctype_t wct" +.Sh DESCRIPTION +The +.Fn nextwctype +function determines the next character after +.Fa ch +that is a member of character class +.Fa wct . +If +.Fa ch +is \-1, the search begins at the first member of +.Fa wct . +.Sh RETURN VALUES +The +.Fn nextwctype +function returns the next character, or \-1 if there are no more. +.Sh COMPATIBILITY +This function is a non-standard +.Fx +extension and should not be used where the standard +.Fn iswctype +function would suffice. +.Sh SEE ALSO +.Xr wctype 3 +.Sh HISTORY +The +.Fn nextwctype +function appeared in +.Fx 5.4 . diff --git a/lib/libc/locale/nextwctype.c b/lib/libc/locale/nextwctype.c new file mode 100644 index 0000000000000..3ef73ddec7b6d --- /dev/null +++ b/lib/libc/locale/nextwctype.c @@ -0,0 +1,105 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <runetype.h> +#include <wchar.h> +#include <wctype.h> +#include "mblocal.h" + +wint_t +nextwctype_l(wint_t wc, wctype_t wct, locale_t locale) +{ + size_t lim; + FIX_LOCALE(locale); + _RuneLocale *runes = XLOCALE_CTYPE(locale)->runes; + _RuneRange *rr = &runes->__runetype_ext; + _RuneEntry *base, *re; + int noinc; + + noinc = 0; + if (wc < _CACHED_RUNES) { + wc++; + while (wc < _CACHED_RUNES) { + if (runes->__runetype[wc] & wct) + return (wc); + wc++; + } + wc--; + } + if (rr->__ranges != NULL && wc < rr->__ranges[0].__min) { + wc = rr->__ranges[0].__min; + noinc = 1; + } + + /* Binary search -- see bsearch.c for explanation. */ + base = rr->__ranges; + for (lim = rr->__nranges; lim != 0; lim >>= 1) { + re = base + (lim >> 1); + if (re->__min <= wc && wc <= re->__max) + goto found; + else if (wc > re->__max) { + base = re + 1; + lim--; + } + } + return (-1); +found: + if (!noinc) + wc++; + if (re->__min <= wc && wc <= re->__max) { + if (re->__types != NULL) { + for (; wc <= re->__max; wc++) + if (re->__types[wc - re->__min] & wct) + return (wc); + } else if (re->__map & wct) + return (wc); + } + while (++re < rr->__ranges + rr->__nranges) { + wc = re->__min; + if (re->__types != NULL) { + for (; wc <= re->__max; wc++) + if (re->__types[wc - re->__min] & wct) + return (wc); + } else if (re->__map & wct) + return (wc); + } + return (-1); +} +wint_t +nextwctype(wint_t wc, wctype_t wct) +{ + return nextwctype_l(wc, wct, __get_locale()); +} diff --git a/lib/libc/locale/nl_langinfo.3 b/lib/libc/locale/nl_langinfo.3 new file mode 100644 index 0000000000000..d8c01b2f135f3 --- /dev/null +++ b/lib/libc/locale/nl_langinfo.3 @@ -0,0 +1,102 @@ +.\" Copyright (c) 2001 Alexey Zelkin <phantom@FreeBSD.org> +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd July 30, 2012 +.Dt NL_LANGINFO 3 +.Os +.Sh NAME +.Nm nl_langinfo +.Nd language information +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In langinfo.h +.Ft char * +.Fn nl_langinfo "nl_item item" +.Ft char * +.Fn nl_langinfo_l "nl_item item" "locale_t loc" +.Sh DESCRIPTION +The +.Fn nl_langinfo +function returns a pointer to a string containing information relevant to +the particular language or cultural area defined in the program or thread's +locale, or in the case of +.Fn nl_langinfo_l , +the locale passed as the second argument. +The manifest constant names and values of +.Fa item +are defined in +.In langinfo.h . +.Pp +Calls to +.Fn setlocale +with a category corresponding to the category of +.Fa item , +or to the +category +.Dv LC_ALL , +may overwrite the buffer pointed to by the return value. +.Sh RETURN VALUES +In a locale where langinfo data is not defined, +.Fn nl_langinfo +returns a pointer to the corresponding string in the +.Tn POSIX +locale. +.Fn nl_langinfo_l +returns the same values as +.Fn nl_langinfo . +In all locales, +.Fn nl_langinfo +returns a pointer to an empty string if +.Fa item +contains an invalid setting. +.Sh EXAMPLES +For example: +.Pp +.Dl nl_langinfo(ABDAY_1) +.Pp +would return a pointer to the string +.Qq Li Dom +if the identified language was +Portuguese, and +.Qq Li Sun +if the identified language was English. +.Sh SEE ALSO +.Xr setlocale 3 +.Sh STANDARDS +The +.Fn nl_langinfo +function conforms to +.St -susv2 . +The +.Fn nl_langinfo_l +function conforms to +.St -p1003.1-2008 . +.Sh HISTORY +The +.Fn nl_langinfo +function first appeared in +.Fx 4.6 . diff --git a/lib/libc/locale/nl_langinfo.c b/lib/libc/locale/nl_langinfo.c new file mode 100644 index 0000000000000..b51b0425515bc --- /dev/null +++ b/lib/libc/locale/nl_langinfo.c @@ -0,0 +1,209 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2001, 2003 Alexey Zelkin <phantom@FreeBSD.org> + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <langinfo.h> +#include <limits.h> +#include <locale.h> +#include <stdlib.h> +#include <string.h> +#include <runetype.h> +#include <wchar.h> + +#include "mblocal.h" +#include "lnumeric.h" +#include "lmessages.h" +#include "lmonetary.h" +#include "../stdtime/timelocal.h" + +#define _REL(BASE) ((int)item-BASE) + +char * +nl_langinfo_l(nl_item item, locale_t loc) +{ + char *ret, *cs; + const char *s; + FIX_LOCALE(loc); + + switch (item) { + case CODESET: + s = XLOCALE_CTYPE(loc)->runes->__encoding; + if (strcmp(s, "EUC-CN") == 0) + ret = "eucCN"; + else if (strcmp(s, "EUC-JP") == 0) + ret = "eucJP"; + else if (strcmp(s, "EUC-KR") == 0) + ret = "eucKR"; + else if (strcmp(s, "EUC-TW") == 0) + ret = "eucTW"; + else if (strcmp(s, "BIG5") == 0) + ret = "Big5"; + else if (strcmp(s, "MSKanji") == 0) + ret = "SJIS"; + else if (strcmp(s, "NONE") == 0) + ret = "US-ASCII"; + else if (strncmp(s, "NONE:", 5) == 0) + ret = (char *)(s + 5); + else + ret = (char *)s; + break; + case D_T_FMT: + ret = (char *) __get_current_time_locale(loc)->c_fmt; + break; + case D_FMT: + ret = (char *) __get_current_time_locale(loc)->x_fmt; + break; + case T_FMT: + ret = (char *) __get_current_time_locale(loc)->X_fmt; + break; + case T_FMT_AMPM: + ret = (char *) __get_current_time_locale(loc)->ampm_fmt; + break; + case AM_STR: + ret = (char *) __get_current_time_locale(loc)->am; + break; + case PM_STR: + ret = (char *) __get_current_time_locale(loc)->pm; + break; + case DAY_1: case DAY_2: case DAY_3: + case DAY_4: case DAY_5: case DAY_6: case DAY_7: + ret = (char*) __get_current_time_locale(loc)->weekday[_REL(DAY_1)]; + break; + case ABDAY_1: case ABDAY_2: case ABDAY_3: + case ABDAY_4: case ABDAY_5: case ABDAY_6: case ABDAY_7: + ret = (char*) __get_current_time_locale(loc)->wday[_REL(ABDAY_1)]; + break; + case MON_1: case MON_2: case MON_3: case MON_4: + case MON_5: case MON_6: case MON_7: case MON_8: + case MON_9: case MON_10: case MON_11: case MON_12: + ret = (char*) __get_current_time_locale(loc)->month[_REL(MON_1)]; + break; + case ABMON_1: case ABMON_2: case ABMON_3: case ABMON_4: + case ABMON_5: case ABMON_6: case ABMON_7: case ABMON_8: + case ABMON_9: case ABMON_10: case ABMON_11: case ABMON_12: + ret = (char*) __get_current_time_locale(loc)->mon[_REL(ABMON_1)]; + break; + case ALTMON_1: case ALTMON_2: case ALTMON_3: case ALTMON_4: + case ALTMON_5: case ALTMON_6: case ALTMON_7: case ALTMON_8: + case ALTMON_9: case ALTMON_10: case ALTMON_11: case ALTMON_12: + ret = (char*) + __get_current_time_locale(loc)->alt_month[_REL(ALTMON_1)]; + break; + case ERA: + /* XXX: need to be implemented */ + ret = ""; + break; + case ERA_D_FMT: + /* XXX: need to be implemented */ + ret = ""; + break; + case ERA_D_T_FMT: + /* XXX: need to be implemented */ + ret = ""; + break; + case ERA_T_FMT: + /* XXX: need to be implemented */ + ret = ""; + break; + case ALT_DIGITS: + /* XXX: need to be implemented */ + ret = ""; + break; + case RADIXCHAR: + ret = (char*) __get_current_numeric_locale(loc)->decimal_point; + break; + case THOUSEP: + ret = (char*) __get_current_numeric_locale(loc)->thousands_sep; + break; + case YESEXPR: + ret = (char*) __get_current_messages_locale(loc)->yesexpr; + break; + case NOEXPR: + ret = (char*) __get_current_messages_locale(loc)->noexpr; + break; + /* + * YESSTR and NOSTR items marked with LEGACY are available, but not + * recommended by SUSv2 to be used in portable applications since + * they're subject to remove in future specification editions. + */ + case YESSTR: /* LEGACY */ + ret = (char*) __get_current_messages_locale(loc)->yesstr; + break; + case NOSTR: /* LEGACY */ + ret = (char*) __get_current_messages_locale(loc)->nostr; + break; + /* + * SUSv2 special formatted currency string + */ + case CRNCYSTR: + ret = ""; + cs = (char*) __get_current_monetary_locale(loc)->currency_symbol; + if (*cs != '\0') { + char pos = localeconv_l(loc)->p_cs_precedes; + + if (pos == localeconv_l(loc)->n_cs_precedes) { + char psn = '\0'; + + if (pos == CHAR_MAX) { + if (strcmp(cs, __get_current_monetary_locale(loc)->mon_decimal_point) == 0) + psn = '.'; + } else + psn = pos ? '-' : '+'; + if (psn != '\0') { + int clen = strlen(cs); + + if ((loc->csym = reallocf(loc->csym, clen + 2)) != NULL) { + *loc->csym = psn; + strcpy(loc->csym + 1, cs); + ret = loc->csym; + } + } + } + } + break; + case D_MD_ORDER: /* FreeBSD local extension */ + ret = (char *) __get_current_time_locale(loc)->md_order; + break; + default: + ret = ""; + } + return (ret); +} + +char * +nl_langinfo(nl_item item) +{ + return nl_langinfo_l(item, __get_locale()); +} diff --git a/lib/libc/locale/nomacros.c b/lib/libc/locale/nomacros.c new file mode 100644 index 0000000000000..66cf40e61ec9a --- /dev/null +++ b/lib/libc/locale/nomacros.c @@ -0,0 +1,18 @@ +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +/* + * Tell <ctype.h> to generate extern versions of all its inline + * functions. The extern versions get called if the system doesn't + * support inlines or the user defines _DONT_USE_CTYPE_INLINE_ + * before including <ctype.h>. + */ +#define _EXTERNALIZE_CTYPE_INLINES_ + +/* + * Also make sure <runetype.h> does not generate an inline definition + * of __getCurrentRuneLocale(). + */ +#define __RUNETYPE_INTERNAL + +#include <ctype.h> diff --git a/lib/libc/locale/none.c b/lib/libc/locale/none.c new file mode 100644 index 0000000000000..d95bb0087c094 --- /dev/null +++ b/lib/libc/locale/none.c @@ -0,0 +1,217 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright 2013 Garrett D'Amore <garrett@damore.org> + * Copyright 2010 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "@(#)none.c 8.1 (Berkeley) 6/4/93"; +#endif /* LIBC_SCCS and not lint */ +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <limits.h> +#include <runetype.h> +#include <stddef.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include "mblocal.h" + +static size_t _none_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); +static int _none_mbsinit(const mbstate_t *); +static size_t _none_mbsnrtowcs(wchar_t * __restrict dst, + const char ** __restrict src, size_t nms, size_t len, + mbstate_t * __restrict ps __unused); +static size_t _none_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); +static size_t _none_wcsnrtombs(char * __restrict, const wchar_t ** __restrict, + size_t, size_t, mbstate_t * __restrict); + +/* setup defaults */ + +int __mb_cur_max = 1; +int __mb_sb_limit = 256; /* Expected to be <= _CACHED_RUNES */ + +int +_none_init(struct xlocale_ctype *l, _RuneLocale *rl) +{ + + l->__mbrtowc = _none_mbrtowc; + l->__mbsinit = _none_mbsinit; + l->__mbsnrtowcs = _none_mbsnrtowcs; + l->__wcrtomb = _none_wcrtomb; + l->__wcsnrtombs = _none_wcsnrtombs; + l->runes = rl; + l->__mb_cur_max = 1; + l->__mb_sb_limit = 256; + return(0); +} + +static int +_none_mbsinit(const mbstate_t *ps __unused) +{ + + /* + * Encoding is not state dependent - we are always in the + * initial state. + */ + return (1); +} + +static size_t +_none_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, size_t n, + mbstate_t * __restrict ps __unused) +{ + + if (s == NULL) + /* Reset to initial shift state (no-op) */ + return (0); + if (n == 0) + /* Incomplete multibyte sequence */ + return ((size_t)-2); + if (pwc != NULL) + *pwc = (unsigned char)*s; + return (*s == '\0' ? 0 : 1); +} + +static size_t +_none_wcrtomb(char * __restrict s, wchar_t wc, + mbstate_t * __restrict ps __unused) +{ + + if (s == NULL) + /* Reset to initial shift state (no-op) */ + return (1); + if (wc < 0 || wc > UCHAR_MAX) { + errno = EILSEQ; + return ((size_t)-1); + } + *s = (unsigned char)wc; + return (1); +} + +static size_t +_none_mbsnrtowcs(wchar_t * __restrict dst, const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps __unused) +{ + const char *s; + size_t nchr; + + if (dst == NULL) { + s = memchr(*src, '\0', nms); + return (s != NULL ? s - *src : nms); + } + + s = *src; + nchr = 0; + while (len-- > 0 && nms-- > 0) { + if ((*dst++ = (unsigned char)*s++) == L'\0') { + *src = NULL; + return (nchr); + } + nchr++; + } + *src = s; + return (nchr); +} + +static size_t +_none_wcsnrtombs(char * __restrict dst, const wchar_t ** __restrict src, + size_t nwc, size_t len, mbstate_t * __restrict ps __unused) +{ + const wchar_t *s; + size_t nchr; + + if (dst == NULL) { + for (s = *src; nwc > 0 && *s != L'\0'; s++, nwc--) { + if (*s < 0 || *s > UCHAR_MAX) { + errno = EILSEQ; + return ((size_t)-1); + } + } + return (s - *src); + } + + s = *src; + nchr = 0; + while (len-- > 0 && nwc-- > 0) { + if (*s < 0 || *s > UCHAR_MAX) { + *src = s; + errno = EILSEQ; + return ((size_t)-1); + } + if ((*dst++ = *s++) == '\0') { + *src = NULL; + return (nchr); + } + nchr++; + } + *src = s; + return (nchr); +} + +/* setup defaults */ + +struct xlocale_ctype __xlocale_global_ctype = { + {{0}, "C"}, + (_RuneLocale*)&_DefaultRuneLocale, + _none_mbrtowc, + _none_mbsinit, + _none_mbsnrtowcs, + _none_wcrtomb, + _none_wcsnrtombs, + 1, /* __mb_cur_max, */ + 256 /* __mb_sb_limit */ +}; + +struct xlocale_ctype __xlocale_C_ctype = { + {{0}, "C"}, + (_RuneLocale*)&_DefaultRuneLocale, + _none_mbrtowc, + _none_mbsinit, + _none_mbsnrtowcs, + _none_wcrtomb, + _none_wcsnrtombs, + 1, /* __mb_cur_max, */ + 256 /* __mb_sb_limit */ +}; diff --git a/lib/libc/locale/querylocale.3 b/lib/libc/locale/querylocale.3 new file mode 100644 index 0000000000000..d1bb688ed907d --- /dev/null +++ b/lib/libc/locale/querylocale.3 @@ -0,0 +1,54 @@ +.\" Copyright (c) 2011 The FreeBSD Foundation +.\" All rights reserved. +.\" +.\" This documentation was written by David Chisnall under sponsorship from +.\" the FreeBSD Foundation. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd May 3, 2013 +.Dt QUERYLOCALE 3 +.Os +.Sh NAME +.Nm querylocale +.Nd Look up the locale name for a specified category +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In locale.h +.Ft const char * +.Fn querylocale "int mask" "locale_t locale" +.Sh DESCRIPTION +Returns the name of the locale for the category specified by +.Fa mask . +This possible values for the mask are the same as those in +.Xr newlocale 3 . +If more than one bit in the mask is set, the returned value is undefined. +.Sh SEE ALSO +.Xr duplocale 3 , +.Xr freelocale 3 , +.Xr localeconv 3 , +.Xr newlocale 3 , +.Xr uselocale 3 , +.Xr xlocale 3 diff --git a/lib/libc/locale/rpmatch.3 b/lib/libc/locale/rpmatch.3 new file mode 100644 index 0000000000000..b34c5bfa199a9 --- /dev/null +++ b/lib/libc/locale/rpmatch.3 @@ -0,0 +1,66 @@ +.\" +.\" Copyright (c) 2005 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd July 21, 2005 +.Dt RPMATCH 3 +.Os +.Sh NAME +.Nm rpmatch +.Nd "determine whether the response to a question is affirmative or negative" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In stdlib.h +.Ft int +.Fn rpmatch "const char *response" +.Sh DESCRIPTION +The +.Fn rpmatch +function determines whether the +.Fa response +argument is an affirmative or negative response to a question +according to the current locale. +.Sh RETURN VALUES +The +.Fn rpmatch +functions returns: +.Bl -tag -width indent +.It 1 +The response is affirmative. +.It 0 +The response is negative. +.It \&-1 +The response is not recognized. +.El +.Sh SEE ALSO +.Xr nl_langinfo 3 , +.Xr setlocale 3 +.Sh HISTORY +The +.Fn rpmatch +function appeared in +.Fx 6.0 . diff --git a/lib/libc/locale/rpmatch.c b/lib/libc/locale/rpmatch.c new file mode 100644 index 0000000000000..e4c366a00afb5 --- /dev/null +++ b/lib/libc/locale/rpmatch.c @@ -0,0 +1,57 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2004-2005 Tim J. Robbins. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <langinfo.h> +#include <regex.h> +#include <stdlib.h> + +int +rpmatch(const char *response) +{ + regex_t yes, no; + int ret; + + if (regcomp(&yes, nl_langinfo(YESEXPR), REG_EXTENDED|REG_NOSUB) != 0) + return (-1); + if (regcomp(&no, nl_langinfo(NOEXPR), REG_EXTENDED|REG_NOSUB) != 0) { + regfree(&yes); + return (-1); + } + if (regexec(&yes, response, 0, NULL, 0) == 0) + ret = 1; + else if (regexec(&no, response, 0, NULL, 0) == 0) + ret = 0; + else + ret = -1; + regfree(&yes); + regfree(&no); + return (ret); +} diff --git a/lib/libc/locale/rune.c b/lib/libc/locale/rune.c new file mode 100644 index 0000000000000..03fe903f74329 --- /dev/null +++ b/lib/libc/locale/rune.c @@ -0,0 +1,254 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright 2014 Garrett D'Amore <garrett@damore.org> + * Copyright 2010 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "@(#)rune.c 8.1 (Berkeley) 6/4/93"; +#endif /* LIBC_SCCS and not lint */ +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include "namespace.h" +#include <arpa/inet.h> +#include <errno.h> +#include <runetype.h> +#include <stdio.h> +#include <string.h> +#include <stdlib.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <sys/mman.h> +#include <fcntl.h> +#include <unistd.h> +#include "un-namespace.h" + +#include "endian.h" +#include "runefile.h" + +_RuneLocale * +_Read_RuneMagi(const char *fname) +{ + char *fdata, *data; + void *lastp; + _FileRuneLocale *frl; + _RuneLocale *rl; + _FileRuneEntry *frr; + _RuneEntry *rr; + struct stat sb; + int x, saverr; + void *variable; + _FileRuneEntry *runetype_ext_ranges; + _FileRuneEntry *maplower_ext_ranges; + _FileRuneEntry *mapupper_ext_ranges; + int runetype_ext_len = 0; + int fd; + + if ((fd = _open(fname, O_RDONLY)) < 0) { + errno = EINVAL; + return (NULL); + } + + if (_fstat(fd, &sb) < 0) { + (void) _close(fd); + errno = EINVAL; + return (NULL); + } + + if ((size_t)sb.st_size < sizeof (_FileRuneLocale)) { + (void) _close(fd); + errno = EINVAL; + return (NULL); + } + + + fdata = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0); + (void) _close(fd); + if (fdata == NULL) { + errno = EINVAL; + return (NULL); + } + + frl = (_FileRuneLocale *)(void *)fdata; + lastp = fdata + sb.st_size; + + variable = frl + 1; + + if (memcmp(frl->magic, _FILE_RUNE_MAGIC_1, sizeof (frl->magic))) { + goto invalid; + } + + runetype_ext_ranges = (_FileRuneEntry *)variable; + variable = runetype_ext_ranges + BSWAP(frl->runetype_ext_nranges); + if (variable > lastp) { + goto invalid; + } + + maplower_ext_ranges = (_FileRuneEntry *)variable; + variable = maplower_ext_ranges + BSWAP(frl->maplower_ext_nranges); + if (variable > lastp) { + goto invalid; + } + + mapupper_ext_ranges = (_FileRuneEntry *)variable; + variable = mapupper_ext_ranges + BSWAP(frl->mapupper_ext_nranges); + if (variable > lastp) { + goto invalid; + } + + frr = runetype_ext_ranges; + for (x = 0; x < BSWAP(frl->runetype_ext_nranges); ++x) { + uint32_t *types; + + if (BSWAP(frr[x].map) == 0) { + int len = BSWAP(frr[x].max) - BSWAP(frr[x].min) + 1; + types = variable; + variable = types + len; + runetype_ext_len += len; + if (variable > lastp) { + goto invalid; + } + } + } + + if ((char *)variable + BSWAP(frl->variable_len) > (char *)lastp) { + goto invalid; + } + + /* + * Convert from disk format to host format. + */ + data = malloc(sizeof(_RuneLocale) + + (BSWAP(frl->runetype_ext_nranges) + BSWAP(frl->maplower_ext_nranges) + + BSWAP(frl->mapupper_ext_nranges)) * sizeof(_RuneEntry) + + runetype_ext_len * sizeof(*rr->__types) + + BSWAP(frl->variable_len)); + if (data == NULL) { + saverr = errno; + munmap(fdata, sb.st_size); + errno = saverr; + return (NULL); + } + + rl = (_RuneLocale *)data; + rl->__variable = rl + 1; + + memcpy(rl->__magic, _RUNE_MAGIC_1, sizeof(rl->__magic)); + memcpy(rl->__encoding, frl->encoding, sizeof(rl->__encoding)); + + rl->__variable_len = BSWAP(frl->variable_len); + rl->__runetype_ext.__nranges = BSWAP(frl->runetype_ext_nranges); + rl->__maplower_ext.__nranges = BSWAP(frl->maplower_ext_nranges); + rl->__mapupper_ext.__nranges = BSWAP(frl->mapupper_ext_nranges); + + for (x = 0; x < _CACHED_RUNES; ++x) { + rl->__runetype[x] = BSWAP(frl->runetype[x]); + rl->__maplower[x] = BSWAP(frl->maplower[x]); + rl->__mapupper[x] = BSWAP(frl->mapupper[x]); + } + + rl->__runetype_ext.__ranges = (_RuneEntry *)rl->__variable; + rl->__variable = rl->__runetype_ext.__ranges + + rl->__runetype_ext.__nranges; + + rl->__maplower_ext.__ranges = (_RuneEntry *)rl->__variable; + rl->__variable = rl->__maplower_ext.__ranges + + rl->__maplower_ext.__nranges; + + rl->__mapupper_ext.__ranges = (_RuneEntry *)rl->__variable; + rl->__variable = rl->__mapupper_ext.__ranges + + rl->__mapupper_ext.__nranges; + + variable = mapupper_ext_ranges + BSWAP(frl->mapupper_ext_nranges); + frr = runetype_ext_ranges; + rr = rl->__runetype_ext.__ranges; + for (x = 0; x < rl->__runetype_ext.__nranges; ++x) { + uint32_t *types; + + rr[x].__min = BSWAP(frr[x].min); + rr[x].__max = BSWAP(frr[x].max); + rr[x].__map = BSWAP(frr[x].map); + if (rr[x].__map == 0) { + int len = rr[x].__max - rr[x].__min + 1; + types = variable; + variable = types + len; + rr[x].__types = rl->__variable; + rl->__variable = rr[x].__types + len; + while (len-- > 0) + rr[x].__types[len] = types[len]; + } else + rr[x].__types = NULL; + } + + frr = maplower_ext_ranges; + rr = rl->__maplower_ext.__ranges; + for (x = 0; x < rl->__maplower_ext.__nranges; ++x) { + rr[x].__min = BSWAP(frr[x].min); + rr[x].__max = BSWAP(frr[x].max); + rr[x].__map = BSWAP(frr[x].map); + } + + frr = mapupper_ext_ranges; + rr = rl->__mapupper_ext.__ranges; + for (x = 0; x < rl->__mapupper_ext.__nranges; ++x) { + rr[x].__min = BSWAP(frr[x].min); + rr[x].__max = BSWAP(frr[x].max); + rr[x].__map = BSWAP(frr[x].map); + } + + memcpy(rl->__variable, variable, rl->__variable_len); + munmap(fdata, sb.st_size); + + /* + * Go out and zero pointers that should be zero. + */ + if (!rl->__variable_len) + rl->__variable = NULL; + + if (!rl->__runetype_ext.__nranges) + rl->__runetype_ext.__ranges = NULL; + + if (!rl->__maplower_ext.__nranges) + rl->__maplower_ext.__ranges = NULL; + + if (!rl->__mapupper_ext.__nranges) + rl->__mapupper_ext.__ranges = NULL; + + return (rl); + +invalid: + munmap(fdata, sb.st_size); + errno = EINVAL; + return (NULL); +} diff --git a/lib/libc/locale/runefile.h b/lib/libc/locale/runefile.h new file mode 100644 index 0000000000000..18a603c03e312 --- /dev/null +++ b/lib/libc/locale/runefile.h @@ -0,0 +1,63 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2005 Ruslan Ermilov + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _RUNEFILE_H_ +#define _RUNEFILE_H_ + +#include <sys/types.h> + +#ifndef _CACHED_RUNES +#define _CACHED_RUNES (1 << 8) +#endif + +typedef struct { + int32_t min; + int32_t max; + int32_t map; +} _FileRuneEntry; + +typedef struct { + char magic[8]; + char encoding[32]; + + uint32_t runetype[_CACHED_RUNES]; + int32_t maplower[_CACHED_RUNES]; + int32_t mapupper[_CACHED_RUNES]; + + int32_t runetype_ext_nranges; + int32_t maplower_ext_nranges; + int32_t mapupper_ext_nranges; + + int32_t variable_len; +} _FileRuneLocale; + +#define _FILE_RUNE_MAGIC_1 "RuneMag1" + +#endif /* !_RUNEFILE_H_ */ diff --git a/lib/libc/locale/runetype.c b/lib/libc/locale/runetype.c new file mode 100644 index 0000000000000..8fe656aded05e --- /dev/null +++ b/lib/libc/locale/runetype.c @@ -0,0 +1,91 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <ctype.h> +#include <stdio.h> +#include <runetype.h> +#include <wchar.h> +#include "mblocal.h" + +unsigned long +___runetype_l(__ct_rune_t c, locale_t locale) +{ + size_t lim; + FIX_LOCALE(locale); + _RuneRange *rr = &(XLOCALE_CTYPE(locale)->runes->__runetype_ext); + _RuneEntry *base, *re; + + if (c < 0 || c == EOF) + return(0L); + + /* Binary search -- see bsearch.c for explanation. */ + base = rr->__ranges; + for (lim = rr->__nranges; lim != 0; lim >>= 1) { + re = base + (lim >> 1); + if (re->__min <= c && c <= re->__max) { + if (re->__types) + return(re->__types[c - re->__min]); + else + return(re->__map); + } else if (c > re->__max) { + base = re + 1; + lim--; + } + } + + return(0L); +} +unsigned long +___runetype(__ct_rune_t c) +{ + return ___runetype_l(c, __get_locale()); +} + +int ___mb_cur_max(void) +{ + return XLOCALE_CTYPE(__get_locale())->__mb_cur_max; +} +int ___mb_cur_max_l(locale_t locale) +{ + FIX_LOCALE(locale); + return XLOCALE_CTYPE(locale)->__mb_cur_max; +} diff --git a/lib/libc/locale/setlocale.3 b/lib/libc/locale/setlocale.3 new file mode 100644 index 0000000000000..70615ad2675ee --- /dev/null +++ b/lib/libc/locale/setlocale.3 @@ -0,0 +1,173 @@ +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" Donn Seeley at BSDI. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)setlocale.3 8.1 (Berkeley) 6/9/93 +.\" $FreeBSD$ +.\" +.Dd November 21, 2003 +.Dt SETLOCALE 3 +.Os +.Sh NAME +.Nm setlocale +.Nd natural language formatting for C +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In locale.h +.Ft char * +.Fn setlocale "int category" "const char *locale" +.Sh DESCRIPTION +The +.Fn setlocale +function sets the C library's notion +of natural language formatting style +for particular sets of routines. +Each such style is called a +.Sq locale +and is invoked using an appropriate name passed as a C string. +.Pp +The +.Fn setlocale +function recognizes several categories of routines. +These are the categories and the sets of routines they select: +.Bl -tag -width LC_MONETARY +.It Dv LC_ALL +Set the entire locale generically. +.It Dv LC_COLLATE +Set a locale for string collation routines. +This controls alphabetic ordering in +.Fn strcoll +and +.Fn strxfrm . +.It Dv LC_CTYPE +Set a locale for the +.Xr ctype 3 +and +.Xr multibyte 3 +functions. +This controls recognition of upper and lower case, +alphabetic or non-alphabetic characters, +and so on. +.It Dv LC_MESSAGES +Set a locale for message catalogs, see +.Xr catopen 3 +function. +.It Dv LC_MONETARY +Set a locale for formatting monetary values; +this affects the +.Fn localeconv +function. +.It Dv LC_NUMERIC +Set a locale for formatting numbers. +This controls the formatting of decimal points +in input and output of floating point numbers +in functions such as +.Fn printf +and +.Fn scanf , +as well as values returned by +.Fn localeconv . +.It Dv LC_TIME +Set a locale for formatting dates and times using the +.Fn strftime +function. +.El +.Pp +Only three locales are defined by default, +the empty string +.Li \&"\|" +which denotes the native environment, and the +.Li \&"C" +and +.Li \&"POSIX" +locales, which denote the C language environment. +A +.Fa locale +argument of +.Dv NULL +causes +.Fn setlocale +to return the current locale. +By default, C programs start in the +.Li \&"C" +locale. +The only function in the library that sets the locale is +.Fn setlocale ; +the locale is never changed as a side effect of some other routine. +.Sh RETURN VALUES +Upon successful completion, +.Fn setlocale +returns the string associated with the specified +.Fa category +for the requested +.Fa locale . +The +.Fn setlocale +function returns +.Dv NULL +and fails to change the locale +if the given combination of +.Fa category +and +.Fa locale +makes no sense. +.Sh FILES +.Bl -tag -width /usr/share/locale/locale/category -compact +.It Pa $PATH_LOCALE/ Ns Em locale/category +.It Pa /usr/share/locale/ Ns Em locale/category +locale file for the locale +.Em locale +and the category +.Em category . +.El +.Sh ERRORS +No errors are defined. +.Sh SEE ALSO +.Xr colldef 1 , +.Xr mklocale 1 , +.Xr catopen 3 , +.Xr ctype 3 , +.Xr localeconv 3 , +.Xr multibyte 3 , +.Xr strcoll 3 , +.Xr strxfrm 3 , +.Xr euc 5 , +.Xr utf8 5 , +.Xr environ 7 +.Sh STANDARDS +The +.Fn setlocale +function conforms to +.St -isoC-99 . +.Sh HISTORY +The +.Fn setlocale +function first appeared in +.Bx 4.4 . diff --git a/lib/libc/locale/setlocale.c b/lib/libc/locale/setlocale.c new file mode 100644 index 0000000000000..e0ba66e0e35ac --- /dev/null +++ b/lib/libc/locale/setlocale.c @@ -0,0 +1,328 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1996 - 2002 FreeBSD Project + * Copyright (c) 1991, 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "@(#)setlocale.c 8.1 (Berkeley) 7/4/93"; +#endif /* LIBC_SCCS and not lint */ +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/types.h> +#include <sys/stat.h> +#include <errno.h> +#include <limits.h> +#include <locale.h> +#include <paths.h> /* for _PATH_LOCALE */ +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include "collate.h" +#include "lmonetary.h" /* for __monetary_load_locale() */ +#include "lnumeric.h" /* for __numeric_load_locale() */ +#include "lmessages.h" /* for __messages_load_locale() */ +#include "setlocale.h" +#include "ldpart.h" +#include "../stdtime/timelocal.h" /* for __time_load_locale() */ + +/* + * Category names for getenv() + */ +static const char categories[_LC_LAST][12] = { + "LC_ALL", + "LC_COLLATE", + "LC_CTYPE", + "LC_MONETARY", + "LC_NUMERIC", + "LC_TIME", + "LC_MESSAGES", +}; + +/* + * Current locales for each category + */ +static char current_categories[_LC_LAST][ENCODING_LEN + 1] = { + "C", + "C", + "C", + "C", + "C", + "C", + "C", +}; + +/* + * Path to locale storage directory + */ +char *_PathLocale; + +/* + * The locales we are going to try and load + */ +static char new_categories[_LC_LAST][ENCODING_LEN + 1]; +static char saved_categories[_LC_LAST][ENCODING_LEN + 1]; + +static char current_locale_string[_LC_LAST * (ENCODING_LEN + 1/*"/"*/ + 1)]; + +static char *currentlocale(void); +static char *loadlocale(int); +const char *__get_locale_env(int); + +char * +setlocale(int category, const char *locale) +{ + int i, j, len, saverr; + const char *env, *r; + + if (category < LC_ALL || category >= _LC_LAST) { + errno = EINVAL; + return (NULL); + } + if (locale == NULL) + return (category != LC_ALL ? + current_categories[category] : currentlocale()); + + /* + * Default to the current locale for everything. + */ + for (i = 1; i < _LC_LAST; ++i) + (void)strcpy(new_categories[i], current_categories[i]); + + /* + * Now go fill up new_categories from the locale argument + */ + if (!*locale) { + if (category == LC_ALL) { + for (i = 1; i < _LC_LAST; ++i) { + env = __get_locale_env(i); + if (strlen(env) > ENCODING_LEN) { + errno = EINVAL; + return (NULL); + } + (void)strcpy(new_categories[i], env); + } + } else { + env = __get_locale_env(category); + if (strlen(env) > ENCODING_LEN) { + errno = EINVAL; + return (NULL); + } + (void)strcpy(new_categories[category], env); + } + } else if (category != LC_ALL) { + if (strlen(locale) > ENCODING_LEN) { + errno = EINVAL; + return (NULL); + } + (void)strcpy(new_categories[category], locale); + } else { + if ((r = strchr(locale, '/')) == NULL) { + if (strlen(locale) > ENCODING_LEN) { + errno = EINVAL; + return (NULL); + } + for (i = 1; i < _LC_LAST; ++i) + (void)strcpy(new_categories[i], locale); + } else { + for (i = 1; r[1] == '/'; ++r) + ; + if (!r[1]) { + errno = EINVAL; + return (NULL); /* Hmm, just slashes... */ + } + do { + if (i == _LC_LAST) + break; /* Too many slashes... */ + if ((len = r - locale) > ENCODING_LEN) { + errno = EINVAL; + return (NULL); + } + (void)strlcpy(new_categories[i], locale, + len + 1); + i++; + while (*r == '/') + r++; + locale = r; + while (*r && *r != '/') + r++; + } while (*locale); + while (i < _LC_LAST) { + (void)strcpy(new_categories[i], + new_categories[i - 1]); + i++; + } + } + } + + if (category != LC_ALL) + return (loadlocale(category)); + + for (i = 1; i < _LC_LAST; ++i) { + (void)strcpy(saved_categories[i], current_categories[i]); + if (loadlocale(i) == NULL) { + saverr = errno; + for (j = 1; j < i; j++) { + (void)strcpy(new_categories[j], + saved_categories[j]); + if (loadlocale(j) == NULL) { + (void)strcpy(new_categories[j], "C"); + (void)loadlocale(j); + } + } + errno = saverr; + return (NULL); + } + } + return (currentlocale()); +} + +static char * +currentlocale(void) +{ + int i; + + (void)strcpy(current_locale_string, current_categories[1]); + + for (i = 2; i < _LC_LAST; ++i) + if (strcmp(current_categories[1], current_categories[i])) { + for (i = 2; i < _LC_LAST; ++i) { + (void)strcat(current_locale_string, "/"); + (void)strcat(current_locale_string, + current_categories[i]); + } + break; + } + return (current_locale_string); +} + +static char * +loadlocale(int category) +{ + char *new = new_categories[category]; + char *old = current_categories[category]; + int (*func) (const char *); + int saved_errno; + + if ((new[0] == '.' && + (new[1] == '\0' || (new[1] == '.' && new[2] == '\0'))) || + strchr(new, '/') != NULL) { + errno = EINVAL; + return (NULL); + } + saved_errno = errno; + errno = __detect_path_locale(); + if (errno != 0) + return (NULL); + errno = saved_errno; + + switch (category) { + case LC_CTYPE: + func = __wrap_setrunelocale; + break; + case LC_COLLATE: + func = __collate_load_tables; + break; + case LC_TIME: + func = __time_load_locale; + break; + case LC_NUMERIC: + func = __numeric_load_locale; + break; + case LC_MONETARY: + func = __monetary_load_locale; + break; + case LC_MESSAGES: + func = __messages_load_locale; + break; + default: + errno = EINVAL; + return (NULL); + } + + if (strcmp(new, old) == 0) + return (old); + + if (func(new) != _LDP_ERROR) { + (void)strcpy(old, new); + (void)strcpy(__xlocale_global_locale.components[category-1]->locale, new); + return (old); + } + + return (NULL); +} + +const char * +__get_locale_env(int category) +{ + const char *env; + + /* 1. check LC_ALL. */ + env = getenv(categories[0]); + + /* 2. check LC_* */ + if (env == NULL || !*env) + env = getenv(categories[category]); + + /* 3. check LANG */ + if (env == NULL || !*env) + env = getenv("LANG"); + + /* 4. if none is set, fall to "C" */ + if (env == NULL || !*env) + env = "C"; + + return (env); +} + +/* + * Detect locale storage location and store its value to _PathLocale variable + */ +int +__detect_path_locale(void) +{ + if (_PathLocale == NULL) { + char *p = getenv("PATH_LOCALE"); + + if (p != NULL && !issetugid()) { + if (strlen(p) + 1/*"/"*/ + ENCODING_LEN + + 1/*"/"*/ + CATEGORY_LEN >= PATH_MAX) + return (ENAMETOOLONG); + _PathLocale = strdup(p); + if (_PathLocale == NULL) + return (errno == 0 ? ENOMEM : errno); + } else + _PathLocale = _PATH_LOCALE; + } + return (0); +} diff --git a/lib/libc/locale/setlocale.h b/lib/libc/locale/setlocale.h new file mode 100644 index 0000000000000..c199cdf62dafd --- /dev/null +++ b/lib/libc/locale/setlocale.h @@ -0,0 +1,42 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (C) 1997 by Andrey A. Chernov, Moscow, Russia. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _SETLOCALE_H_ +#define _SETLOCALE_H_ + +#define ENCODING_LEN 31 +#define CATEGORY_LEN 11 + +extern char *_PathLocale; + +int __detect_path_locale(void); +int __wrap_setrunelocale(const char *); + +#endif /* !_SETLOCALE_H_ */ diff --git a/lib/libc/locale/setrunelocale.c b/lib/libc/locale/setrunelocale.c new file mode 100644 index 0000000000000..97af903f27242 --- /dev/null +++ b/lib/libc/locale/setrunelocale.c @@ -0,0 +1,213 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#define __RUNETYPE_INTERNAL 1 + +#include <runetype.h> +#include <errno.h> +#include <limits.h> +#include <string.h> +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> +#include <wchar.h> +#include "ldpart.h" +#include "mblocal.h" +#include "setlocale.h" + +#undef _CurrentRuneLocale +extern _RuneLocale const *_CurrentRuneLocale; +#ifndef __NO_TLS +/* + * A cached version of the runes for this thread. Used by ctype.h + */ +_Thread_local const _RuneLocale *_ThreadRuneLocale; +#endif + +extern int __mb_sb_limit; + +extern _RuneLocale *_Read_RuneMagi(const char *); + +static int __setrunelocale(struct xlocale_ctype *l, const char *); + +static void +destruct_ctype(void *v) +{ + struct xlocale_ctype *l = v; + + if (&_DefaultRuneLocale != l->runes) + free(l->runes); + free(l); +} + +const _RuneLocale * +__getCurrentRuneLocale(void) +{ + + return (XLOCALE_CTYPE(__get_locale())->runes); +} + +static void +free_runes(_RuneLocale *rl) +{ + if ((rl != &_DefaultRuneLocale) && (rl)) { + free(rl); + } +} + +static int +__setrunelocale(struct xlocale_ctype *l, const char *encoding) +{ + _RuneLocale *rl; + int ret; + char *path; + struct xlocale_ctype saved = *l; + + /* + * The "C" and "POSIX" locale are always here. + */ + if (strcmp(encoding, "C") == 0 || strcmp(encoding, "POSIX") == 0) { + free_runes(saved.runes); + (void) _none_init(l, (_RuneLocale*)&_DefaultRuneLocale); + return (0); + } + + /* Range checking not needed, encoding length already checked before */ + if (asprintf(&path, "%s/%s/LC_CTYPE", _PathLocale, encoding) == -1) + return (errno); + + if ((rl = _Read_RuneMagi(path)) == NULL) { + free(path); + errno = EINVAL; + return (errno); + } + free(path); + + l->__mbrtowc = NULL; + l->__mbsinit = NULL; + l->__mbsnrtowcs = NULL; + l->__wcrtomb = NULL; + l->__wcsnrtombs = NULL; + + rl->__sputrune = NULL; + rl->__sgetrune = NULL; + if (strcmp(rl->__encoding, "NONE:US-ASCII") == 0) + ret = _ascii_init(l, rl); + else if (strncmp(rl->__encoding, "NONE", 4) == 0) + ret = _none_init(l, rl); + else if (strcmp(rl->__encoding, "UTF-8") == 0) + ret = _UTF8_init(l, rl); + else if (strcmp(rl->__encoding, "EUC-CN") == 0) + ret = _EUC_CN_init(l, rl); + else if (strcmp(rl->__encoding, "EUC-JP") == 0) + ret = _EUC_JP_init(l, rl); + else if (strcmp(rl->__encoding, "EUC-KR") == 0) + ret = _EUC_KR_init(l, rl); + else if (strcmp(rl->__encoding, "EUC-TW") == 0) + ret = _EUC_TW_init(l, rl); + else if (strcmp(rl->__encoding, "GB18030") == 0) + ret = _GB18030_init(l, rl); + else if (strcmp(rl->__encoding, "GB2312") == 0) + ret = _GB2312_init(l, rl); + else if (strcmp(rl->__encoding, "GBK") == 0) + ret = _GBK_init(l, rl); + else if (strcmp(rl->__encoding, "BIG5") == 0) + ret = _BIG5_init(l, rl); + else if (strcmp(rl->__encoding, "MSKanji") == 0) + ret = _MSKanji_init(l, rl); + else + ret = EFTYPE; + + if (ret == 0) { + /* Free the old runes if it exists. */ + free_runes(saved.runes); + } else { + /* Restore the saved version if this failed. */ + memcpy(l, &saved, sizeof(struct xlocale_ctype)); + free(rl); + } + + return (ret); +} + +int +__wrap_setrunelocale(const char *locale) +{ + int ret = __setrunelocale(&__xlocale_global_ctype, locale); + + if (ret != 0) { + errno = ret; + return (_LDP_ERROR); + } + __mb_cur_max = __xlocale_global_ctype.__mb_cur_max; + __mb_sb_limit = __xlocale_global_ctype.__mb_sb_limit; + _CurrentRuneLocale = __xlocale_global_ctype.runes; + return (_LDP_LOADED); +} + +#ifndef __NO_TLS +void +__set_thread_rune_locale(locale_t loc) +{ + + if (loc == NULL) { + _ThreadRuneLocale = &_DefaultRuneLocale; + } else if (loc == LC_GLOBAL_LOCALE) { + _ThreadRuneLocale = 0; + } else { + _ThreadRuneLocale = XLOCALE_CTYPE(loc)->runes; + } +} +#endif + +void * +__ctype_load(const char *locale, locale_t unused __unused) +{ + struct xlocale_ctype *l = calloc(sizeof(struct xlocale_ctype), 1); + + l->header.header.destructor = destruct_ctype; + if (__setrunelocale(l, locale)) { + free(l); + return (NULL); + } + return (l); +} diff --git a/lib/libc/locale/table.c b/lib/libc/locale/table.c new file mode 100644 index 0000000000000..2f790afd4dd89 --- /dev/null +++ b/lib/libc/locale/table.c @@ -0,0 +1,265 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "@(#)table.c 8.1 (Berkeley) 6/27/93"; +#endif /* LIBC_SCCS and not lint */ +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <ctype.h> +#include <runetype.h> +#include <wchar.h> +#include "mblocal.h" + +const _RuneLocale _DefaultRuneLocale = { + _RUNE_MAGIC_1, + "NONE", + NULL, + NULL, + 0xFFFD, + + { /*00*/ _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + /*08*/ _CTYPE_C, + _CTYPE_C|_CTYPE_S|_CTYPE_B, + _CTYPE_C|_CTYPE_S, + _CTYPE_C|_CTYPE_S, + _CTYPE_C|_CTYPE_S, + _CTYPE_C|_CTYPE_S, + _CTYPE_C, + _CTYPE_C, + /*10*/ _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + /*18*/ _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + _CTYPE_C, + /*20*/ _CTYPE_S|_CTYPE_B|_CTYPE_R, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + /*28*/ _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + /*30*/ _CTYPE_D|_CTYPE_R|_CTYPE_G|_CTYPE_X|_CTYPE_N|0, + _CTYPE_D|_CTYPE_R|_CTYPE_G|_CTYPE_X|_CTYPE_N|1, + _CTYPE_D|_CTYPE_R|_CTYPE_G|_CTYPE_X|_CTYPE_N|2, + _CTYPE_D|_CTYPE_R|_CTYPE_G|_CTYPE_X|_CTYPE_N|3, + _CTYPE_D|_CTYPE_R|_CTYPE_G|_CTYPE_X|_CTYPE_N|4, + _CTYPE_D|_CTYPE_R|_CTYPE_G|_CTYPE_X|_CTYPE_N|5, + _CTYPE_D|_CTYPE_R|_CTYPE_G|_CTYPE_X|_CTYPE_N|6, + _CTYPE_D|_CTYPE_R|_CTYPE_G|_CTYPE_X|_CTYPE_N|7, + /*38*/ _CTYPE_D|_CTYPE_R|_CTYPE_G|_CTYPE_X|_CTYPE_N|8, + _CTYPE_D|_CTYPE_R|_CTYPE_G|_CTYPE_X|_CTYPE_N|9, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + /*40*/ _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_U|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|10, + _CTYPE_U|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|11, + _CTYPE_U|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|12, + _CTYPE_U|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|13, + _CTYPE_U|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|14, + _CTYPE_U|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|15, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + /*48*/ _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + /*50*/ _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + /*58*/ _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_U|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + /*60*/ _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_L|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|10, + _CTYPE_L|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|11, + _CTYPE_L|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|12, + _CTYPE_L|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|13, + _CTYPE_L|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|14, + _CTYPE_L|_CTYPE_X|_CTYPE_R|_CTYPE_G|_CTYPE_A|15, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + /*68*/ _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + /*70*/ _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + /*78*/ _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_L|_CTYPE_R|_CTYPE_G|_CTYPE_A, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_P|_CTYPE_R|_CTYPE_G, + _CTYPE_C, + }, + { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, + 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, + 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f, + 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, + 0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f, + 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, + 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f, + 0x40, 'a', 'b', 'c', 'd', 'e', 'f', 'g', + 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', + 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', + 'x', 'y', 'z', 0x5b, 0x5c, 0x5d, 0x5e, 0x5f, + 0x60, 'a', 'b', 'c', 'd', 'e', 'f', 'g', + 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', + 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', + 'x', 'y', 'z', 0x7b, 0x7c, 0x7d, 0x7e, 0x7f, + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f, + 0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7, + 0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf, + 0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7, + 0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf, + 0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7, + 0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf, + 0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7, + 0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf, + 0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7, + 0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef, + 0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7, + 0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff, + }, + { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, + 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, + 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f, + 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, + 0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f, + 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, + 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f, + 0x40, 'A', 'B', 'C', 'D', 'E', 'F', 'G', + 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', + 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', + 'X', 'Y', 'Z', 0x5b, 0x5c, 0x5d, 0x5e, 0x5f, + 0x60, 'A', 'B', 'C', 'D', 'E', 'F', 'G', + 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', + 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', + 'X', 'Y', 'Z', 0x7b, 0x7c, 0x7d, 0x7e, 0x7f, + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f, + 0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7, + 0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf, + 0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7, + 0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf, + 0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7, + 0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf, + 0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7, + 0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf, + 0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7, + 0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef, + 0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7, + 0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff, + }, +}; + +#undef _CurrentRuneLocale +const _RuneLocale *_CurrentRuneLocale = &_DefaultRuneLocale; + +_RuneLocale * +__runes_for_locale(locale_t locale, int *mb_sb_limit) +{ + FIX_LOCALE(locale); + struct xlocale_ctype *c = XLOCALE_CTYPE(locale); + *mb_sb_limit = c->__mb_sb_limit; + return c->runes; +} diff --git a/lib/libc/locale/toascii.3 b/lib/libc/locale/toascii.3 new file mode 100644 index 0000000000000..0a11a017b3aea --- /dev/null +++ b/lib/libc/locale/toascii.3 @@ -0,0 +1,69 @@ +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)toascii.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd June 4, 1993 +.Dt TOASCII 3 +.Os +.Sh NAME +.Nm toascii +.Nd convert a byte to 7-bit ASCII +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn toascii "int c" +.Sh DESCRIPTION +The +.Fn toascii +function strips all but the low 7 bits from a letter, +including parity or other marker bits. +.Sh RETURN VALUES +The +.Fn toascii +function always returns a valid ASCII character. +.Sh SEE ALSO +.Xr digittoint 3 , +.Xr isalnum 3 , +.Xr isalpha 3 , +.Xr isascii 3 , +.Xr iscntrl 3 , +.Xr isdigit 3 , +.Xr isgraph 3 , +.Xr islower 3 , +.Xr isprint 3 , +.Xr ispunct 3 , +.Xr isspace 3 , +.Xr isupper 3 , +.Xr isxdigit 3 , +.Xr stdio 3 , +.Xr tolower 3 , +.Xr toupper 3 , +.Xr ascii 7 diff --git a/lib/libc/locale/tolower.3 b/lib/libc/locale/tolower.3 new file mode 100644 index 0000000000000..0c6dfd9349fb6 --- /dev/null +++ b/lib/libc/locale/tolower.3 @@ -0,0 +1,79 @@ +.\" Copyright (c) 1989, 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)tolower.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 25, 2010 +.Dt TOLOWER 3 +.Os +.Sh NAME +.Nm tolower +.Nd upper case to lower case letter conversion +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn tolower "int c" +.Sh DESCRIPTION +The +.Fn tolower +function converts an upper-case letter to the corresponding lower-case +letter. +The argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Sh RETURN VALUES +If the argument is an upper-case letter, the +.Fn tolower +function returns the corresponding lower-case letter if there is +one; otherwise, the argument is returned unchanged. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn towlower +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr islower 3 , +.Xr towlower 3 +.Sh STANDARDS +The +.Fn tolower +function conforms to +.St -isoC . diff --git a/lib/libc/locale/tolower.c b/lib/libc/locale/tolower.c new file mode 100644 index 0000000000000..00ac7279fbb4e --- /dev/null +++ b/lib/libc/locale/tolower.c @@ -0,0 +1,78 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <ctype.h> +#include <stdio.h> +#include <runetype.h> +#include <wchar.h> +#include "mblocal.h" + +__ct_rune_t +___tolower_l(__ct_rune_t c, locale_t l) +{ + size_t lim; + FIX_LOCALE(l); + _RuneRange *rr = &XLOCALE_CTYPE(l)->runes->__maplower_ext; + _RuneEntry *base, *re; + + if (c < 0 || c == EOF) + return(c); + + /* Binary search -- see bsearch.c for explanation. */ + base = rr->__ranges; + for (lim = rr->__nranges; lim != 0; lim >>= 1) { + re = base + (lim >> 1); + if (re->__min <= c && c <= re->__max) + return (re->__map + c - re->__min); + else if (c > re->__max) { + base = re + 1; + lim--; + } + } + + return(c); +} +__ct_rune_t +___tolower(__ct_rune_t c) +{ + return ___tolower_l(c, __get_locale()); +} diff --git a/lib/libc/locale/toupper.3 b/lib/libc/locale/toupper.3 new file mode 100644 index 0000000000000..5a3cf3fadeff6 --- /dev/null +++ b/lib/libc/locale/toupper.3 @@ -0,0 +1,79 @@ +.\" Copyright (c) 1989, 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)toupper.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd July 25, 2010 +.Dt TOUPPER 3 +.Os +.Sh NAME +.Nm toupper +.Nd lower case to upper case letter conversion +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In ctype.h +.Ft int +.Fn toupper "int c" +.Sh DESCRIPTION +The +.Fn toupper +function converts a lower-case letter to the corresponding +upper-case letter. +The argument must be representable as an +.Vt "unsigned char" +or the value of +.Dv EOF . +.Sh RETURN VALUES +If the argument is a lower-case letter, the +.Fn toupper +function returns the corresponding upper-case letter if there is +one; otherwise, the argument is returned unchanged. +.Sh COMPATIBILITY +The +.Bx 4.4 +extension of accepting arguments outside of the range of the +.Vt "unsigned char" +type in locales with large character sets is considered obsolete +and may not be supported in future releases. +The +.Fn towupper +function should be used instead. +.Sh SEE ALSO +.Xr ctype 3 , +.Xr isupper 3 , +.Xr towupper 3 +.Sh STANDARDS +The +.Fn toupper +function conforms to +.St -isoC . diff --git a/lib/libc/locale/toupper.c b/lib/libc/locale/toupper.c new file mode 100644 index 0000000000000..26bb93f3f2da1 --- /dev/null +++ b/lib/libc/locale/toupper.c @@ -0,0 +1,80 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <ctype.h> +#include <stdio.h> +#include <runetype.h> +#include <wchar.h> +#include "mblocal.h" + +__ct_rune_t +___toupper_l(__ct_rune_t c, locale_t l) +{ + size_t lim; + FIX_LOCALE(l); + _RuneRange *rr = &XLOCALE_CTYPE(l)->runes->__mapupper_ext; + _RuneEntry *base, *re; + + if (c < 0 || c == EOF) + return(c); + + /* Binary search -- see bsearch.c for explanation. */ + base = rr->__ranges; + for (lim = rr->__nranges; lim != 0; lim >>= 1) { + re = base + (lim >> 1); + if (re->__min <= c && c <= re->__max) + { + return (re->__map + c - re->__min); + } + else if (c > re->__max) { + base = re + 1; + lim--; + } + } + + return(c); +} +__ct_rune_t +___toupper(__ct_rune_t c) +{ + return ___toupper_l(c, __get_locale()); +} diff --git a/lib/libc/locale/towlower.3 b/lib/libc/locale/towlower.3 new file mode 100644 index 0000000000000..d8742aba1aace --- /dev/null +++ b/lib/libc/locale/towlower.3 @@ -0,0 +1,66 @@ +.\" Copyright (c) 1989, 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)tolower.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd October 3, 2002 +.Dt TOWLOWER 3 +.Os +.Sh NAME +.Nm towlower +.Nd "upper case to lower case letter conversion (wide character version)" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wctype.h +.Ft wint_t +.Fn towlower "wint_t wc" +.Sh DESCRIPTION +The +.Fn towlower +function converts an upper-case letter to the corresponding lower-case +letter. +.Sh RETURN VALUES +If the argument is an upper-case letter, the +.Fn towlower +function returns the corresponding lower-case letter if there is +one; otherwise the argument is returned unchanged. +.Sh SEE ALSO +.Xr iswlower 3 , +.Xr tolower 3 , +.Xr towupper 3 , +.Xr wctrans 3 +.Sh STANDARDS +The +.Fn towlower +function conforms to +.St -isoC-99 . diff --git a/lib/libc/locale/towupper.3 b/lib/libc/locale/towupper.3 new file mode 100644 index 0000000000000..f139b94a6ef6d --- /dev/null +++ b/lib/libc/locale/towupper.3 @@ -0,0 +1,66 @@ +.\" Copyright (c) 1989, 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" the American National Standards Committee X3, on Information +.\" Processing Systems. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)toupper.3 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd October 3, 2002 +.Dt TOWUPPER 3 +.Os +.Sh NAME +.Nm towupper +.Nd "lower case to upper case letter conversion (wide character version)" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wctype.h +.Ft wint_t +.Fn towupper "wint_t wc" +.Sh DESCRIPTION +The +.Fn towupper +function converts a lower-case letter to the corresponding +upper-case letter. +.Sh RETURN VALUES +If the argument is a lower-case letter, the +.Fn towupper +function returns the corresponding upper-case letter if there is +one; otherwise the argument is returned unchanged. +.Sh SEE ALSO +.Xr iswupper 3 , +.Xr toupper 3 , +.Xr towlower 3 , +.Xr wctrans 3 +.Sh STANDARDS +The +.Fn towupper +function conforms to +.St -isoC-99 . diff --git a/lib/libc/locale/uselocale.3 b/lib/libc/locale/uselocale.3 new file mode 100644 index 0000000000000..96d0008f7837a --- /dev/null +++ b/lib/libc/locale/uselocale.3 @@ -0,0 +1,60 @@ +.\" Copyright (c) 2011 The FreeBSD Foundation +.\" All rights reserved. +.\" +.\" This documentation was written by David Chisnall under sponsorship from +.\" the FreeBSD Foundation. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd September 17, 2011 +.Dt USELOCALE 3 +.Os +.Sh NAME +.Nm uselocale +.Nd Sets a thread-local locale +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In locale.h +.Ft locale_t +.Fn uselocale "locale_t locale" +.Sh DESCRIPTION +Specifies the locale for this thread to use. +Specifying +.Fa LC_GLOBAL_LOCALE +disables the per-thread locale, +while NULL returns the current locale without setting a new one. +.Sh RETURN VALUES +Returns the previous locale, +or LC_GLOBAL_LOCALE if this thread has no locale associated with it. +.Sh SEE ALSO +.Xr duplocale 3 , +.Xr freelocale 3 , +.Xr localeconv 3 , +.Xr newlocale 3 , +.Xr querylocale 3 , +.Xr xlocale 3 +.Sh STANDARDS +This function conforms to +.St -p1003.1-2008 . diff --git a/lib/libc/locale/utf8.5 b/lib/libc/locale/utf8.5 new file mode 100644 index 0000000000000..ca7f3f81e2f9b --- /dev/null +++ b/lib/libc/locale/utf8.5 @@ -0,0 +1,102 @@ +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" Paul Borman at Krystal Technologies. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)utf2.4 8.1 (Berkeley) 6/4/93 +.\" $FreeBSD$ +.\" +.Dd April 7, 2004 +.Dt UTF8 5 +.Os +.Sh NAME +.Nm utf8 +.Nd "UTF-8, a transformation format of ISO 10646" +.Sh SYNOPSIS +.Nm ENCODING +.Qq UTF-8 +.Sh DESCRIPTION +The +.Nm UTF-8 +encoding represents UCS-4 characters as a sequence of octets, using +between 1 and 6 for each character. +It is backwards compatible with +.Tn ASCII , +so 0x00-0x7f refer to the +.Tn ASCII +character set. +The multibyte encoding of +.No non- Ns Tn ASCII +characters +consist entirely of bytes whose high order bit is set. +The actual +encoding is represented by the following table: +.Bd -literal +[0x00000000 - 0x0000007f] [00000000.0bbbbbbb] -> 0bbbbbbb +[0x00000080 - 0x000007ff] [00000bbb.bbbbbbbb] -> 110bbbbb, 10bbbbbb +[0x00000800 - 0x0000ffff] [bbbbbbbb.bbbbbbbb] -> + 1110bbbb, 10bbbbbb, 10bbbbbb +[0x00010000 - 0x001fffff] [00000000.000bbbbb.bbbbbbbb.bbbbbbbb] -> + 11110bbb, 10bbbbbb, 10bbbbbb, 10bbbbbb +[0x00200000 - 0x03ffffff] [000000bb.bbbbbbbb.bbbbbbbb.bbbbbbbb] -> + 111110bb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb +[0x04000000 - 0x7fffffff] [0bbbbbbb.bbbbbbbb.bbbbbbbb.bbbbbbbb] -> + 1111110b, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb +.Ed +.Pp +If more than a single representation of a value exists (for example, +0x00; 0xC0 0x80; 0xE0 0x80 0x80) the shortest representation is always +used. +Longer ones are detected as an error as they pose a potential +security risk, and destroy the 1:1 character:octet sequence mapping. +.Sh SEE ALSO +.Xr euc 5 +.Rs +.%A "Rob Pike" +.%A "Ken Thompson" +.%T "Hello World" +.%J "Proceedings of the Winter 1993 USENIX Technical Conference" +.%Q "USENIX Association" +.%D "January 1993" +.Re +.Rs +.%A "F. Yergeau" +.%T "UTF-8, a transformation format of ISO 10646" +.%O "RFC 2279" +.%D "January 1998" +.Re +.Rs +.%Q "The Unicode Consortium" +.%T "The Unicode Standard, Version 3.0" +.%D "2000" +.%O "as amended by the Unicode Standard Annex #27: Unicode 3.1 and by the Unicode Standard Annex #28: Unicode 3.2" +.Re +.Sh STANDARDS +The +.Nm +encoding is compatible with RFC 2279 and Unicode 3.2. diff --git a/lib/libc/locale/utf8.c b/lib/libc/locale/utf8.c new file mode 100644 index 0000000000000..13f957989fa20 --- /dev/null +++ b/lib/libc/locale/utf8.c @@ -0,0 +1,426 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright 2013 Garrett D'Amore <garrett@damore.org> + * Copyright 2011 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 2002-2004 Tim J. Robbins + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/param.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <limits.h> +#include <runetype.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include "mblocal.h" + +extern int __mb_sb_limit; + +static size_t _UTF8_mbrtowc(wchar_t * __restrict, const char * __restrict, + size_t, mbstate_t * __restrict); +static int _UTF8_mbsinit(const mbstate_t *); +static size_t _UTF8_mbsnrtowcs(wchar_t * __restrict, + const char ** __restrict, size_t, size_t, + mbstate_t * __restrict); +static size_t _UTF8_wcrtomb(char * __restrict, wchar_t, + mbstate_t * __restrict); +static size_t _UTF8_wcsnrtombs(char * __restrict, const wchar_t ** __restrict, + size_t, size_t, mbstate_t * __restrict); + +typedef struct { + wchar_t ch; + int want; + wchar_t lbound; +} _UTF8State; + +int +_UTF8_init(struct xlocale_ctype *l, _RuneLocale *rl) +{ + + l->__mbrtowc = _UTF8_mbrtowc; + l->__wcrtomb = _UTF8_wcrtomb; + l->__mbsinit = _UTF8_mbsinit; + l->__mbsnrtowcs = _UTF8_mbsnrtowcs; + l->__wcsnrtombs = _UTF8_wcsnrtombs; + l->runes = rl; + l->__mb_cur_max = 4; + /* + * UCS-4 encoding used as the internal representation, so + * slots 0x0080-0x00FF are occuped and must be excluded + * from the single byte ctype by setting the limit. + */ + l->__mb_sb_limit = 128; + + return (0); +} + +static int +_UTF8_mbsinit(const mbstate_t *ps) +{ + + return (ps == NULL || ((const _UTF8State *)ps)->want == 0); +} + +static size_t +_UTF8_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, size_t n, + mbstate_t * __restrict ps) +{ + _UTF8State *us; + int ch, i, mask, want; + wchar_t lbound, wch; + + us = (_UTF8State *)ps; + + if (us->want < 0 || us->want > 6) { + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) { + s = ""; + n = 1; + pwc = NULL; + } + + if (n == 0) + /* Incomplete multibyte sequence */ + return ((size_t)-2); + + if (us->want == 0) { + /* + * Determine the number of octets that make up this character + * from the first octet, and a mask that extracts the + * interesting bits of the first octet. We already know + * the character is at least two bytes long. + * + * We also specify a lower bound for the character code to + * detect redundant, non-"shortest form" encodings. For + * example, the sequence C0 80 is _not_ a legal representation + * of the null character. This enforces a 1-to-1 mapping + * between character codes and their multibyte representations. + */ + ch = (unsigned char)*s; + if ((ch & 0x80) == 0) { + /* Fast path for plain ASCII characters. */ + if (pwc != NULL) + *pwc = ch; + return (ch != '\0' ? 1 : 0); + } + if ((ch & 0xe0) == 0xc0) { + mask = 0x1f; + want = 2; + lbound = 0x80; + } else if ((ch & 0xf0) == 0xe0) { + mask = 0x0f; + want = 3; + lbound = 0x800; + } else if ((ch & 0xf8) == 0xf0) { + mask = 0x07; + want = 4; + lbound = 0x10000; + } else { + /* + * Malformed input; input is not UTF-8. + */ + errno = EILSEQ; + return ((size_t)-1); + } + } else { + want = us->want; + lbound = us->lbound; + } + + /* + * Decode the octet sequence representing the character in chunks + * of 6 bits, most significant first. + */ + if (us->want == 0) + wch = (unsigned char)*s++ & mask; + else + wch = us->ch; + + for (i = (us->want == 0) ? 1 : 0; i < MIN(want, n); i++) { + if ((*s & 0xc0) != 0x80) { + /* + * Malformed input; bad characters in the middle + * of a character. + */ + errno = EILSEQ; + return ((size_t)-1); + } + wch <<= 6; + wch |= *s++ & 0x3f; + } + if (i < want) { + /* Incomplete multibyte sequence. */ + us->want = want - i; + us->lbound = lbound; + us->ch = wch; + return ((size_t)-2); + } + if (wch < lbound) { + /* + * Malformed input; redundant encoding. + */ + errno = EILSEQ; + return ((size_t)-1); + } + if ((wch >= 0xd800 && wch <= 0xdfff) || wch > 0x10ffff) { + /* + * Malformed input; invalid code points. + */ + errno = EILSEQ; + return ((size_t)-1); + } + if (pwc != NULL) + *pwc = wch; + us->want = 0; + return (wch == L'\0' ? 0 : want); +} + +static size_t +_UTF8_mbsnrtowcs(wchar_t * __restrict dst, const char ** __restrict src, + size_t nms, size_t len, mbstate_t * __restrict ps) +{ + _UTF8State *us; + const char *s; + size_t nchr; + wchar_t wc; + size_t nb; + + us = (_UTF8State *)ps; + + s = *src; + nchr = 0; + + if (dst == NULL) { + /* + * The fast path in the loop below is not safe if an ASCII + * character appears as anything but the first byte of a + * multibyte sequence. Check now to avoid doing it in the loop. + */ + if (nms > 0 && us->want > 0 && (signed char)*s > 0) { + errno = EILSEQ; + return ((size_t)-1); + } + for (;;) { + if (nms > 0 && (signed char)*s > 0) + /* + * Fast path for plain ASCII characters + * excluding NUL. + */ + nb = 1; + else if ((nb = _UTF8_mbrtowc(&wc, s, nms, ps)) == + (size_t)-1) + /* Invalid sequence - mbrtowc() sets errno. */ + return ((size_t)-1); + else if (nb == 0 || nb == (size_t)-2) + return (nchr); + s += nb; + nms -= nb; + nchr++; + } + /*NOTREACHED*/ + } + + /* + * The fast path in the loop below is not safe if an ASCII + * character appears as anything but the first byte of a + * multibyte sequence. Check now to avoid doing it in the loop. + */ + if (nms > 0 && len > 0 && us->want > 0 && (signed char)*s > 0) { + errno = EILSEQ; + return ((size_t)-1); + } + while (len-- > 0) { + if (nms > 0 && (signed char)*s > 0) { + /* + * Fast path for plain ASCII characters + * excluding NUL. + */ + *dst = (wchar_t)*s; + nb = 1; + } else if ((nb = _UTF8_mbrtowc(dst, s, nms, ps)) == + (size_t)-1) { + *src = s; + return ((size_t)-1); + } else if (nb == (size_t)-2) { + *src = s + nms; + return (nchr); + } else if (nb == 0) { + *src = NULL; + return (nchr); + } + s += nb; + nms -= nb; + nchr++; + dst++; + } + *src = s; + return (nchr); +} + +static size_t +_UTF8_wcrtomb(char * __restrict s, wchar_t wc, mbstate_t * __restrict ps) +{ + _UTF8State *us; + unsigned char lead; + int i, len; + + us = (_UTF8State *)ps; + + if (us->want != 0) { + errno = EINVAL; + return ((size_t)-1); + } + + if (s == NULL) + /* Reset to initial shift state (no-op) */ + return (1); + + /* + * Determine the number of octets needed to represent this character. + * We always output the shortest sequence possible. Also specify the + * first few bits of the first octet, which contains the information + * about the sequence length. + */ + if ((wc & ~0x7f) == 0) { + /* Fast path for plain ASCII characters. */ + *s = (char)wc; + return (1); + } else if ((wc & ~0x7ff) == 0) { + lead = 0xc0; + len = 2; + } else if ((wc & ~0xffff) == 0) { + if (wc >= 0xd800 && wc <= 0xdfff) { + errno = EILSEQ; + return ((size_t)-1); + } + lead = 0xe0; + len = 3; + } else if (wc >= 0 && wc <= 0x10ffff) { + lead = 0xf0; + len = 4; + } else { + errno = EILSEQ; + return ((size_t)-1); + } + + /* + * Output the octets representing the character in chunks + * of 6 bits, least significant last. The first octet is + * a special case because it contains the sequence length + * information. + */ + for (i = len - 1; i > 0; i--) { + s[i] = (wc & 0x3f) | 0x80; + wc >>= 6; + } + *s = (wc & 0xff) | lead; + + return (len); +} + +static size_t +_UTF8_wcsnrtombs(char * __restrict dst, const wchar_t ** __restrict src, + size_t nwc, size_t len, mbstate_t * __restrict ps) +{ + _UTF8State *us; + char buf[MB_LEN_MAX]; + const wchar_t *s; + size_t nbytes; + size_t nb; + + us = (_UTF8State *)ps; + + if (us->want != 0) { + errno = EINVAL; + return ((size_t)-1); + } + + s = *src; + nbytes = 0; + + if (dst == NULL) { + while (nwc-- > 0) { + if (0 <= *s && *s < 0x80) + /* Fast path for plain ASCII characters. */ + nb = 1; + else if ((nb = _UTF8_wcrtomb(buf, *s, ps)) == + (size_t)-1) + /* Invalid character - wcrtomb() sets errno. */ + return ((size_t)-1); + if (*s == L'\0') + return (nbytes + nb - 1); + s++; + nbytes += nb; + } + return (nbytes); + } + + while (len > 0 && nwc-- > 0) { + if (0 <= *s && *s < 0x80) { + /* Fast path for plain ASCII characters. */ + nb = 1; + *dst = *s; + } else if (len > (size_t)MB_CUR_MAX) { + /* Enough space to translate in-place. */ + if ((nb = _UTF8_wcrtomb(dst, *s, ps)) == (size_t)-1) { + *src = s; + return ((size_t)-1); + } + } else { + /* + * May not be enough space; use temp. buffer. + */ + if ((nb = _UTF8_wcrtomb(buf, *s, ps)) == (size_t)-1) { + *src = s; + return ((size_t)-1); + } + if (nb > (int)len) + /* MB sequence for character won't fit. */ + break; + memcpy(dst, buf, nb); + } + if (*s == L'\0') { + *src = NULL; + return (nbytes + nb - 1); + } + s++; + dst += nb; + len -= nb; + nbytes += nb; + } + *src = s; + return (nbytes); +} diff --git a/lib/libc/locale/wcrtomb.3 b/lib/libc/locale/wcrtomb.3 new file mode 100644 index 0000000000000..bc741740a249e --- /dev/null +++ b/lib/libc/locale/wcrtomb.3 @@ -0,0 +1,123 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd May 21, 2013 +.Dt WCRTOMB 3 +.Os +.Sh NAME +.Nm wcrtomb , +.Nm c16rtomb , +.Nm c32rtomb +.Nd "convert a wide-character code to a character (restartable)" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wchar.h +.Ft size_t +.Fn wcrtomb "char * restrict s" "wchar_t c" "mbstate_t * restrict ps" +.In uchar.h +.Ft size_t +.Fn c16rtomb "char * restrict s" "char16_t c" "mbstate_t * restrict ps" +.Ft size_t +.Fn c32rtomb "char * restrict s" "char32_t c" "mbstate_t * restrict ps" +.Sh DESCRIPTION +The +.Fn wcrtomb , +.Fn c16rtomb +and +.Fn c32rtomb +functions store a multibyte sequence representing the +wide character +.Fa c , +including any necessary shift sequences, to the +character array +.Fa s , +storing a maximum of +.Dv MB_CUR_MAX +bytes. +.Pp +If +.Fa s +is +.Dv NULL , +these functions behave as if +.Fa s +pointed to an internal buffer and +.Fa c +was a null wide character (L'\e0'). +.Pp +The +.Ft mbstate_t +argument, +.Fa ps , +is used to keep track of the shift state. +If it is +.Dv NULL , +these functions use an internal, static +.Vt mbstate_t +object, which is initialized to the initial conversion state +at program startup. +.Pp +As certain multibyte characters may only be represented by a series of +16-bit characters, the +.Fn c16rtomb +may need to invoked multiple times before a multibyte sequence is +returned. +.Sh RETURN VALUES +These functions return the length (in bytes) of the multibyte sequence +needed to represent +.Fa c , +or +.Po Vt size_t Pc Ns \-1 +if +.Fa c +is not a valid wide character code. +.Sh ERRORS +The +.Fn wcrtomb , +.Fn c16rtomb +and +.Fn c32rtomb +functions will fail if: +.Bl -tag -width Er +.It Bq Er EILSEQ +An invalid wide character code was specified. +.It Bq Er EINVAL +The conversion state is invalid. +.El +.Sh SEE ALSO +.Xr mbrtowc 3 , +.Xr multibyte 3 , +.Xr setlocale 3 , +.Xr wctomb 3 +.Sh STANDARDS +The +.Fn wcrtomb , +.Fn c16rtomb +and +.Fn c32rtomb +functions conform to +.St -isoC-2011 . diff --git a/lib/libc/locale/wcrtomb.c b/lib/libc/locale/wcrtomb.c new file mode 100644 index 0000000000000..1afa8f77acc98 --- /dev/null +++ b/lib/libc/locale/wcrtomb.c @@ -0,0 +1,54 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <wchar.h> +#include "mblocal.h" + +size_t +wcrtomb_l(char * __restrict s, wchar_t wc, mbstate_t * __restrict ps, + locale_t locale) +{ + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->wcrtomb; + return (XLOCALE_CTYPE(locale)->__wcrtomb(s, wc, ps)); +} + +size_t +wcrtomb(char * __restrict s, wchar_t wc, mbstate_t * __restrict ps) +{ + return wcrtomb_l(s, wc, ps, __get_locale()); +} diff --git a/lib/libc/locale/wcsftime.3 b/lib/libc/locale/wcsftime.3 new file mode 100644 index 0000000000000..92aee9313f592 --- /dev/null +++ b/lib/libc/locale/wcsftime.3 @@ -0,0 +1,66 @@ +.\" Copyright (c) 2002 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd September 8, 2002 +.Dt WCSFTIME 3 +.Os +.Sh NAME +.Nm wcsftime +.Nd "convert date and time to a wide-character string" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wchar.h +.Ft size_t +.Fo wcsftime +.Fa "wchar_t * restrict wcs" "size_t maxsize" +.Fa "const wchar_t * restrict format" "const struct tm * restrict timeptr" +.Fc +.Sh DESCRIPTION +The +.Fn wcsftime +function is equivalent to the +.Fn strftime +function except for the types of its arguments. +Refer to +.Xr strftime 3 +for a detailed description. +.Sh COMPATIBILITY +Some early implementations of +.Fn wcsftime +had a +.Fa format +argument with type +.Vt "const char *" +instead of +.Vt "const wchar_t *" . +.Sh SEE ALSO +.Xr strftime 3 +.Sh STANDARDS +The +.Fn wcsftime +function conforms to +.St -isoC-99 . diff --git a/lib/libc/locale/wcsftime.c b/lib/libc/locale/wcsftime.c new file mode 100644 index 0000000000000..772c96c2f9f4b --- /dev/null +++ b/lib/libc/locale/wcsftime.c @@ -0,0 +1,124 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002 Tim J. Robbins + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <limits.h> +#include <stdlib.h> +#include <time.h> +#include <wchar.h> +#include "xlocale_private.h" + +/* + * Convert date and time to a wide-character string. + * + * This is the wide-character counterpart of strftime(). So that we do not + * have to duplicate the code of strftime(), we convert the format string to + * multibyte, call strftime(), then convert the result back into wide + * characters. + * + * This technique loses in the presence of stateful multibyte encoding if any + * of the conversions in the format string change conversion state. When + * stateful encoding is implemented, we will need to reset the state between + * format specifications in the format string. + */ +size_t +wcsftime_l(wchar_t * __restrict wcs, size_t maxsize, + const wchar_t * __restrict format, const struct tm * __restrict timeptr, + locale_t locale) +{ + static const mbstate_t initial; + mbstate_t mbs; + char *dst, *sformat; + const char *dstp; + const wchar_t *formatp; + size_t n, sflen; + int sverrno; + FIX_LOCALE(locale); + + sformat = dst = NULL; + + /* + * Convert the supplied format string to a multibyte representation + * for strftime(), which only handles single-byte characters. + */ + mbs = initial; + formatp = format; + sflen = wcsrtombs_l(NULL, &formatp, 0, &mbs, locale); + if (sflen == (size_t)-1) + goto error; + if ((sformat = malloc(sflen + 1)) == NULL) + goto error; + mbs = initial; + wcsrtombs_l(sformat, &formatp, sflen + 1, &mbs, locale); + + /* + * Allocate memory for longest multibyte sequence that will fit + * into the caller's buffer and call strftime() to fill it. + * Then, copy and convert the result back into wide characters in + * the caller's buffer. + */ + if (SIZE_T_MAX / MB_CUR_MAX <= maxsize) { + /* maxsize is prepostorously large - avoid int. overflow. */ + errno = EINVAL; + goto error; + } + if ((dst = malloc(maxsize * MB_CUR_MAX)) == NULL) + goto error; + if (strftime_l(dst, maxsize, sformat, timeptr, locale) == 0) + goto error; + dstp = dst; + mbs = initial; + n = mbsrtowcs_l(wcs, &dstp, maxsize, &mbs, locale); + if (n == (size_t)-2 || n == (size_t)-1 || dstp != NULL) + goto error; + + free(sformat); + free(dst); + return (n); + +error: + sverrno = errno; + free(sformat); + free(dst); + errno = sverrno; + return (0); +} +size_t +wcsftime(wchar_t * __restrict wcs, size_t maxsize, + const wchar_t * __restrict format, const struct tm * __restrict timeptr) +{ + return wcsftime_l(wcs, maxsize, format, timeptr, __get_locale()); +} diff --git a/lib/libc/locale/wcsnrtombs.c b/lib/libc/locale/wcsnrtombs.c new file mode 100644 index 0000000000000..8d90445aacf16 --- /dev/null +++ b/lib/libc/locale/wcsnrtombs.c @@ -0,0 +1,127 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright 2013 Garrett D'Amore <garrett@damore.org> + * Copyright 2010 Nexenta Systems, Inc. All rights reserved. + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <limits.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include "mblocal.h" + +size_t +wcsnrtombs_l(char * __restrict dst, const wchar_t ** __restrict src, size_t nwc, + size_t len, mbstate_t * __restrict ps, locale_t locale) +{ + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->wcsnrtombs; + return (XLOCALE_CTYPE(locale)->__wcsnrtombs(dst, src, nwc, len, ps)); +} +size_t +wcsnrtombs(char * __restrict dst, const wchar_t ** __restrict src, size_t nwc, + size_t len, mbstate_t * __restrict ps) +{ + return wcsnrtombs_l(dst, src, nwc, len, ps, __get_locale()); +} + + +size_t +__wcsnrtombs_std(char * __restrict dst, const wchar_t ** __restrict src, + size_t nwc, size_t len, mbstate_t * __restrict ps, + wcrtomb_pfn_t pwcrtomb) +{ + mbstate_t mbsbak; + char buf[MB_LEN_MAX]; + const wchar_t *s; + size_t nbytes; + size_t nb; + + s = *src; + nbytes = 0; + + if (dst == NULL) { + while (nwc-- > 0) { + if ((nb = pwcrtomb(buf, *s, ps)) == (size_t)-1) + /* Invalid character - wcrtomb() sets errno. */ + return ((size_t)-1); + else if (*s == L'\0') + return (nbytes + nb - 1); + s++; + nbytes += nb; + } + return (nbytes); + } + + while (len > 0 && nwc-- > 0) { + if (len > (size_t)MB_CUR_MAX) { + /* Enough space to translate in-place. */ + if ((nb = pwcrtomb(dst, *s, ps)) == (size_t)-1) { + *src = s; + return ((size_t)-1); + } + } else { + /* + * May not be enough space; use temp. buffer. + * + * We need to save a copy of the conversion state + * here so we can restore it if the multibyte + * character is too long for the buffer. + */ + mbsbak = *ps; + if ((nb = pwcrtomb(buf, *s, ps)) == (size_t)-1) { + *src = s; + return ((size_t)-1); + } + if (nb > (int)len) { + /* MB sequence for character won't fit. */ + *ps = mbsbak; + break; + } + memcpy(dst, buf, nb); + } + if (*s == L'\0') { + *src = NULL; + return (nbytes + nb - 1); + } + s++; + dst += nb; + len -= nb; + nbytes += nb; + } + *src = s; + return (nbytes); +} diff --git a/lib/libc/locale/wcsrtombs.3 b/lib/libc/locale/wcsrtombs.3 new file mode 100644 index 0000000000000..ff607c2dad11e --- /dev/null +++ b/lib/libc/locale/wcsrtombs.3 @@ -0,0 +1,134 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd July 21, 2004 +.Dt WCSRTOMBS 3 +.Os +.Sh NAME +.Nm wcsrtombs , +.Nm wcsnrtombs +.Nd "convert a wide-character string to a character string (restartable)" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wchar.h +.Ft size_t +.Fo wcsrtombs +.Fa "char * restrict dst" "const wchar_t ** restrict src" +.Fa "size_t len" "mbstate_t * restrict ps" +.Fc +.Ft size_t +.Fo wcsnrtombs +.Fa "char * restrict dst" "const wchar_t ** restrict src" "size_t nwc" +.Fa "size_t len" "mbstate_t * restrict ps" +.Fc +.Sh DESCRIPTION +The +.Fn wcsrtombs +function converts a string of wide characters indirectly pointed to by +.Fa src +to a corresponding multibyte character string stored in the array +pointed to by +.Fa dst . +No more than +.Fa len +bytes are written to +.Fa dst . +.Pp +If +.Fa dst +is +.Dv NULL , +no characters are stored. +.Pp +If +.Fa dst +is not +.Dv NULL , +the pointer pointed to by +.Fa src +is updated to point to the character after the one that conversion stopped at. +If conversion stops because a null character is encountered, +.Fa *src +is set to +.Dv NULL . +.Pp +The +.Vt mbstate_t +argument, +.Fa ps , +is used to keep track of the shift state. +If it is +.Dv NULL , +.Fn wcsrtombs +uses an internal, static +.Vt mbstate_t +object, which is initialized to the initial conversion state +at program startup. +.Pp +The +.Fn wcsnrtombs +function behaves identically to +.Fn wcsrtombs , +except that conversion stops after reading at most +.Fa nwc +characters from the buffer pointed to by +.Fa src . +.Sh RETURN VALUES +The +.Fn wcsrtombs +and +.Fn wcsnrtombs +functions return the number of bytes stored in +the array pointed to by +.Fa dst +(not including any terminating null), if successful, otherwise it returns +.Po Vt size_t Pc Ns \-1 . +.Sh ERRORS +The +.Fn wcsrtombs +and +.Fn wcsnrtombs +functions will fail if: +.Bl -tag -width Er +.It Bq Er EILSEQ +An invalid wide character was encountered. +.It Bq Er EINVAL +The conversion state is invalid. +.El +.Sh SEE ALSO +.Xr mbsrtowcs 3 , +.Xr wcrtomb 3 , +.Xr wcstombs 3 +.Sh STANDARDS +The +.Fn wcsrtombs +function conforms to +.St -isoC-99 . +.Pp +The +.Fn wcsnrtombs +function is an extension to the standard. diff --git a/lib/libc/locale/wcsrtombs.c b/lib/libc/locale/wcsrtombs.c new file mode 100644 index 0000000000000..ca9875799a3de --- /dev/null +++ b/lib/libc/locale/wcsrtombs.c @@ -0,0 +1,58 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <limits.h> +#include <stdlib.h> +#include <string.h> +#include <wchar.h> +#include "mblocal.h" + +size_t +wcsrtombs_l(char * __restrict dst, const wchar_t ** __restrict src, size_t len, + mbstate_t * __restrict ps, locale_t locale) +{ + FIX_LOCALE(locale); + if (ps == NULL) + ps = &locale->wcsrtombs; + return (XLOCALE_CTYPE(locale)->__wcsnrtombs(dst, src, SIZE_T_MAX, len, ps)); +} + +size_t +wcsrtombs(char * __restrict dst, const wchar_t ** __restrict src, size_t len, + mbstate_t * __restrict ps) +{ + return wcsrtombs_l(dst, src, len, ps, __get_locale()); +} diff --git a/lib/libc/locale/wcstod.3 b/lib/libc/locale/wcstod.3 new file mode 100644 index 0000000000000..f8c5135ee86b5 --- /dev/null +++ b/lib/libc/locale/wcstod.3 @@ -0,0 +1,73 @@ +.\" Copyright (c) 2002, 2003 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd February 22, 2003 +.Dt WCSTOD 3 +.Os +.Sh NAME +.Nm wcstof , +.Nm wcstod , +.Nm wcstold +.Nd convert string to +.Vt float , double +or +.Vt "long double" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wchar.h +.Ft float +.Fn wcstof "const wchar_t * restrict nptr" "wchar_t ** restrict endptr" +.Ft "long double" +.Fn wcstold "const wchar_t * restrict nptr" "wchar_t ** restrict endptr" +.Ft double +.Fn wcstod "const wchar_t * restrict nptr" "wchar_t ** restrict endptr" +.Sh DESCRIPTION +The +.Fn wcstof , +.Fn wcstod +and +.Fn wcstold +functions are the wide-character versions of the +.Fn strtof , +.Fn strtod +and +.Fn strtold +functions. +Refer to +.Xr strtod 3 +for details. +.Sh SEE ALSO +.Xr strtod 3 , +.Xr wcstol 3 +.Sh STANDARDS +The +.Fn wcstof , +.Fn wcstod +and +.Fn wcstold +functions conform to +.St -isoC-99 . diff --git a/lib/libc/locale/wcstod.c b/lib/libc/locale/wcstod.c new file mode 100644 index 0000000000000..42d8acdafcdb4 --- /dev/null +++ b/lib/libc/locale/wcstod.c @@ -0,0 +1,118 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002 Tim J. Robbins + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <stdlib.h> +#include <wchar.h> +#include <wctype.h> +#include "xlocale_private.h" + +/* + * Convert a string to a double-precision number. + * + * This is the wide-character counterpart of strtod(). So that we do not + * have to duplicate the code of strtod() here, we convert the supplied + * wide character string to multibyte and call strtod() on the result. + * This assumes that the multibyte encoding is compatible with ASCII + * for at least the digits, radix character and letters. + */ +double +wcstod_l(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, + locale_t locale) +{ + static const mbstate_t initial; + mbstate_t mbs; + double val; + char *buf, *end; + const wchar_t *wcp; + size_t len; + size_t spaces; + FIX_LOCALE(locale); + + wcp = nptr; + spaces = 0; + while (iswspace_l(*wcp, locale)) { + wcp++; + spaces++; + } + + /* + * Convert the supplied numeric wide char. string to multibyte. + * + * We could attempt to find the end of the numeric portion of the + * wide char. string to avoid converting unneeded characters but + * choose not to bother; optimising the uncommon case where + * the input string contains a lot of text after the number + * duplicates a lot of strtod()'s functionality and slows down the + * most common cases. + */ + mbs = initial; + if ((len = wcsrtombs_l(NULL, &wcp, 0, &mbs, locale)) == (size_t)-1) { + if (endptr != NULL) + *endptr = (wchar_t *)nptr; + return (0.0); + } + if ((buf = malloc(len + 1)) == NULL) { + if (endptr != NULL) + *endptr = (wchar_t *)nptr; + return (0.0); + } + mbs = initial; + wcsrtombs_l(buf, &wcp, len + 1, &mbs, locale); + + /* Let strtod() do most of the work for us. */ + val = strtod_l(buf, &end, locale); + + /* + * We only know where the number ended in the _multibyte_ + * representation of the string. If the caller wants to know + * where it ended, count multibyte characters to find the + * corresponding position in the wide char string. + */ + if (endptr != NULL) { + *endptr = (wchar_t *)nptr + (end - buf); + if (buf != end) + *endptr += spaces; + } + + free(buf); + + return (val); +} +double +wcstod(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr) +{ + return wcstod_l(nptr, endptr, __get_locale()); +} diff --git a/lib/libc/locale/wcstof.c b/lib/libc/locale/wcstof.c new file mode 100644 index 0000000000000..34117557ee0fd --- /dev/null +++ b/lib/libc/locale/wcstof.c @@ -0,0 +1,95 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002, 2003 Tim J. Robbins + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <stdlib.h> +#include <wchar.h> +#include <wctype.h> +#include "xlocale_private.h" + +/* + * See wcstod() for comments as to the logic used. + */ +float +wcstof_l(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, + locale_t locale) +{ + static const mbstate_t initial; + mbstate_t mbs; + float val; + char *buf, *end; + const wchar_t *wcp; + size_t len; + size_t spaces; + FIX_LOCALE(locale); + + wcp = nptr; + spaces = 0; + while (iswspace_l(*wcp, locale)) { + wcp++; + spaces++; + } + + mbs = initial; + if ((len = wcsrtombs_l(NULL, &wcp, 0, &mbs, locale)) == (size_t)-1) { + if (endptr != NULL) + *endptr = (wchar_t *)nptr; + return (0.0); + } + if ((buf = malloc(len + 1)) == NULL) { + if (endptr != NULL) + *endptr = (wchar_t *)nptr; + return (0.0); + } + mbs = initial; + wcsrtombs_l(buf, &wcp, len + 1, &mbs, locale); + + val = strtof_l(buf, &end, locale); + + if (endptr != NULL) { + *endptr = (wchar_t *)nptr + (end - buf); + if (buf != end) + *endptr += spaces; + } + + free(buf); + + return (val); +} +float +wcstof(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr) +{ + return wcstof_l(nptr, endptr, __get_locale()); +} diff --git a/lib/libc/locale/wcstoimax.c b/lib/libc/locale/wcstoimax.c new file mode 100644 index 0000000000000..5e4d9af26a805 --- /dev/null +++ b/lib/libc/locale/wcstoimax.c @@ -0,0 +1,139 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1992, 1993 + * The Regents of the University of California. All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +#if 0 +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "from @(#)strtol.c 8.1 (Berkeley) 6/4/93"; +#endif /* LIBC_SCCS and not lint */ +__FBSDID("FreeBSD: src/lib/libc/stdlib/strtoimax.c,v 1.8 2002/09/06 11:23:59 tjr Exp "); +#endif +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <inttypes.h> +#include <stdlib.h> +#include <wchar.h> +#include <wctype.h> +#include "xlocale_private.h" + +/* + * Convert a wide character string to an intmax_t integer. + */ +intmax_t +wcstoimax_l(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, + int base, locale_t locale) +{ + const wchar_t *s; + uintmax_t acc; + wchar_t c; + uintmax_t cutoff; + int neg, any, cutlim; + FIX_LOCALE(locale); + + /* + * See strtoimax for comments as to the logic used. + */ + s = nptr; + do { + c = *s++; + } while (iswspace_l(c, locale)); + if (c == L'-') { + neg = 1; + c = *s++; + } else { + neg = 0; + if (c == L'+') + c = *s++; + } + if ((base == 0 || base == 16) && + c == L'0' && (*s == L'x' || *s == L'X')) { + c = s[1]; + s += 2; + base = 16; + } + if (base == 0) + base = c == L'0' ? 8 : 10; + acc = any = 0; + if (base < 2 || base > 36) + goto noconv; + + cutoff = neg ? (uintmax_t)-(INTMAX_MIN + INTMAX_MAX) + INTMAX_MAX + : INTMAX_MAX; + cutlim = cutoff % base; + cutoff /= base; + for ( ; ; c = *s++) { +#ifdef notyet + if (iswdigit_l(c, locale)) + c = digittoint_l(c, locale); + else +#endif + if (c >= L'0' && c <= L'9') + c -= L'0'; + else if (c >= L'A' && c <= L'Z') + c -= L'A' - 10; + else if (c >= 'a' && c <= 'z') + c -= L'a' - 10; + else + break; + if (c >= base) + break; + if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) + any = -1; + else { + any = 1; + acc *= base; + acc += c; + } + } + if (any < 0) { + acc = neg ? INTMAX_MIN : INTMAX_MAX; + errno = ERANGE; + } else if (!any) { +noconv: + errno = EINVAL; + } else if (neg) + acc = -acc; + if (endptr != NULL) + *endptr = (wchar_t *)(any ? s - 1 : nptr); + return (acc); +} +intmax_t +wcstoimax(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, + int base) +{ + return wcstoimax_l(nptr, endptr, base, __get_locale()); +} diff --git a/lib/libc/locale/wcstol.3 b/lib/libc/locale/wcstol.3 new file mode 100644 index 0000000000000..e69a72672d545 --- /dev/null +++ b/lib/libc/locale/wcstol.3 @@ -0,0 +1,94 @@ +.\" Copyright (c) 2002 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd September 7, 2002 +.Dt WCSTOL 3 +.Os +.Sh NAME +.Nm wcstol , wcstoul , +.Nm wcstoll , wcstoull , +.Nm wcstoimax , wcstoumax +.Nd "convert a wide character string value to a" +.Vt long , +.Vt "unsigned long" , +.Vt "long long" , +.Vt "unsigned long long" , +.Vt intmax_t +or +.Vt uintmax_t +integer +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wchar.h +.Ft long +.Fn wcstol "const wchar_t * restrict nptr" "wchar_t ** restrict endptr" "int base" +.Ft "unsigned long" +.Fn wcstoul "const wchar_t * restrict nptr" "wchar_t ** restrict endptr" "int base" +.Ft "long long" +.Fn wcstoll "const wchar_t * restrict nptr" "wchar_t ** restrict endptr" "int base" +.Ft "unsigned long long" +.Fn wcstoull "const wchar_t * restrict nptr" "wchar_t ** restrict endptr" "int base" +.In inttypes.h +.Ft intmax_t +.Fn wcstoimax "const wchar_t * restrict nptr" "wchar_t ** restrict endptr" "int base" +.Ft uintmax_t +.Fn wcstoumax "const wchar_t * restrict nptr" "wchar_t ** restrict endptr" "int base" +.Sh DESCRIPTION +The +.Fn wcstol , +.Fn wcstoul , +.Fn wcstoll , +.Fn wcstoull , +.Fn wcstoimax +and +.Fn wcstoumax +functions are wide-character versions of the +.Fn strtol , +.Fn strtoul , +.Fn strtoll , +.Fn strtoull , +.Fn strtoimax +and +.Fn strtoumax +functions, respectively. +Refer to their manual pages (for example +.Xr strtol 3 ) +for details. +.Sh SEE ALSO +.Xr strtol 3 , +.Xr strtoul 3 +.Sh STANDARDS +The +.Fn wcstol , +.Fn wcstoul , +.Fn wcstoll , +.Fn wcstoull , +.Fn wcstoimax +and +.Fn wcstoumax +functions conform to +.St -isoC-99 . diff --git a/lib/libc/locale/wcstol.c b/lib/libc/locale/wcstol.c new file mode 100644 index 0000000000000..98bd5f85a4d12 --- /dev/null +++ b/lib/libc/locale/wcstol.c @@ -0,0 +1,132 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1990, 1993 + * The Regents of the University of California. All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <ctype.h> +#include <errno.h> +#include <limits.h> +#include <wchar.h> +#include <wctype.h> +#include "xlocale_private.h" + +/* + * Convert a string to a long integer. + */ +long +wcstol_l(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, int + base, locale_t locale) +{ + const wchar_t *s; + unsigned long acc; + wchar_t c; + unsigned long cutoff; + int neg, any, cutlim; + FIX_LOCALE(locale); + + /* + * See strtol for comments as to the logic used. + */ + s = nptr; + do { + c = *s++; + } while (iswspace_l(c, locale)); + if (c == '-') { + neg = 1; + c = *s++; + } else { + neg = 0; + if (c == L'+') + c = *s++; + } + if ((base == 0 || base == 16) && + c == L'0' && (*s == L'x' || *s == L'X')) { + c = s[1]; + s += 2; + base = 16; + } + if (base == 0) + base = c == L'0' ? 8 : 10; + acc = any = 0; + if (base < 2 || base > 36) + goto noconv; + + cutoff = neg ? (unsigned long)-(LONG_MIN + LONG_MAX) + LONG_MAX + : LONG_MAX; + cutlim = cutoff % base; + cutoff /= base; + for ( ; ; c = *s++) { +#ifdef notyet + if (iswdigit_l(c, locale)) + c = digittoint_l(c, locale); + else +#endif + if (c >= L'0' && c <= L'9') + c -= L'0'; + else if (c >= L'A' && c <= L'Z') + c -= L'A' - 10; + else if (c >= L'a' && c <= L'z') + c -= L'a' - 10; + else + break; + if (c >= base) + break; + if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) + any = -1; + else { + any = 1; + acc *= base; + acc += c; + } + } + if (any < 0) { + acc = neg ? LONG_MIN : LONG_MAX; + errno = ERANGE; + } else if (!any) { +noconv: + errno = EINVAL; + } else if (neg) + acc = -acc; + if (endptr != NULL) + *endptr = (wchar_t *)(any ? s - 1 : nptr); + return (acc); +} +long +wcstol(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, int base) +{ + return wcstol_l(nptr, endptr, base, __get_locale()); +} diff --git a/lib/libc/locale/wcstold.c b/lib/libc/locale/wcstold.c new file mode 100644 index 0000000000000..c31f84118a2f1 --- /dev/null +++ b/lib/libc/locale/wcstold.c @@ -0,0 +1,95 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002, 2003 Tim J. Robbins + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <stdlib.h> +#include <wchar.h> +#include <wctype.h> +#include "xlocale_private.h" + +/* + * See wcstod() for comments as to the logic used. + */ +long double +wcstold_l(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, + locale_t locale) +{ + static const mbstate_t initial; + mbstate_t mbs; + long double val; + char *buf, *end; + const wchar_t *wcp; + size_t len; + size_t spaces; + FIX_LOCALE(locale); + + wcp = nptr; + spaces = 0; + while (iswspace_l(*wcp, locale)) { + wcp++; + spaces++; + } + + mbs = initial; + if ((len = wcsrtombs_l(NULL, &wcp, 0, &mbs, locale)) == (size_t)-1) { + if (endptr != NULL) + *endptr = (wchar_t *)nptr; + return (0.0); + } + if ((buf = malloc(len + 1)) == NULL) { + if (endptr != NULL) + *endptr = (wchar_t *)nptr; + return (0.0); + } + mbs = initial; + wcsrtombs_l(buf, &wcp, len + 1, &mbs, locale); + + val = strtold_l(buf, &end, locale); + + if (endptr != NULL) { + *endptr = (wchar_t *)nptr + (end - buf); + if (buf != end) + *endptr += spaces; + } + + free(buf); + + return (val); +} +long double +wcstold(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr) +{ + return wcstold_l(nptr, endptr, __get_locale()); +} diff --git a/lib/libc/locale/wcstoll.c b/lib/libc/locale/wcstoll.c new file mode 100644 index 0000000000000..6c3184a0de34f --- /dev/null +++ b/lib/libc/locale/wcstoll.c @@ -0,0 +1,138 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1992, 1993 + * The Regents of the University of California. All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +#if 0 +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "@(#)strtoq.c 8.1 (Berkeley) 6/4/93"; +#endif /* LIBC_SCCS and not lint */ +__FBSDID("FreeBSD: src/lib/libc/stdlib/strtoll.c,v 1.19 2002/09/06 11:23:59 tjr Exp "); +#endif +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <limits.h> +#include <stdlib.h> +#include <wchar.h> +#include <wctype.h> +#include "xlocale_private.h" + +/* + * Convert a wide character string to a long long integer. + */ +long long +wcstoll_l(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, + int base, locale_t locale) +{ + const wchar_t *s; + unsigned long long acc; + wchar_t c; + unsigned long long cutoff; + int neg, any, cutlim; + FIX_LOCALE(locale); + + /* + * See strtoll for comments as to the logic used. + */ + s = nptr; + do { + c = *s++; + } while (iswspace_l(c, locale)); + if (c == L'-') { + neg = 1; + c = *s++; + } else { + neg = 0; + if (c == L'+') + c = *s++; + } + if ((base == 0 || base == 16) && + c == L'0' && (*s == L'x' || *s == L'X')) { + c = s[1]; + s += 2; + base = 16; + } + if (base == 0) + base = c == L'0' ? 8 : 10; + acc = any = 0; + if (base < 2 || base > 36) + goto noconv; + + cutoff = neg ? (unsigned long long)-(LLONG_MIN + LLONG_MAX) + LLONG_MAX + : LLONG_MAX; + cutlim = cutoff % base; + cutoff /= base; + for ( ; ; c = *s++) { +#ifdef notyet + if (iswdigit_l(c, locale)) + c = digittoint_l(c, locale); + else +#endif + if (c >= L'0' && c <= L'9') + c -= L'0'; + else if (c >= L'A' && c <= L'Z') + c -= L'A' - 10; + else if (c >= L'a' && c <= L'z') + c -= L'a' - 10; + else + break; + if (c >= base) + break; + if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) + any = -1; + else { + any = 1; + acc *= base; + acc += c; + } + } + if (any < 0) { + acc = neg ? LLONG_MIN : LLONG_MAX; + errno = ERANGE; + } else if (!any) { +noconv: + errno = EINVAL; + } else if (neg) + acc = -acc; + if (endptr != NULL) + *endptr = (wchar_t *)(any ? s - 1 : nptr); + return (acc); +} +long long +wcstoll(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, int base) +{ + return wcstoll_l(nptr, endptr, base, __get_locale()); +} diff --git a/lib/libc/locale/wcstombs.3 b/lib/libc/locale/wcstombs.3 new file mode 100644 index 0000000000000..0f216f62d11a7 --- /dev/null +++ b/lib/libc/locale/wcstombs.3 @@ -0,0 +1,90 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" Donn Seeley of BSDI. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" From @(#)multibyte.3 8.1 (Berkeley) 6/4/93 +.\" From FreeBSD: src/lib/libc/locale/multibyte.3,v 1.22 2003/11/08 03:23:11 tjr Exp +.\" $FreeBSD$ +.\" +.Dd April 8, 2004 +.Dt WCSTOMBS 3 +.Os +.Sh NAME +.Nm wcstombs +.Nd convert a wide-character string to a character string +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In stdlib.h +.Ft size_t +.Fo wcstombs +.Fa "char * restrict mbstring" "const wchar_t * restrict wcstring" +.Fa "size_t nbytes" +.Fc +.Sh DESCRIPTION +The +.Fn wcstombs +function converts a wide character string +.Fa wcstring +into a multibyte character string, +.Fa mbstring , +beginning in the initial conversion state. +Up to +.Fa nbytes +bytes are stored in +.Fa mbstring . +Partial multibyte characters at the end of the string are not stored. +The multibyte character string is null terminated if there is room. +.Sh RETURN VALUES +The +.Fn wcstombs +function returns the number of bytes converted +(not including any terminating null), if successful, otherwise it returns +.Po Vt size_t Pc Ns \-1 . +.Sh ERRORS +The +.Fn wcstombs +function will fail if: +.Bl -tag -width Er +.It Bq Er EILSEQ +An invalid wide character was encountered. +.It Bq Er EINVAL +The conversion state is invalid. +.El +.Sh SEE ALSO +.Xr mbstowcs 3 , +.Xr multibyte 3 , +.Xr wcsrtombs 3 , +.Xr wctomb 3 +.Sh STANDARDS +The +.Fn wcstombs +function conforms to +.St -isoC-99 . diff --git a/lib/libc/locale/wcstombs.c b/lib/libc/locale/wcstombs.c new file mode 100644 index 0000000000000..a34b58ecdb05a --- /dev/null +++ b/lib/libc/locale/wcstombs.c @@ -0,0 +1,60 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <limits.h> +#include <stdlib.h> +#include <wchar.h> +#include "mblocal.h" + +size_t +wcstombs_l(char * __restrict s, const wchar_t * __restrict pwcs, size_t n, + locale_t locale) +{ + static const mbstate_t initial; + mbstate_t mbs; + const wchar_t *pwcsp; + FIX_LOCALE(locale); + + mbs = initial; + pwcsp = pwcs; + return (XLOCALE_CTYPE(locale)->__wcsnrtombs(s, &pwcsp, SIZE_T_MAX, n, &mbs)); +} +size_t +wcstombs(char * __restrict s, const wchar_t * __restrict pwcs, size_t n) +{ + return wcstombs_l(s, pwcs, n, __get_locale()); +} + diff --git a/lib/libc/locale/wcstoul.c b/lib/libc/locale/wcstoul.c new file mode 100644 index 0000000000000..b550e869f7c54 --- /dev/null +++ b/lib/libc/locale/wcstoul.c @@ -0,0 +1,130 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1990, 1993 + * The Regents of the University of California. All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <ctype.h> +#include <errno.h> +#include <limits.h> +#include <wchar.h> +#include <wctype.h> +#include "xlocale_private.h" + +/* + * Convert a wide character string to an unsigned long integer. + */ +unsigned long +wcstoul_l(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, + int base, locale_t locale) +{ + const wchar_t *s; + unsigned long acc; + wchar_t c; + unsigned long cutoff; + int neg, any, cutlim; + FIX_LOCALE(locale); + + /* + * See strtol for comments as to the logic used. + */ + s = nptr; + do { + c = *s++; + } while (iswspace_l(c, locale)); + if (c == L'-') { + neg = 1; + c = *s++; + } else { + neg = 0; + if (c == L'+') + c = *s++; + } + if ((base == 0 || base == 16) && + c == L'0' && (*s == L'x' || *s == L'X')) { + c = s[1]; + s += 2; + base = 16; + } + if (base == 0) + base = c == L'0' ? 8 : 10; + acc = any = 0; + if (base < 2 || base > 36) + goto noconv; + + cutoff = ULONG_MAX / base; + cutlim = ULONG_MAX % base; + for ( ; ; c = *s++) { +#ifdef notyet + if (iswdigit_l(c, locale)) + c = digittoint_l(c, locale); + else +#endif + if (c >= L'0' && c <= L'9') + c -= L'0'; + else if (c >= L'A' && c <= L'Z') + c -= L'A' - 10; + else if (c >= L'a' && c <= L'z') + c -= L'a' - 10; + else + break; + if (c >= base) + break; + if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) + any = -1; + else { + any = 1; + acc *= base; + acc += c; + } + } + if (any < 0) { + acc = ULONG_MAX; + errno = ERANGE; + } else if (!any) { +noconv: + errno = EINVAL; + } else if (neg) + acc = -acc; + if (endptr != NULL) + *endptr = (wchar_t *)(any ? s - 1 : nptr); + return (acc); +} +unsigned long +wcstoul(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, int base) +{ + return wcstoul_l(nptr, endptr, base, __get_locale()); +} diff --git a/lib/libc/locale/wcstoull.c b/lib/libc/locale/wcstoull.c new file mode 100644 index 0000000000000..6a04d213ff9b5 --- /dev/null +++ b/lib/libc/locale/wcstoull.c @@ -0,0 +1,137 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1992, 1993 + * The Regents of the University of California. All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +#if 0 +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "@(#)strtouq.c 8.1 (Berkeley) 6/4/93"; +#endif /* LIBC_SCCS and not lint */ +__FBSDID("FreeBSD: src/lib/libc/stdlib/strtoull.c,v 1.18 2002/09/06 11:23:59 tjr Exp "); +#endif +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <limits.h> +#include <stdlib.h> +#include <wchar.h> +#include <wctype.h> +#include "xlocale_private.h" + +/* + * Convert a wide character string to an unsigned long long integer. + */ +unsigned long long +wcstoull_l(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, + int base, locale_t locale) +{ + const wchar_t *s; + unsigned long long acc; + wchar_t c; + unsigned long long cutoff; + int neg, any, cutlim; + FIX_LOCALE(locale); + + /* + * See strtoull for comments as to the logic used. + */ + s = nptr; + do { + c = *s++; + } while (iswspace_l(c, locale)); + if (c == L'-') { + neg = 1; + c = *s++; + } else { + neg = 0; + if (c == L'+') + c = *s++; + } + if ((base == 0 || base == 16) && + c == L'0' && (*s == L'x' || *s == L'X')) { + c = s[1]; + s += 2; + base = 16; + } + if (base == 0) + base = c == L'0' ? 8 : 10; + acc = any = 0; + if (base < 2 || base > 36) + goto noconv; + + cutoff = ULLONG_MAX / base; + cutlim = ULLONG_MAX % base; + for ( ; ; c = *s++) { +#ifdef notyet + if (iswdigit_l(c, locale)) + c = digittoint_l(c, locale); + else +#endif + if (c >= L'0' && c <= L'9') + c -= L'0'; + else if (c >= L'A' && c <= L'Z') + c -= L'A' - 10; + else if (c >= L'a' && c <= L'z') + c -= L'a' - 10; + else + break; + if (c >= base) + break; + if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) + any = -1; + else { + any = 1; + acc *= base; + acc += c; + } + } + if (any < 0) { + acc = ULLONG_MAX; + errno = ERANGE; + } else if (!any) { +noconv: + errno = EINVAL; + } else if (neg) + acc = -acc; + if (endptr != NULL) + *endptr = (wchar_t *)(any ? s - 1 : nptr); + return (acc); +} +unsigned long long +wcstoull(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, + int base) +{ + return wcstoull_l(nptr, endptr, base, __get_locale()); +} diff --git a/lib/libc/locale/wcstoumax.c b/lib/libc/locale/wcstoumax.c new file mode 100644 index 0000000000000..0c1cd0b4b8dc4 --- /dev/null +++ b/lib/libc/locale/wcstoumax.c @@ -0,0 +1,137 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1992, 1993 + * The Regents of the University of California. All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +#if 0 +#if defined(LIBC_SCCS) && !defined(lint) +static char sccsid[] = "from @(#)strtoul.c 8.1 (Berkeley) 6/4/93"; +#endif /* LIBC_SCCS and not lint */ +__FBSDID("FreeBSD: src/lib/libc/stdlib/strtoumax.c,v 1.8 2002/09/06 11:23:59 tjr Exp "); +#endif +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <inttypes.h> +#include <stdlib.h> +#include <wchar.h> +#include <wctype.h> +#include "xlocale_private.h" + +/* + * Convert a wide character string to a uintmax_t integer. + */ +uintmax_t +wcstoumax_l(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, + int base, locale_t locale) +{ + const wchar_t *s; + uintmax_t acc; + wchar_t c; + uintmax_t cutoff; + int neg, any, cutlim; + FIX_LOCALE(locale); + + /* + * See strtoimax for comments as to the logic used. + */ + s = nptr; + do { + c = *s++; + } while (iswspace_l(c, locale)); + if (c == L'-') { + neg = 1; + c = *s++; + } else { + neg = 0; + if (c == L'+') + c = *s++; + } + if ((base == 0 || base == 16) && + c == L'0' && (*s == L'x' || *s == L'X')) { + c = s[1]; + s += 2; + base = 16; + } + if (base == 0) + base = c == L'0' ? 8 : 10; + acc = any = 0; + if (base < 2 || base > 36) + goto noconv; + + cutoff = UINTMAX_MAX / base; + cutlim = UINTMAX_MAX % base; + for ( ; ; c = *s++) { +#ifdef notyet + if (iswdigit_l(c, locale)) + c = digittoint_l(c, locale); + else +#endif + if (c >= L'0' && c <= L'9') + c -= L'0'; + else if (c >= L'A' && c <= L'Z') + c -= L'A' - 10; + else if (c >= L'a' && c <= L'z') + c -= L'a' - 10; + else + break; + if (c >= base) + break; + if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) + any = -1; + else { + any = 1; + acc *= base; + acc += c; + } + } + if (any < 0) { + acc = UINTMAX_MAX; + errno = ERANGE; + } else if (!any) { +noconv: + errno = EINVAL; + } else if (neg) + acc = -acc; + if (endptr != NULL) + *endptr = (wchar_t *)(any ? s - 1 : nptr); + return (acc); +} +uintmax_t +wcstoumax(const wchar_t * __restrict nptr, wchar_t ** __restrict endptr, + int base) +{ + return wcstoumax_l(nptr, endptr, base, __get_locale()); +} diff --git a/lib/libc/locale/wctob.c b/lib/libc/locale/wctob.c new file mode 100644 index 0000000000000..b43761c48d458 --- /dev/null +++ b/lib/libc/locale/wctob.c @@ -0,0 +1,58 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <limits.h> +#include <stdio.h> +#include <wchar.h> +#include "mblocal.h" + +int +wctob_l(wint_t c, locale_t locale) +{ + static const mbstate_t initial; + mbstate_t mbs = initial; + char buf[MB_LEN_MAX]; + FIX_LOCALE(locale); + + if (c == WEOF || XLOCALE_CTYPE(locale)->__wcrtomb(buf, c, &mbs) != 1) + return (EOF); + return ((unsigned char)*buf); +} +int +wctob(wint_t c) +{ + return wctob_l(c, __get_locale()); +} diff --git a/lib/libc/locale/wctomb.3 b/lib/libc/locale/wctomb.3 new file mode 100644 index 0000000000000..e70a2bde71d03 --- /dev/null +++ b/lib/libc/locale/wctomb.3 @@ -0,0 +1,108 @@ +.\" Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This code is derived from software contributed to Berkeley by +.\" Donn Seeley of BSDI. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" From @(#)multibyte.3 8.1 (Berkeley) 6/4/93 +.\" From FreeBSD: src/lib/libc/locale/multibyte.3,v 1.22 2003/11/08 03:23:11 tjr Exp +.\" $FreeBSD$ +.\" +.Dd April 8, 2004 +.Dt WCTOMB 3 +.Os +.Sh NAME +.Nm wctomb +.Nd convert a wide-character code to a character +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In stdlib.h +.Ft int +.Fn wctomb "char *mbchar" "wchar_t wchar" +.Sh DESCRIPTION +The +.Fn wctomb +function converts a wide character +.Fa wchar +into a multibyte character and stores +the result in +.Fa mbchar . +The object pointed to by +.Fa mbchar +must be large enough to accommodate the multibyte character, which +may be up to +.Dv MB_LEN_MAX +bytes. +.Pp +A call with a null +.Fa mbchar +pointer returns nonzero if the current locale requires shift states, +zero otherwise; +if shift states are required, the shift state is reset to the initial state. +.Sh RETURN VALUES +If +.Fa mbchar +is +.Dv NULL , +the +.Fn wctomb +function returns nonzero if shift states are supported, +zero otherwise. +If +.Fa mbchar +is valid, +.Fn wctomb +returns +the number of bytes processed in +.Fa mbchar , +or \-1 if no multibyte character +could be recognized or converted. +In this case, +.Fn wctomb Ns 's +internal conversion state is undefined. +.Sh ERRORS +The +.Fn wctomb +function will fail if: +.Bl -tag -width Er +.It Bq Er EILSEQ +An invalid multibyte sequence was detected. +.It Bq Er EINVAL +The internal conversion state is invalid. +.El +.Sh SEE ALSO +.Xr mbtowc 3 , +.Xr wcrtomb 3 , +.Xr wcstombs 3 , +.Xr wctob 3 +.Sh STANDARDS +The +.Fn wctomb +function conforms to +.St -isoC-99 . diff --git a/lib/libc/locale/wctomb.c b/lib/libc/locale/wctomb.c new file mode 100644 index 0000000000000..151d67997548d --- /dev/null +++ b/lib/libc/locale/wctomb.c @@ -0,0 +1,61 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002-2004 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <stdlib.h> +#include <wchar.h> +#include "mblocal.h" + +int +wctomb_l(char *s, wchar_t wchar, locale_t locale) +{ + static const mbstate_t initial; + size_t rval; + FIX_LOCALE(locale); + + if (s == NULL) { + /* No support for state dependent encodings. */ + locale->wctomb = initial; + return (0); + } + if ((rval = XLOCALE_CTYPE(locale)->__wcrtomb(s, wchar, &locale->wctomb)) == (size_t)-1) + return (-1); + return ((int)rval); +} +int +wctomb(char *s, wchar_t wchar) +{ + return wctomb_l(s, wchar, __get_locale()); +} diff --git a/lib/libc/locale/wctrans.3 b/lib/libc/locale/wctrans.3 new file mode 100644 index 0000000000000..ce3e68c9eb340 --- /dev/null +++ b/lib/libc/locale/wctrans.3 @@ -0,0 +1,122 @@ +.\" Copyright (c) 2002 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd October 3, 2002 +.Dt WCTRANS 3 +.Os +.Sh NAME +.Nm towctrans , wctrans +.Nd "wide character mapping functions" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wctype.h +.Ft wint_t +.Fn towctrans "wint_t wc" "wctrans_t desc" +.Ft wctrans_t +.Fn wctrans "const char *charclass" +.Sh DESCRIPTION +The +.Fn wctrans +function returns a value of type +.Vt wctrans_t +which represents the requested wide character mapping operation and +may be used as the second argument for calls to +.Fn towctrans . +.Pp +The following character mapping names are recognised: +.Bl -column -offset indent ".Li tolower" ".Li toupper" +.It Li "tolower toupper" +.El +.Pp +The +.Fn towctrans +function transliterates the wide character +.Fa wc +according to the mapping described by +.Fa desc . +.Sh RETURN VALUES +The +.Fn towctrans +function returns the transliterated character if successful, otherwise +it returns the character unchanged and sets +.Va errno . +.Pp +The +.Fn wctrans +function returns non-zero if successful, otherwise it returns zero +and sets +.Va errno . +.Sh EXAMPLES +Reimplement +.Fn towupper +in terms of +.Fn towctrans +and +.Fn wctrans : +.Bd -literal -offset indent +wint_t +mytowupper(wint_t wc) +{ + return (towctrans(wc, wctrans("toupper"))); +} +.Ed +.Sh ERRORS +The +.Fn towctrans +function will fail if: +.Bl -tag -width Er +.It Bq Er EINVAL +The supplied +.Fa desc +argument is invalid. +.El +.Pp +The +.Fn wctrans +function will fail if: +.Bl -tag -width Er +.It Bq Er EINVAL +The requested mapping name is invalid. +.El +.Sh SEE ALSO +.Xr tolower 3 , +.Xr toupper 3 , +.Xr wctype 3 +.Sh STANDARDS +The +.Fn towctrans +and +.Fn wctrans +functions conform to +.St -p1003.1-2001 . +.Sh HISTORY +The +.Fn towctrans +and +.Fn wctrans +functions first appeared in +.Fx 5.0 . diff --git a/lib/libc/locale/wctrans.c b/lib/libc/locale/wctrans.c new file mode 100644 index 0000000000000..36cb25e455491 --- /dev/null +++ b/lib/libc/locale/wctrans.c @@ -0,0 +1,103 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <errno.h> +#include <string.h> +#include <wctype.h> +#include "xlocale_private.h" + +enum { + _WCT_ERROR = 0, + _WCT_TOLOWER = 1, + _WCT_TOUPPER = 2 +}; + +wint_t +towctrans_l(wint_t wc, wctrans_t desc, locale_t locale) +{ + switch (desc) { + case _WCT_TOLOWER: + wc = towlower_l(wc, locale); + break; + case _WCT_TOUPPER: + wc = towupper_l(wc, locale); + break; + case _WCT_ERROR: + default: + errno = EINVAL; + break; + } + + return (wc); +} +wint_t +towctrans(wint_t wc, wctrans_t desc) +{ + return towctrans_l(wc, desc, __get_locale()); +} + +/* + * wctrans() calls this will a 0 locale. If this is ever modified to actually + * use the locale, wctrans() must be modified to call __get_locale(). + */ +wctrans_t +wctrans_l(const char *charclass, locale_t locale) +{ + struct { + const char *name; + wctrans_t trans; + } ccls[] = { + { "tolower", _WCT_TOLOWER }, + { "toupper", _WCT_TOUPPER }, + { NULL, _WCT_ERROR }, /* Default */ + }; + int i; + + i = 0; + while (ccls[i].name != NULL && strcmp(ccls[i].name, charclass) != 0) + i++; + + if (ccls[i].trans == _WCT_ERROR) + errno = EINVAL; + return (ccls[i].trans); +} + +wctrans_t +wctrans(const char *charclass) +{ + return wctrans_l(charclass, 0); +} + diff --git a/lib/libc/locale/wctype.3 b/lib/libc/locale/wctype.3 new file mode 100644 index 0000000000000..099631d517904 --- /dev/null +++ b/lib/libc/locale/wctype.3 @@ -0,0 +1,119 @@ +.\" Copyright (c) 2002 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd March 27, 2004 +.Dt WCTYPE 3 +.Os +.Sh NAME +.Nm iswctype , wctype +.Nd "wide character class functions" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wctype.h +.Ft int +.Fn iswctype "wint_t wc" "wctype_t charclass" +.Ft wctype_t +.Fn wctype "const char *property" +.Sh DESCRIPTION +The +.Fn wctype +function returns a value of type +.Vt wctype_t +which represents the requested wide character class and +may be used as the second argument for calls to +.Fn iswctype . +.Pp +The following character class names are recognised: +.Bl -column -offset indent ".Li alnum" ".Li cntrl" ".Li ideogram" ".Li print" ".Li space" +.It Li "alnum cntrl ideogram print space xdigit" +.It Li "alpha digit lower punct special" +.It Li "blank graph phonogram rune upper" +.El +.Pp +The +.Fn iswctype +function checks whether the wide character +.Fa wc +is in the character class +.Fa charclass . +.Sh RETURN VALUES +The +.Fn iswctype +function returns non-zero if and only if +.Fa wc +has the property described by +.Fa charclass , +or +.Fa charclass +is zero. +.Pp +The +.Fn wctype +function returns 0 if +.Fa property +is invalid, otherwise it returns a value of type +.Vt wctype_t +that can be used in subsequent calls to +.Fn iswctype . +.Sh EXAMPLES +Reimplement +.Xr iswalpha 3 +in terms of +.Fn iswctype +and +.Fn wctype : +.Bd -literal -offset indent +int +myiswalpha(wint_t wc) +{ + return (iswctype(wc, wctype("alpha"))); +} +.Ed +.Sh SEE ALSO +.Xr ctype 3 , +.Xr nextwctype 3 +.Sh STANDARDS +The +.Fn iswctype +and +.Fn wctype +functions conform to +.St -p1003.1-2001 . +The +.Dq Li ideogram , +.Dq Li phonogram , +.Dq Li special , +and +.Dq Li rune +character classes are extensions. +.Sh HISTORY +The +.Fn iswctype +and +.Fn wctype +functions first appeared in +.Fx 5.0 . diff --git a/lib/libc/locale/wctype.c b/lib/libc/locale/wctype.c new file mode 100644 index 0000000000000..dd8f218306b27 --- /dev/null +++ b/lib/libc/locale/wctype.c @@ -0,0 +1,117 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2002 Tim J. Robbins. + * All rights reserved. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <ctype.h> +#include <string.h> +#include <wctype.h> +#include <xlocale.h> + +#undef iswctype +int +iswctype(wint_t wc, wctype_t charclass) +{ + return (__istype(wc, charclass)); +} +int +iswctype_l(wint_t wc, wctype_t charclass, locale_t locale) +{ + return __istype_l(wc, charclass, locale); +} + +/* + * IMPORTANT: The 0 in the call to this function in wctype() must be changed to + * __get_locale() if wctype_l() is ever modified to actually use the locale + * parameter. + */ +wctype_t +wctype_l(const char *property, locale_t locale) +{ + const char *propnames = + "alnum\0" + "alpha\0" + "blank\0" + "cntrl\0" + "digit\0" + "graph\0" + "lower\0" + "print\0" + "punct\0" + "space\0" + "upper\0" + "xdigit\0" + "ideogram\0" /* BSD extension */ + "special\0" /* BSD extension */ + "phonogram\0" /* BSD extension */ + "number\0" /* BSD extension */ + "rune\0"; /* BSD extension */ + static const wctype_t propmasks[] = { + _CTYPE_A|_CTYPE_N, + _CTYPE_A, + _CTYPE_B, + _CTYPE_C, + _CTYPE_D, + _CTYPE_G, + _CTYPE_L, + _CTYPE_R, + _CTYPE_P, + _CTYPE_S, + _CTYPE_U, + _CTYPE_X, + _CTYPE_I, + _CTYPE_T, + _CTYPE_Q, + _CTYPE_N, + 0xFFFFFF00L + }; + size_t len1, len2; + const char *p; + const wctype_t *q; + + len1 = strlen(property); + q = propmasks; + for (p = propnames; (len2 = strlen(p)) != 0; p += len2 + 1) { + if (len1 == len2 && memcmp(property, p, len1) == 0) + return (*q); + q++; + } + + return (0UL); +} + +wctype_t wctype(const char *property) +{ + return wctype_l(property, 0); +} diff --git a/lib/libc/locale/wcwidth.3 b/lib/libc/locale/wcwidth.3 new file mode 100644 index 0000000000000..0c7c74fed155c --- /dev/null +++ b/lib/libc/locale/wcwidth.3 @@ -0,0 +1,87 @@ +.\" Copyright (c) 2002 Tim J. Robbins +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd August 17, 2004 +.Dt WCWIDTH 3 +.Os +.Sh NAME +.Nm wcwidth +.Nd "number of column positions of a wide-character code" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wchar.h +.Ft int +.Fn wcwidth "wchar_t wc" +.Sh DESCRIPTION +The +.Fn wcwidth +function determines the number of column positions required to +display the wide character +.Fa wc . +.Sh RETURN VALUES +The +.Fn wcwidth +function returns 0 if the +.Fa wc +argument is a null wide character (L'\e0'), +\-1 if +.Fa wc +is not printable, +otherwise it returns the number of column positions the +character occupies. +.Sh EXAMPLES +This code fragment reads text from standard input and +breaks lines that are more than 20 column positions wide, +similar to the +.Xr fold 1 +utility: +.Bd -literal -offset indent +wint_t ch; +int column, w; + +column = 0; +while ((ch = getwchar()) != WEOF) { + w = wcwidth(ch); + if (w > 0 && column + w >= 20) { + putwchar(L'\en'); + column = 0; + } + putwchar(ch); + if (ch == L'\en') + column = 0; + else if (w > 0) + column += w; +} +.Ed +.Sh SEE ALSO +.Xr iswprint 3 , +.Xr wcswidth 3 +.Sh STANDARDS +The +.Fn wcwidth +function conforms to +.St -p1003.1-2001 . diff --git a/lib/libc/locale/wcwidth.c b/lib/libc/locale/wcwidth.c new file mode 100644 index 0000000000000..53145e9b82543 --- /dev/null +++ b/lib/libc/locale/wcwidth.c @@ -0,0 +1,63 @@ +/*- + * SPDX-License-Identifier: BSD-3-Clause + * + * Copyright (c) 1989, 1993 + * The Regents of the University of California. All rights reserved. + * (c) UNIX System Laboratories, Inc. + * All or some portions of this file are derived from material licensed + * to the University of California by American Telephone and Telegraph + * Co. or Unix System Laboratories, Inc. and are reproduced herein with + * the permission of UNIX System Laboratories, Inc. + * + * This code is derived from software contributed to Berkeley by + * Paul Borman at Krystal Technologies. + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * Portions of this software were developed by David Chisnall + * under sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <wchar.h> +#include <wctype.h> +#include <xlocale.h> + +#undef wcwidth + +int +wcwidth(wchar_t wc) +{ + return (__wcwidth(wc)); +} +int +wcwidth_l(wchar_t wc, locale_t locale) +{ + return (__wcwidth_l(wc, locale)); +} diff --git a/lib/libc/locale/xlocale.3 b/lib/libc/locale/xlocale.3 new file mode 100644 index 0000000000000..da217c601100a --- /dev/null +++ b/lib/libc/locale/xlocale.3 @@ -0,0 +1,280 @@ +.\" Copyright (c) 2011 The FreeBSD Foundation +.\" All rights reserved. +.\" +.\" This documentation was written by David Chisnall under sponsorship from +.\" the FreeBSD Foundation. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd September 17, 2011 +.Dt XLOCALE 3 +.Os +.Sh NAME +.Nm xlocale +.Nd Thread-safe extended locale support +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In xlocale.h +.Sh DESCRIPTION +The extended locale support includes a set of functions for setting +thread-local locales, +as well convenience functions for performing locale-aware +calls with a specified locale. +.Pp +The core of the xlocale API is the +.Fa locale_t +type. +This is an opaque type encapsulating a locale. +Instances of this can be either set as the locale for a specific thread or +passed directly to the +.Fa _l +suffixed variants of various standard C functions. +Two special +.Fa locale_t +values are available: +.Bl -bullet -offset indent +.It +NULL refers to the current locale for the thread, +or to the global locale if no locale has been set for this thread. +.It +LC_GLOBAL_LOCALE refers to the global locale. +.El +.Pp +The global locale is the locale set with the +.Xr setlocale 3 +function. +.Sh SEE ALSO +.Xr duplocale 3 , +.Xr freelocale 3 , +.Xr localeconv 3 , +.Xr newlocale 3 , +.Xr querylocale 3 , +.Xr uselocale 3 +.Sh CONVENIENCE FUNCTIONS +The xlocale API includes a number of +.Fa _l +suffixed convenience functions. +These are variants of standard C functions +that have been modified to take an explicit +.Fa locale_t +parameter as the final argument or, in the case of variadic functions, +as an additional argument directly before the format string. +Each of these functions accepts either NULL or LC_GLOBAL_LOCALE. +In these functions, NULL refers to the C locale, +rather than the thread's current locale. +If you wish to use the thread's current locale, +then use the unsuffixed version of the function. +.Pp +These functions are exposed by including +.In xlocale.h +.Em after +including the relevant headers for the standard variant. +For example, the +.Xr strtol_l 3 +function is exposed by including +.In xlocale.h +after +.In stdlib.h , +which defines +.Xr strtol 3 . +.Pp +For reference, +a complete list of the locale-aware functions that are available in this form, +along with the headers that expose them, is provided here: +.Bl -tag -width "<monetary.h> " +.It In wctype.h +.Xr iswalnum_l 3 , +.Xr iswalpha_l 3 , +.Xr iswcntrl_l 3 , +.Xr iswctype_l 3 , +.Xr iswdigit_l 3 , +.Xr iswgraph_l 3 , +.Xr iswlower_l 3 , +.Xr iswprint_l 3 , +.Xr iswpunct_l 3 , +.Xr iswspace_l 3 , +.Xr iswupper_l 3 , +.Xr iswxdigit_l 3 , +.Xr towlower_l 3 , +.Xr towupper_l 3 , +.Xr wctype_l 3 , +.It In ctype.h +.Xr digittoint_l 3 , +.Xr isalnum_l 3 , +.Xr isalpha_l 3 , +.Xr isblank_l 3 , +.Xr iscntrl_l 3 , +.Xr isdigit_l 3 , +.Xr isgraph_l 3 , +.Xr ishexnumber_l 3 , +.Xr isideogram_l 3 , +.Xr islower_l 3 , +.Xr isnumber_l 3 , +.Xr isphonogram_l 3 , +.Xr isprint_l 3 , +.Xr ispunct_l 3 , +.Xr isrune_l 3 , +.Xr isspace_l 3 , +.Xr isspecial_l 3 , +.Xr isupper_l 3 , +.Xr isxdigit_l 3 , +.Xr tolower_l 3 , +.Xr toupper_l 3 +.It In inttypes.h +.Xr strtoimax_l 3 , +.Xr strtoumax_l 3 , +.Xr wcstoimax_l 3 , +.Xr wcstoumax_l 3 +.It In langinfo.h +.Xr nl_langinfo_l 3 +.It In monetary.h +.Xr strfmon_l 3 +.It In stdio.h +.Xr asprintf_l 3 , +.Xr fprintf_l 3 , +.Xr fscanf_l 3 , +.Xr printf_l 3 , +.Xr scanf_l 3 , +.Xr snprintf_l 3 , +.Xr sprintf_l 3 , +.Xr sscanf_l 3 , +.Xr vasprintf_l 3 , +.Xr vfprintf_l 3 , +.Xr vfscanf_l 3 , +.Xr vprintf_l 3 , +.Xr vscanf_l 3 , +.Xr vsnprintf_l 3 , +.Xr vsprintf_l 3 , +.Xr vsscanf_l 3 +.It In stdlib.h +.Xr atof_l 3 , +.Xr atoi_l 3 , +.Xr atol_l 3 , +.Xr atoll_l 3 , +.Xr mblen_l 3 , +.Xr mbstowcs_l 3 , +.Xr mbtowc_l 3 , +.Xr strtod_l 3 , +.Xr strtof_l 3 , +.Xr strtol_l 3 , +.Xr strtold_l 3 , +.Xr strtoll_l 3 , +.Xr strtoq_l 3 , +.Xr strtoul_l 3 , +.Xr strtoull_l 3 , +.Xr strtouq_l 3 , +.Xr wcstombs_l 3 , +.Xr wctomb_l 3 +.It In string.h +.Xr strcoll_l 3 , +.Xr strxfrm_l 3 , +.Xr strcasecmp_l 3 , +.Xr strcasestr_l 3 , +.Xr strncasecmp_l 3 +.It In time.h +.Xr strftime_l 3 +.Xr strptime_l 3 +.It In wchar.h +.Xr btowc_l 3 , +.Xr fgetwc_l 3 , +.Xr fgetws_l 3 , +.Xr fputwc_l 3 , +.Xr fputws_l 3 , +.Xr fwprintf_l 3 , +.Xr fwscanf_l 3 , +.Xr getwc_l 3 , +.Xr getwchar_l 3 , +.Xr mbrlen_l 3 , +.Xr mbrtowc_l 3 , +.Xr mbsinit_l 3 , +.Xr mbsnrtowcs_l 3 , +.Xr mbsrtowcs_l 3 , +.Xr putwc_l 3 , +.Xr putwchar_l 3 , +.Xr swprintf_l 3 , +.Xr swscanf_l 3 , +.Xr ungetwc_l 3 , +.Xr vfwprintf_l 3 , +.Xr vfwscanf_l 3 , +.Xr vswprintf_l 3 , +.Xr vswscanf_l 3 , +.Xr vwprintf_l 3 , +.Xr vwscanf_l 3 , +.Xr wcrtomb_l 3 , +.Xr wcscoll_l 3 , +.Xr wcsftime_l 3 , +.Xr wcsnrtombs_l 3 , +.Xr wcsrtombs_l 3 , +.Xr wcstod_l 3 , +.Xr wcstof_l 3 , +.Xr wcstol_l 3 , +.Xr wcstold_l 3 , +.Xr wcstoll_l 3 , +.Xr wcstoul_l 3 , +.Xr wcstoull_l 3 , +.Xr wcswidth_l 3 , +.Xr wcsxfrm_l 3 , +.Xr wctob_l 3 , +.Xr wcwidth_l 3 , +.Xr wprintf_l 3 , +.Xr wscanf_l 3 +.It In wctype.h +.Xr iswblank_l 3 , +.Xr iswhexnumber_l 3 , +.Xr iswideogram_l 3 , +.Xr iswnumber_l 3 , +.Xr iswphonogram_l 3 , +.Xr iswrune_l 3 , +.Xr iswspecial_l 3 , +.Xr nextwctype_l 3 , +.Xr towctrans_l 3 , +.Xr wctrans_l 3 +.It In xlocale.h +.Xr localeconv_l 3 +.El +.Sh STANDARDS +The functions +conform to +.St -p1003.1-2008 . +.Sh HISTORY +The xlocale APIs first appeared in Darwin 8.0. +This implementation was written by David Chisnall, +under sponsorship from the FreeBSD Foundation and first appeared in +.Fx 9.1 . +.Sh CAVEATS +The +.Xr setlocale 3 +function, and others in the family, refer to the global locale. +Other functions that depend on the locale, however, +will take the thread-local locale if one has been set. +This means that the idiom of setting the locale using +.Xr setlocale 3 , +calling a locale-dependent function, +and then restoring the locale will not +have the expected behavior if the current thread has had a locale set using +.Xr uselocale 3 . +You should avoid this idiom and prefer to use the +.Fa _l +suffixed versions instead. diff --git a/lib/libc/locale/xlocale.c b/lib/libc/locale/xlocale.c new file mode 100644 index 0000000000000..87160b26b4dc9 --- /dev/null +++ b/lib/libc/locale/xlocale.c @@ -0,0 +1,369 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * + * This software was developed by David Chisnall under sponsorship from + * the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#include <pthread.h> +#include <stdio.h> +#include <string.h> +#include <runetype.h> +#include "libc_private.h" +#include "xlocale_private.h" + +/** + * Each locale loader declares a global component. This is used by setlocale() + * and also by xlocale with LC_GLOBAL_LOCALE.. + */ +extern struct xlocale_component __xlocale_global_collate; +extern struct xlocale_component __xlocale_global_ctype; +extern struct xlocale_component __xlocale_global_monetary; +extern struct xlocale_component __xlocale_global_numeric; +extern struct xlocale_component __xlocale_global_time; +extern struct xlocale_component __xlocale_global_messages; +/* + * And another version for the statically-allocated C locale. We only have + * components for the parts that are expected to be sensible. + */ +extern struct xlocale_component __xlocale_C_collate; +extern struct xlocale_component __xlocale_C_ctype; + +#ifndef __NO_TLS +/* + * The locale for this thread. + */ +_Thread_local locale_t __thread_locale; +#endif +/* + * Flag indicating that one or more per-thread locales exist. + */ +int __has_thread_locale; +/* + * Private functions in setlocale.c. + */ +const char * +__get_locale_env(int category); +int +__detect_path_locale(void); + +struct _xlocale __xlocale_global_locale = { + {0}, + { + &__xlocale_global_collate, + &__xlocale_global_ctype, + &__xlocale_global_monetary, + &__xlocale_global_numeric, + &__xlocale_global_time, + &__xlocale_global_messages + }, + 1, + 0, + 1, + 0 +}; + +struct _xlocale __xlocale_C_locale = { + {0}, + { + &__xlocale_C_collate, + &__xlocale_C_ctype, + 0, 0, 0, 0 + }, + 1, + 0, + 1, + 0 +}; + +static void*(*constructors[])(const char*, locale_t) = +{ + __collate_load, + __ctype_load, + __monetary_load, + __numeric_load, + __time_load, + __messages_load +}; + +static pthread_key_t locale_info_key; +static int fake_tls; +static locale_t thread_local_locale; + +static void init_key(void) +{ + + pthread_key_create(&locale_info_key, xlocale_release); + pthread_setspecific(locale_info_key, (void*)42); + if (pthread_getspecific(locale_info_key) == (void*)42) { + pthread_setspecific(locale_info_key, 0); + } else { + fake_tls = 1; + } + /* At least one per-thread locale has now been set. */ + __has_thread_locale = 1; + __detect_path_locale(); +} + +static pthread_once_t once_control = PTHREAD_ONCE_INIT; + +static locale_t +get_thread_locale(void) +{ + + _once(&once_control, init_key); + + return (fake_tls ? thread_local_locale : + pthread_getspecific(locale_info_key)); +} + +#ifdef __NO_TLS +locale_t +__get_locale(void) +{ + locale_t l = get_thread_locale(); + return (l ? l : &__xlocale_global_locale); + +} +#endif + +static void +set_thread_locale(locale_t loc) +{ + locale_t l = (loc == LC_GLOBAL_LOCALE) ? 0 : loc; + + _once(&once_control, init_key); + + if (NULL != l) { + xlocale_retain((struct xlocale_refcounted*)l); + } + locale_t old = pthread_getspecific(locale_info_key); + if ((NULL != old) && (l != old)) { + xlocale_release((struct xlocale_refcounted*)old); + } + if (fake_tls) { + thread_local_locale = l; + } else { + pthread_setspecific(locale_info_key, l); + } +#ifndef __NO_TLS + __thread_locale = l; + __set_thread_rune_locale(loc); +#endif +} + +/** + * Clean up a locale, once its reference count reaches zero. This function is + * called by xlocale_release(), it should not be called directly. + */ +static void +destruct_locale(void *l) +{ + locale_t loc = l; + + for (int type=0 ; type<XLC_LAST ; type++) { + if (loc->components[type]) { + xlocale_release(loc->components[type]); + } + } + if (loc->csym) { + free(loc->csym); + } + free(l); +} + +/** + * Allocates a new, uninitialised, locale. + */ +static locale_t +alloc_locale(void) +{ + locale_t new = calloc(sizeof(struct _xlocale), 1); + + new->header.destructor = destruct_locale; + new->monetary_locale_changed = 1; + new->numeric_locale_changed = 1; + return (new); +} +static void +copyflags(locale_t new, locale_t old) +{ + new->using_monetary_locale = old->using_monetary_locale; + new->using_numeric_locale = old->using_numeric_locale; + new->using_time_locale = old->using_time_locale; + new->using_messages_locale = old->using_messages_locale; +} + +static int dupcomponent(int type, locale_t base, locale_t new) +{ + /* Always copy from the global locale, since it has mutable components. + */ + struct xlocale_component *src = base->components[type]; + + if (&__xlocale_global_locale == base) { + new->components[type] = constructors[type](src->locale, new); + if (new->components[type]) { + strncpy(new->components[type]->locale, src->locale, + ENCODING_LEN); + } + } else if (base->components[type]) { + new->components[type] = xlocale_retain(base->components[type]); + } else { + /* If the component was NULL, return success - if base is a + * valid locale then the flag indicating that this isn't + * present should be set. If it isn't a valid locale, then + * we're stuck anyway. */ + return 1; + } + return (0 != new->components[type]); +} + +/* + * Public interfaces. These are the five public functions described by the + * xlocale interface. + */ + +locale_t newlocale(int mask, const char *locale, locale_t base) +{ + int type; + const char *realLocale = locale; + int useenv = 0; + int success = 1; + + _once(&once_control, init_key); + + locale_t new = alloc_locale(); + if (NULL == new) { + return (NULL); + } + + FIX_LOCALE(base); + copyflags(new, base); + + if (NULL == locale) { + realLocale = "C"; + } else if ('\0' == locale[0]) { + useenv = 1; + } + + for (type=0 ; type<XLC_LAST ; type++) { + if (mask & 1) { + if (useenv) { + realLocale = __get_locale_env(type + 1); + } + new->components[type] = + constructors[type](realLocale, new); + if (new->components[type]) { + strncpy(new->components[type]->locale, + realLocale, ENCODING_LEN); + } else { + success = 0; + break; + } + } else { + if (!dupcomponent(type, base, new)) { + success = 0; + break; + } + } + mask >>= 1; + } + if (0 == success) { + xlocale_release(new); + new = NULL; + } + + return (new); +} + +locale_t duplocale(locale_t base) +{ + locale_t new = alloc_locale(); + int type; + + _once(&once_control, init_key); + + if (NULL == new) { + return (NULL); + } + + FIX_LOCALE(base); + copyflags(new, base); + + for (type=0 ; type<XLC_LAST ; type++) { + dupcomponent(type, base, new); + } + + return (new); +} + +/* + * Free a locale_t. This is quite a poorly named function. It actually + * disclaims a reference to a locale_t, rather than freeing it. + */ +void +freelocale(locale_t loc) +{ + + /* + * Fail if we're passed something that isn't a locale. If we're + * passed the global locale, pretend that we freed it but don't + * actually do anything. + */ + if (loc != NULL && loc != LC_GLOBAL_LOCALE && + loc != &__xlocale_global_locale) + xlocale_release(loc); +} + +/* + * Returns the name of the locale for a particular component of a locale_t. + */ +const char *querylocale(int mask, locale_t loc) +{ + int type = ffs(mask) - 1; + FIX_LOCALE(loc); + if (type >= XLC_LAST) + return (NULL); + if (loc->components[type]) + return (loc->components[type]->locale); + return ("C"); +} + +/* + * Installs the specified locale_t as this thread's locale. + */ +locale_t uselocale(locale_t loc) +{ + locale_t old = get_thread_locale(); + if (NULL != loc) { + set_thread_locale(loc); + } + return (old ? old : LC_GLOBAL_LOCALE); +} + diff --git a/lib/libc/locale/xlocale_private.h b/lib/libc/locale/xlocale_private.h new file mode 100644 index 0000000000000..9aa4d86c87caf --- /dev/null +++ b/lib/libc/locale/xlocale_private.h @@ -0,0 +1,254 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2011 The FreeBSD Foundation + * All rights reserved. + * + * This software was developed by David Chisnall under sponsorship from + * the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _XLOCALE_PRIVATE__H_ +#define _XLOCALE_PRIVATE__H_ + +#include <xlocale.h> +#include <locale.h> +#include <stdlib.h> +#include <stdint.h> +#include <sys/types.h> +#include <machine/atomic.h> +#include "setlocale.h" + +/** + * The XLC_ values are indexes into the components array. They are defined in + * the same order as the LC_ values in locale.h, but without the LC_ALL zero + * value. Translating from LC_X to XLC_X is done by subtracting one. + * + * Any reordering of this enum should ensure that these invariants are not + * violated. + */ +enum { + XLC_COLLATE = 0, + XLC_CTYPE, + XLC_MONETARY, + XLC_NUMERIC, + XLC_TIME, + XLC_MESSAGES, + XLC_LAST +}; + +_Static_assert(XLC_LAST - XLC_COLLATE == 6, "XLC values should be contiguous"); +_Static_assert(XLC_COLLATE == LC_COLLATE - 1, + "XLC_COLLATE doesn't match the LC_COLLATE value."); +_Static_assert(XLC_CTYPE == LC_CTYPE - 1, + "XLC_CTYPE doesn't match the LC_CTYPE value."); +_Static_assert(XLC_MONETARY == LC_MONETARY - 1, + "XLC_MONETARY doesn't match the LC_MONETARY value."); +_Static_assert(XLC_NUMERIC == LC_NUMERIC - 1, + "XLC_NUMERIC doesn't match the LC_NUMERIC value."); +_Static_assert(XLC_TIME == LC_TIME - 1, + "XLC_TIME doesn't match the LC_TIME value."); +_Static_assert(XLC_MESSAGES == LC_MESSAGES - 1, + "XLC_MESSAGES doesn't match the LC_MESSAGES value."); + +/** + * Header used for objects that are reference counted. Objects may optionally + * have a destructor associated, which is responsible for destroying the + * structure. Global / static versions of the structure should have no + * destructor set - they can then have their reference counts manipulated as + * normal, but will not do anything with them. + * + * The header stores a retain count - objects are assumed to have a reference + * count of 1 when they are created, but the retain count is 0. When the + * retain count is less than 0, they are freed. + */ +struct xlocale_refcounted { + /** Number of references to this component. */ + long retain_count; + /** Function used to destroy this component, if one is required*/ + void(*destructor)(void*); +}; +/** + * Header for a locale component. All locale components must begin with this + * header. + */ +struct xlocale_component { + struct xlocale_refcounted header; + /** Name of the locale used for this component. */ + char locale[ENCODING_LEN+1]; +}; + +/** + * xlocale structure, stores per-thread locale information. + */ +struct _xlocale { + struct xlocale_refcounted header; + /** Components for the locale. */ + struct xlocale_component *components[XLC_LAST]; + /** Flag indicating if components[XLC_MONETARY] has changed since the + * last call to localeconv_l() with this locale. */ + int monetary_locale_changed; + /** Flag indicating whether this locale is actually using a locale for + * LC_MONETARY (1), or if it should use the C default instead (0). */ + int using_monetary_locale; + /** Flag indicating if components[XLC_NUMERIC] has changed since the + * last call to localeconv_l() with this locale. */ + int numeric_locale_changed; + /** Flag indicating whether this locale is actually using a locale for + * LC_NUMERIC (1), or if it should use the C default instead (0). */ + int using_numeric_locale; + /** Flag indicating whether this locale is actually using a locale for + * LC_TIME (1), or if it should use the C default instead (0). */ + int using_time_locale; + /** Flag indicating whether this locale is actually using a locale for + * LC_MESSAGES (1), or if it should use the C default instead (0). */ + int using_messages_locale; + /** The structure to be returned from localeconv_l() for this locale. */ + struct lconv lconv; + /** Persistent state used by mblen() calls. */ + __mbstate_t mblen; + /** Persistent state used by mbrlen() calls. */ + __mbstate_t mbrlen; + /** Persistent state used by mbrtoc16() calls. */ + __mbstate_t mbrtoc16; + /** Persistent state used by mbrtoc32() calls. */ + __mbstate_t mbrtoc32; + /** Persistent state used by mbrtowc() calls. */ + __mbstate_t mbrtowc; + /** Persistent state used by mbsnrtowcs() calls. */ + __mbstate_t mbsnrtowcs; + /** Persistent state used by mbsrtowcs() calls. */ + __mbstate_t mbsrtowcs; + /** Persistent state used by mbtowc() calls. */ + __mbstate_t mbtowc; + /** Persistent state used by c16rtomb() calls. */ + __mbstate_t c16rtomb; + /** Persistent state used by c32rtomb() calls. */ + __mbstate_t c32rtomb; + /** Persistent state used by wcrtomb() calls. */ + __mbstate_t wcrtomb; + /** Persistent state used by wcsnrtombs() calls. */ + __mbstate_t wcsnrtombs; + /** Persistent state used by wcsrtombs() calls. */ + __mbstate_t wcsrtombs; + /** Persistent state used by wctomb() calls. */ + __mbstate_t wctomb; + /** Buffer used by nl_langinfo_l() */ + char *csym; +}; + +/** + * Increments the reference count of a reference-counted structure. + */ +__attribute__((unused)) static void* +xlocale_retain(void *val) +{ + struct xlocale_refcounted *obj = val; + atomic_add_long(&(obj->retain_count), 1); + return (val); +} +/** + * Decrements the reference count of a reference-counted structure, freeing it + * if this is the last reference, calling its destructor if it has one. + */ +__attribute__((unused)) static void +xlocale_release(void *val) +{ + struct xlocale_refcounted *obj = val; + long count; + + count = atomic_fetchadd_long(&(obj->retain_count), -1) - 1; + if (count < 0 && obj->destructor != NULL) + obj->destructor(obj); +} + +/** + * Load functions. Each takes the name of a locale and a pointer to the data + * to be initialised as arguments. Two special values are allowed for the + */ +extern void* __collate_load(const char*, locale_t); +extern void* __ctype_load(const char*, locale_t); +extern void* __messages_load(const char*, locale_t); +extern void* __monetary_load(const char*, locale_t); +extern void* __numeric_load(const char*, locale_t); +extern void* __time_load(const char*, locale_t); + +extern struct _xlocale __xlocale_global_locale; +extern struct _xlocale __xlocale_C_locale; + +/** + * Caches the rune table in TLS for fast access. + */ +void __set_thread_rune_locale(locale_t loc); +/** + * Flag indicating whether a per-thread locale has been set. If no per-thread + * locale has ever been set, then we always use the global locale. + */ +extern int __has_thread_locale; +#ifndef __NO_TLS +/** + * The per-thread locale. Avoids the need to use pthread lookup functions when + * getting the per-thread locale. + */ +extern _Thread_local locale_t __thread_locale; + +/** + * Returns the current locale for this thread, or the global locale if none is + * set. The caller does not have to free the locale. The return value from + * this call is not guaranteed to remain valid after the locale changes. As + * such, this should only be called within libc functions. + */ +static inline locale_t __get_locale(void) +{ + + if (!__has_thread_locale) { + return (&__xlocale_global_locale); + } + return (__thread_locale ? __thread_locale : &__xlocale_global_locale); +} +#else +locale_t __get_locale(void); +#endif + +/** + * Two magic values are allowed for locale_t objects. NULL and -1. This + * function maps those to the real locales that they represent. + */ +static inline locale_t get_real_locale(locale_t locale) +{ + switch ((intptr_t)locale) { + case 0: return (&__xlocale_C_locale); + case -1: return (&__xlocale_global_locale); + default: return (locale); + } +} + +/** + * Replace a placeholder locale with the real global or thread-local locale_t. + */ +#define FIX_LOCALE(l) (l = get_real_locale(l)) + +#endif |