Issue28604
Created on 2016-11-03 21:26 by Guillaume Pasquet (Etenil), last changed 2018-11-28 16:52 by vstinner. This issue is now closed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 10606 | merged | vstinner, 2018-11-20 12:36 | |
| PR 10619 | merged | vstinner, 2018-11-20 20:14 | |
| PR 10621 | merged | vstinner, 2018-11-20 21:08 | |
| Messages (11) | |||
|---|---|---|---|
| msg280023 - (view) | Author: Guillaume Pasquet (Etenil) (Guillaume Pasquet (Etenil)) | Date: 2016-11-03 21:26 | |
This issue was originally reported on Fedora's Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1391280

Description of problem:
After switching the monetary locale to en_GB, python then raises an exception when calling locale.localeconv()

Version-Release number of selected component (if applicable): 3.5.2-4.fc25

How reproducible: Every time

Steps to Reproduce:
1. Write a python3 script or open the interactive interpreter with "python3"
2. Enter the following
import locale
locale.setlocale(locale.LC_MONETARY, 'en_GB')
locale.localeconv()
3. Observe that python raises an encoding exception

Actual results:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.5/locale.py", line 110, in localeconv
    d = _localeconv()
UnicodeDecodeError: 'locale' codec can't decode byte 0xa3 in position 0: Invalid or incomplete multibyte or wide character

Expected results:
A dictionary of locale data similar to (for en_US):
{'mon_thousands_sep': ',', 'currency_symbol': '$', 'negative_sign': '-', 'p_sep_by_space': 0, 'frac_digits': 2, 'int_frac_digits': 2, 'decimal_point': '.', 'mon_decimal_point': '.', 'positive_sign': '', 'p_cs_precedes': 1, 'p_sign_posn': 1, 'mon_grouping': [3, 3, 0], 'n_cs_precedes': 1, 'n_sign_posn': 1, 'grouping': [3, 3, 0], 'thousands_sep': ',', 'int_curr_symbol': 'USD ', 'n_sep_by_space': 0}

Note: This was reproduced on Linux Mint 18 (python 3.5.2), and also on Fedora with python 3.4 and python 3.6 (compiled). |
|||
| msg280028 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2016-11-03 22:21 | |
I suspect this issue is similar to issue25812. en_GB has a non-UTF-8 encoding (likely iso8859-1). The currency symbol £ is encoded as b'\xa3' in this encoding. But Python tries to decode b'\xa3' with an encoding determined by another locale setting (LC_CTYPE). |
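The mismatch described in this message can be reproduced without changing the process locale. The following is a minimal sketch, assuming that en_GB uses ISO 8859-1 (which the message only describes as likely) and that LC_CTYPE selects UTF-8; the variable names are illustrative and not taken from CPython.

```python
# Byte the C library would return for the en_GB currency symbol,
# assuming en_GB uses ISO 8859-1 (an assumption, see above).
raw_symbol = b"\xa3"

# Decoding with the LC_MONETARY encoding yields the pound sign.
pound = raw_symbol.decode("iso8859-1")

# Decoding with a UTF-8 LC_CTYPE encoding fails, because 0xa3 on its
# own is a continuation byte and cannot start a UTF-8 sequence.
try:
    raw_symbol.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

print(ascii(pound))
print(utf8_error)
```

The real failure comes from the 'locale' codec, which calls into the C library rather than Python's own UTF-8 decoder, so the error message differs from the one in the report.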
|||
| msg303419 - (view) | Author: Andreas Schwab (schwab) * | Date: 2017-09-30 19:24 | |
This causes test_float.py to fail with glibc > 2.26.
ERROR: test_float_with_comma (__main__.GeneralFloatCases)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/abuild/rpmbuild/BUILD/Python-3.6.2/Lib/test/support/__init__.py", line 1590, in inner
    return func(*args, **kwds)
  File "Lib/test/test_float.py", line 150, in test_float_with_comma
    if not locale.localeconv()['decimal_point'] == ',':
  File "/home/abuild/rpmbuild/BUILD/Python-3.6.2/Lib/locale.py", line 110, in localeconv
    d = _localeconv()
UnicodeDecodeError: 'locale' codec can't decode byte 0xa0 in position 0: Invalid or incomplete multibyte or wide character
---------------------------------------------------------------------- |
|||
| msg330128 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-11-20 12:27 | |
Example of the bug:
import locale
# LC_CTYPE: latin1 encoding
locale.setlocale(locale.LC_ALL, "en_GB")
# LC_MONETARY: utf8 encoding
locale.setlocale(locale.LC_MONETARY, "ar_SA.UTF-8")
lc = locale.localeconv()
for attr in (
    "mon_grouping",
    "int_curr_symbol",
    "currency_symbol",
    "mon_decimal_point",
    "mon_thousands_sep",
):
    print(f"{attr}: {lc[attr]!a}")
Python 3.7 output:
mon_grouping: []
int_curr_symbol: 'SAR '
currency_symbol: '\xd8\xb1.\xd8\xb3'
mon_decimal_point: '.'
mon_thousands_sep: ''
Expected output:
mon_grouping: []
int_curr_symbol: 'SAR '
currency_symbol: '\u0631.\u0633'
mon_decimal_point: '.'
mon_thousands_sep: ''
Tested on Fedora 29.
|
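The difference between the Python 3.7 output and the expected output for currency_symbol is consistent with UTF-8 bytes being decoded as latin1. A short sketch of that relationship, using only the two strings shown in this message (the variable names are illustrative):

```python
# Expected currency symbol for ar_SA.UTF-8, as given in the message.
expected = "\u0631.\u0633"

# The C library returns the symbol encoded in the LC_MONETARY encoding (UTF-8).
utf8_bytes = expected.encode("utf-8")

# Decoding those bytes with the LC_CTYPE encoding (latin1) never fails,
# because every byte is valid latin1, but it yields mojibake.
buggy = utf8_bytes.decode("latin-1")
print(ascii(buggy))  # '\xd8\xb1.\xd8\xb3'

# The mojibake can be undone by reversing the wrong decoding step.
repaired = buggy.encode("latin-1").decode("utf-8")
print(ascii(repaired))  # '\u0631.\u0633'
```

This round trip only works when the wrong codec is a single-byte encoding such as latin1 that maps every byte value; it is shown here to explain the output, not as a recommended fix.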
|||
| msg330129 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-11-20 12:47 | |
See also bpo-33954: float.__format__('n') fails with _PyUnicode_CheckConsistency assertion error for locales with non-ascii thousands separator. It may be nice to fix these two bugs at the same time, since they are related :-) |
|||
| msg330131 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-11-20 14:10 | |
I tested manually PR 10606:
LC_ALL= LC_CTYPE=xxx LC_MONETARY=xxx ./python -c 'import locale; locale.setlocale(locale.LC_ALL, ""); print(ascii(locale.localeconv()["currency_symbol"]))'
'\xa3'
Result (bug = result/error without the fix):
* LC_CTYPE=en_GB, LC_MONETARY=ar_SA.UTF-8: currency_symbol='\u0631.\u0633' (bug: '\xd8\xb1.\xd8\xb3')
* LC_CTYPE=en_GB, LC_MONETARY=fr_FR.UTF-8: currency_symbol='\u20ac' (bug: '\xe2\x82\xac')
* LC_CTYPE=en_GB, LC_MONETARY=uk_UA.koi8u: currency_symbol='\u0433\u0440\u043d.' (bug: '\xc7\xd2\xce.')
* LC_CTYPE=fr_FR.UTF-8, LC_MONETARY=uk_UA.koi8u: currency_symbol='\u0433\u0440\u043d.' (bug: UnicodeDecodeError)
Locale encodings:
* en_GB: latin1
* ar_SA.UTF-8: utf8
* fr_FR.UTF-8: utf8
* uk_UA.koi8u: KOI8-U |
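The four "bug" results listed in this message can be modelled with Python's own codecs, without installing any of the locales. This is only a sketch: the helper decode_like_unfixed is hypothetical, and Python's latin-1, utf-8 and koi8_u codecs stand in for the C library's locale decoding, which may report errors differently.

```python
def decode_like_unfixed(symbol, monetary_encoding, ctype_encoding):
    """Hypothetical model of the unfixed behaviour: the symbol is stored
    in the LC_MONETARY encoding but decoded with the LC_CTYPE encoding."""
    raw = symbol.encode(monetary_encoding)
    try:
        return raw.decode(ctype_encoding)
    except UnicodeDecodeError:
        return "UnicodeDecodeError"

# (LC_CTYPE encoding, LC_MONETARY encoding, correct currency symbol),
# taken from the results and encodings listed in the message.
cases = [
    ("latin-1", "utf-8", "\u0631.\u0633"),        # en_GB / ar_SA.UTF-8
    ("latin-1", "utf-8", "\u20ac"),               # en_GB / fr_FR.UTF-8
    ("latin-1", "koi8_u", "\u0433\u0440\u043d."), # en_GB / uk_UA.koi8u
    ("utf-8", "koi8_u", "\u0433\u0440\u043d."),   # fr_FR.UTF-8 / uk_UA.koi8u
]

results = [
    decode_like_unfixed(symbol, monetary, ctype)
    for ctype, monetary, symbol in cases
]
for result in results:
    print(ascii(result))
```

A latin1 LC_CTYPE never raises because every byte value is valid in it, which is why only the UTF-8 LC_CTYPE case surfaces as an exception rather than as silently wrong text.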
|||
| msg330132 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-11-20 15:20 | |
New changeset 02e6bf7f2025cddcbde6432f6b6396198ab313f4 by Victor Stinner in branch 'master': bpo-28604: Fix localeconv() for different LC_MONETARY (GH-10606) https://github.com/python/cpython/commit/02e6bf7f2025cddcbde6432f6b6396198ab313f4 |
|||
| msg330153 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-11-20 21:06 | |
New changeset 6eff6b8eecd7a8eccad16419269fa18ec820922e by Victor Stinner in branch '3.7': bpo-28604: Fix localeconv() for different LC_MONETARY (GH-10606) (GH-10619) https://github.com/python/cpython/commit/6eff6b8eecd7a8eccad16419269fa18ec820922e |
|||
| msg330155 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-11-20 21:36 | |
New changeset df3051b53fd7f2862a4087f5449e811d8421347a by Victor Stinner in branch '3.6': bpo-28604: Fix localeconv() for different LC_MONETARY (GH-10606) (GH-10619) (GH-10621) https://github.com/python/cpython/commit/df3051b53fd7f2862a4087f5449e811d8421347a |
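For interpreters that predate these changesets, one possible workaround is to make LC_CTYPE match LC_MONETARY for the duration of the localeconv() call. The sketch below is an assumption-laden illustration, not the code from the pull requests: the helper names are hypothetical, it is not thread-safe (setlocale affects the whole process), and the numeric fields are then also decoded with the LC_MONETARY encoding.

```python
import locale
from contextlib import contextmanager


@contextmanager
def ctype_matching_monetary():
    """Temporarily switch LC_CTYPE to the current LC_MONETARY locale."""
    # Calling setlocale() with only a category queries the current setting.
    saved_ctype = locale.setlocale(locale.LC_CTYPE)
    monetary = locale.setlocale(locale.LC_MONETARY)
    try:
        if monetary != saved_ctype:
            locale.setlocale(locale.LC_CTYPE, monetary)
        yield
    finally:
        # Always restore the original LC_CTYPE, even on error.
        locale.setlocale(locale.LC_CTYPE, saved_ctype)


def localeconv_monetary_safe():
    """Hypothetical wrapper: call localeconv() with matching encodings."""
    with ctype_matching_monetary():
        return locale.localeconv()


conv = localeconv_monetary_safe()
print(ascii(conv["currency_symbol"]))
```

Only the monetary fields of the returned dictionary should be trusted when using this wrapper, since LC_NUMERIC may use yet another encoding.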
|||
| msg330191 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-11-21 11:26 | |
It seems like my change introduced a regression: bpo-35290. |
|||
| msg330609 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-11-28 16:52 | |
See also bpo-31900: localeconv() should decode numeric fields from LC_NUMERIC encoding, not from LC_CTYPE encoding. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2018-11-28 16:52:24 | vstinner | set | messages: + msg330609 |
| 2018-11-28 16:51:47 | vstinner | set | title: Exception raised by python3.5 when using en_GB locale -> localeconv() doesn't support LC_MONETARY encoding different than LC_CTYPE encoding |
| 2018-11-21 11:26:12 | vstinner | set | messages: + msg330191 |
| 2018-11-20 21:37:25 | vstinner | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
| 2018-11-20 21:36:19 | vstinner | set | messages: + msg330155 |
| 2018-11-20 21:08:55 | vstinner | set | pull_requests: + pull_request9869 |
| 2018-11-20 21:06:25 | vstinner | set | messages: + msg330153 |
| 2018-11-20 20:14:32 | vstinner | set | pull_requests: + pull_request9867 |
| 2018-11-20 15:20:28 | vstinner | set | messages: + msg330132 |
| 2018-11-20 14:10:13 | vstinner | set | versions: + Python 3.8, - Python 3.5 |
| 2018-11-20 14:10:05 | vstinner | set | messages: + msg330131 |
| 2018-11-20 12:47:44 | vstinner | set | messages: + msg330129 |
| 2018-11-20 12:36:21 | vstinner | set | keywords: + patch stage: patch review pull_requests: + pull_request9849 |
| 2018-11-20 12:27:56 | vstinner | set | messages: + msg330128 |
| 2018-10-01 13:53:35 | xtreak | set | nosy: + xtreak |
| 2018-09-24 12:30:14 | petr.viktorin | set | nosy: + vstinner |
| 2017-09-30 19:24:02 | schwab | set | nosy: + schwab messages: + msg303419 |
| 2016-11-04 10:50:40 | cstratak | set | nosy: + cstratak |
| 2016-11-03 22:21:49 | serhiy.storchaka | set | nosy: + loewis, serhiy.storchaka, lemburg messages: + msg280028 |
| 2016-11-03 21:26:34 | Guillaume Pasquet (Etenil) | create | |