Issue39500
Created on 2020-01-30 09:22 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 18280 | closed | vstinner, 2020-01-30 09:26 | |
| PR 18397 | merged | vstinner, 2020-02-07 08:38 | |
| Messages (8) | |||
|---|---|---|---|
| msg361027 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-01-30 09:22 | |
The PyUnicode_IsIdentifier() function should be documented. Attachd PR documents it. |
|||
| msg361031 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2020-01-30 10:13 | |
Since the behavior was changed, I think we need a versionchanged directive. This function was added in 3.0. Prior to 3.3 it was always successful (if you pass an unicode object, that is required for most of PyUnicode API). Py_FatalError was added in 3.3, because not all unicode object are now valid. But I am not sure that it could be triggered in real code without using some broken extensions. It may be safer to return 0 instead of returning -1 or crashing if PyUnicode_READY() fails. If return -1 and set an exception, it can lead to an unexpected behavior if the calling code expects only 0/1, and maybe even to crash in debug build due to set an exception and returning value. |
|||
| msg361043 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-01-30 11:51 | |
> Since the behavior was changed, I think we need a versionchanged directive. Done. > But I am not sure that it could be triggered in real code without using some broken extensions. _PyUnicode_Ready() can fail in two cases: the Py_UNICODE* string contains a character outside [U+0000; U+10ffff] character range, or a memory allocation failed. Both cases look very unlikely, knowning that _PyUnicode_Ready() is only called if PyUnicode_IsIdentifier() is called on a string using the Python legacy C API (Py_UNICODE*). > It may be safer to return 0 instead of returning -1 or crashing if PyUnicode_READY() fails. Functions which don't use PyObject* as return type have the tradition of returning -1. A few examples: * PyUnicode_GetLength() * PyUnicode_CopyCharacters() * PyUnicode_ReadChar() (return "(Py_UCS4)-1" on error) * PyLong_AsUnsignedLong() (return "(unsigned long)-1" on error) * etc. I don't see why PyUnicode_IsIdentifier() deserves to behave differently. > If return -1 and set an exception, it can lead to an unexpected behavior if the calling code expects only 0/1, and maybe even to crash in debug build due to set an exception and returning value. I updated my PR to document the behavior change in the "Changes in the C API" section of What's New in Python 3.9 documentation. Previously, the code crashed Python (Py_FatalError). So I'm confident that this case never occurred ever in the wild. We got zero bug report about that. |
|||
| msg361048 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2020-01-30 12:04 | |
Other examples are: * PyObject_HasAttr * PyObject_HasAttrString * PyMapping_HasKey * PyMapping_HasKeyString They are bad examples, but can't be changed for backward compatibility. I wonder whether PyUnicode_IsIdentifier should also kept unchanged for backward compatibility. There is also a number of *_Check functions which always succeeds. Other example is _PyUnicode_EqualToASCIIString where the behavior is intentional. PyUnicode_IsIdentifier() was not documented before. It makes easier to change its behavior. |
|||
| msg361049 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-01-30 12:07 | |
> They are bad examples, but can't be changed for backward compatibility. I don't think that we should follow these bad examples :-) IMO ignoring silently bugs is a bad programming practice. I don't expect PyUnicode_IsIdentifier() to be used outside Python. If it's used, I don't see why it would be a "non-ready string" in practice. The risk of regression is very close to zero. If it happens, it's no longer our fault, since I documented the behavior change ;-) Again, right now, Python does crash if this corner case occurs... |
|||
| msg361079 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2020-01-30 21:54 | |
It is not convenient to check the result for error. After we remove support of old PyUnicode API, PyUnicode_IsIdentifier() will be always succeeded. Note that PyUnicode_IsIdentifier() still can crash if you pass a non-PyUnicode object or NULL. It is a prerequisite of this function that the argument must be a PyUnicode object. Add just yet one condition: it must be a ready PyUnicode object. |
|||
| msg361816 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-02-11 13:29 | |
New changeset f3e7ea5b8c220cd63101e419d529c8563f9c6115 by Victor Stinner in branch 'master': bpo-39500: Document PyUnicode_IsIdentifier() function (GH-18397) https://github.com/python/cpython/commit/f3e7ea5b8c220cd63101e419d529c8563f9c6115 |
|||
| msg362141 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-02-17 13:41 | |
New changeset 3d235f5c5c5bce6e0caec44d2ce17f670c2ca2d7 by Hai Shi in branch 'master': bpo-39500: Fix compile warnings in unicodeobject.c (GH-18519) https://github.com/python/cpython/commit/3d235f5c5c5bce6e0caec44d2ce17f670c2ca2d7 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:26 | admin | set | github: 83681 |
| 2020-02-17 13:42:15 | vstinner | link | issue39646 superseder |
| 2020-02-17 13:41:22 | vstinner | set | messages: + msg362141 |
| 2020-02-11 13:29:53 | vstinner | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
| 2020-02-11 13:29:37 | vstinner | set | messages: + msg361816 |
| 2020-02-07 08:38:40 | vstinner | set | pull_requests: + pull_request17773 |
| 2020-01-30 21:54:31 | serhiy.storchaka | set | messages: + msg361079 |
| 2020-01-30 12:07:45 | vstinner | set | messages: + msg361049 |
| 2020-01-30 12:04:52 | serhiy.storchaka | set | messages: + msg361048 |
| 2020-01-30 11:51:47 | vstinner | set | messages: + msg361043 |
| 2020-01-30 10:13:55 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages: + msg361031 |
| 2020-01-30 09:26:06 | vstinner | set | keywords:
+ patch stage: patch review pull_requests: + pull_request17655 |
| 2020-01-30 09:22:35 | vstinner | create | |