Issue39824
Created on 2020-03-02 09:52 by vstinner, last changed 2020-03-21 16:40 by vstinner. This issue is now closed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 18738 | merged | vstinner, 2020-03-02 10:51 | |
| PR 19076 | merged | shihai1991, 2020-03-19 16:53 | |
| Messages (20) | |||
|---|---|---|---|
| msg363145 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-02 09:52 | |
Currently, when a module implements m_traverse(), m_clear() or m_free(), these methods can be called with md_state=NULL even if the module implements the "Multi-phase extension module initialization" API (PEP 489). I'm talking about these module methods: * tp_traverse: module_traverse() calls md_def->m_traverse() if m_traverse is not NULL * tp_clear: module_clear() calls md_def->m_clear() if m_clear is not NULL * tp_dealloc: module_dealloc() calls md_def->m_free() is m_free is not NULL Because of that, the implementation of these methods must check manually if md_state is NULL or not. I propose to change module_traverse(), module_clear() and module_dealloc() to not call m_traverse(), m_clear() and m_free() if md_state is NULL and m_size > 0. "m_size > 0" is an heuristic to check if the module implements the "Multi-phase extension module initialization" API (PEP 489). For example, the _pickle module doesn't fully implements the PEP 489: m_size > 0, but PyInit__pickle() uses PyModule_Create(). See bpo-32374 which documented that "m_traverse may be called with m_state=NULL" (GH-5140). |
|||
| msg363146 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-02 09:53 | |
The question came up while I reviewed PR 18608 which ports the audioop extension module to multiphase initialization (PEP 489): https://github.com/python/cpython/pull/18608/files#r386061746 Petr Viktorin referred to m_clear() documentation which says that md_state *can* be NULL when m_clear() is called: https://docs.python.org/3/c-api/module.html#c.PyModuleDef.m_clear |
|||
| msg363151 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-02 10:50 | |
There are different kinds of extension modules: (1) no module state (m_size <= 0): **not affected** by this change. Example: _asyncio which implements m_free() to clear global variables and free lists. (2) Have a module state but PyInit_xxx() calls PyModule_Create(): PyModule_Create() always allocates md_state. I understand that such extension module is **not affected** by this change. (3) Multi-phase extension: PyInit_xxx() function calls PyModuleDef_Init(). Such extension module **is affected** if m_traverse, m_clear or m_free() is not NULL. Example: atexit module implements m_traverse, m_clear and m_free. PyModuleObject structure contains Python objects like md_dict (dict), md_name (str) or md_weaklist (list): * module_traverse() always traverses md_dict: m_traverse() is no needed for that. * module_clear() always clears md_dict: m_clear() is no needed for that. * module_dealloc() always deallocates md_dict, md_name and md_weaklist: m_free() is no needed for that. * md_name is a string, it cannot be involved in a reference cycle. I don't think that it's possible to extend PyModuleObject structure (as done by PyListObject for PyObject) to add other Python objects: md_state is designed for that. PyModule_Create() allocates exactly sizeof(PyModuleObject) bytes. If an extension module has a module state, stores Python objects *outside* this state and uses m_traverse, m_clear and m_free to handle these objects: the GC will no longer be able to handle these objects before the module is executed with this change. If such extension module exists, I consider that it's ok to only handle objects stored outside the module state once the module is executed. The window between <the module is created> and <the module is executed> is very short. |
|||
| msg363152 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-02 10:52 | |
I wrote PR 18738 to implement this change. |
|||
| msg363153 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-02 10:56 | |
> I propose to change module_traverse(), module_clear() and module_dealloc() to not call m_traverse(), m_clear() and m_free() if md_state is NULL and m_size > 0. Note: This change also means that m_traverse, m_clear and m_free are no longer called if md_state is set to NULL. But it never occurs in practice. module_dealloc() calls PyMem_FREE(m->md_state) but it doesn't set md_state to NULL. It's not needed, since the module memory is deallocated anyway. |
|||
| msg363155 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-02 11:09 | |
Proposed PR checking if the module state is NULL: * PR 18358: _locale module * PR 18608: audioop module * PR 18613: binascii |
|||
| msg363591 - (view) | Author: Nick Coghlan (ncoghlan) * | Date: 2020-03-07 12:22 | |
One of the intended use cases for Py_mod_create is to return instances of ModuleType subclasses rather than straight ModuleType instances. And those are definitely legal to define:
>>> import __main__
>>> class MyModule(type(__main__)): pass
...
>>> m = MyModule('example')
>>> m
<module 'example'>
So it isn't valid to skip calling the cleanup functions just because md_state is NULL - we have no idea what Py_mod_create might have done that needs to be cleaned up.
It would *probably* be legitimate to skip calling the cleanup functions when there's no Py_mod_create slot defined, but then the rules for "Do I need to account for md_state potentially being NULL or not?" are getting complicated enough that the safest option for a module author is to always assume that md_state might be NULL and handle that case appropriately.
|
|||
| msg363599 - (view) | Author: hai shi (shihai1991) * | Date: 2020-03-07 15:16 | |
> we have no idea what Py_mod_create might have done that needs to be cleaned up. Looks like no extension module author use `Py_mod_create` slots now. |
|||
| msg363708 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-09 09:15 | |
> So it isn't valid to skip calling the cleanup functions just because md_state is NULL - we have no idea what Py_mod_create might have done that needs to be cleaned up. In your example, I don't see what m_clear/m_free would be supposed to clear/free. I don't see how a module can store data without md_state which would require m_clear/m_free to clear/free such data. module_clear() continue to clear Python objects of the PyModuleObject structure with PR 18738. Would you mind to elaborate? The intent of PR 18738 is to simplify the implementation of C extension modules which implements the PEP 489 and uses a module state (md_state > 0). |
|||
| msg363912 - (view) | Author: Petr Viktorin (petr.viktorin) * | Date: 2020-03-11 12:15 | |
If you use a module subclass that needs some additional C-level infrastructure, it would be more appropriate to override tp_clear/tp_free directly. IMO limiting m_clear/m_free to work just with the module state won't hurt. But it is an API change. |
|||
| msg363938 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-11 16:51 | |
> Proposed PR checking if the module state is NULL: > > * PR 18358: _locale module > * PR 18608: audioop module > * PR 18613: binascii Petr suggested to not hold these PRs with this controversial issue. I agree, so I merged these 3 PRs. |
|||
| msg363939 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-11 16:52 | |
Stefan Behnel: as the 3rd author of the PEP 489, what's your call on this issue? |
|||
| msg364280 - (view) | Author: Nick Coghlan (ncoghlan) * | Date: 2020-03-16 03:20 | |
Petr's point that any subclass state should be managed in the subclass cleanup functions is a good one, so I withdraw my concern: * custom module subclasses should clean up like any other class instance * the module slots are then only needed if the module state actually gets populated, so we can safely skip them if it never even gets allocated |
|||
| msg364318 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-16 13:35 | |
I updated PR 18738 to document the incompatible change in What's New In Python 3.9. Sadly, I expect that almost no third-party extension module implement the PEP 489 yet. So I expect that little or no third-party code is impacted in pratice. > the module slots are then only needed if the module state actually gets populated, so we can safely skip them if it never even gets allocated That's also my understanding. > custom module subclasses should clean up like any other class instance That sounds like a reasonable compromise to me. |
|||
| msg364348 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2020-03-16 18:09 | |
I think Cython makes use of PEP-489 so unless I am missing something all generated extensions use PEP-489 structures. |
|||
| msg364454 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-17 17:09 | |
New changeset 5b1ef200d31a74a9b478d0217d73ed0a659a8a06 by Victor Stinner in branch 'master': bpo-39824: module_traverse() don't call m_traverse if md_state=NULL (GH-18738) https://github.com/python/cpython/commit/5b1ef200d31a74a9b478d0217d73ed0a659a8a06 |
|||
| msg364455 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-17 17:10 | |
Thanks Petr and Nick for the review ;-) Pablo Galindo Salgado: > I think Cython makes use of PEP-489 so unless I am missing something all generated extensions use PEP-489 structures. Alright. I still consider that my change is correct and will no harm anyone ;-) |
|||
| msg364616 - (view) | Author: miss-islington (miss-islington) | Date: 2020-03-19 17:11 | |
New changeset 13397ee47d23fda2e8d4bef40c1df986457673d1 by Hai Shi in branch 'master': bpo-39824: Convert PyModule_GetState() to get_module_state() (GH-19076) https://github.com/python/cpython/commit/13397ee47d23fda2e8d4bef40c1df986457673d1 |
|||
| msg364738 - (view) | Author: Stefan Behnel (scoder) * | Date: 2020-03-21 11:39 | |
> I think Cython makes use of PEP-489 so unless I am missing something all generated extensions use PEP-489 structures. Cython doesn't make complete use of PEP-489 yet, specifically not of the module state feature (nor module subclasses). This change looks good from my side. Good idea, Victor. |
|||
| msg364758 - (view) | Author: STINNER Victor (vstinner) * | Date: 2020-03-21 16:40 | |
> Cython doesn't make complete use of PEP-489 yet, specifically not of the module state feature (nor module subclasses). This change looks good from my side. Good idea, Victor. Thanks for the confirmation Stefan ;-) |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2020-03-21 16:40:23 | vstinner | set | messages: + msg364758 |
| 2020-03-21 11:39:56 | scoder | set | messages: + msg364738 |
| 2020-03-19 17:11:37 | miss-islington | set | nosy:
+ miss-islington messages: + msg364616 |
| 2020-03-19 16:53:34 | shihai1991 | set | pull_requests: + pull_request18432 |
| 2020-03-17 17:10:56 | vstinner | set | status: open -> closed resolution: fixed messages: + msg364455 stage: patch review -> resolved |
| 2020-03-17 17:09:50 | vstinner | set | messages: + msg364454 |
| 2020-03-16 18:09:47 | pablogsal | set | messages: + msg364348 |
| 2020-03-16 13:35:45 | vstinner | set | messages: + msg364318 |
| 2020-03-16 03:20:23 | ncoghlan | set | messages: + msg364280 |
| 2020-03-11 16:52:45 | vstinner | set | nosy:
+ scoder messages: + msg363939 |
| 2020-03-11 16:51:39 | vstinner | set | messages: + msg363938 |
| 2020-03-11 12:15:27 | petr.viktorin | set | messages: + msg363912 |
| 2020-03-09 09:15:56 | vstinner | set | messages: + msg363708 |
| 2020-03-07 15:16:49 | shihai1991 | set | messages: + msg363599 |
| 2020-03-07 12:22:22 | ncoghlan | set | messages: + msg363591 |
| 2020-03-02 11:09:21 | vstinner | set | messages: + msg363155 |
| 2020-03-02 10:56:00 | vstinner | set | messages: + msg363153 |
| 2020-03-02 10:53:03 | vstinner | set | title: Multi-phase extension module (PEP 489): don't call m_traverse, m_clear nor m_free if md_state is NULL -> Multi-phase extension module (PEP 489): don't call m_traverse, m_clear nor m_free before the module state is allocated |
| 2020-03-02 10:52:47 | vstinner | set | nosy:
+ ncoghlan, petr.viktorin, Dormouse759, pablogsal, shihai1991 messages: + msg363152 |
| 2020-03-02 10:51:37 | vstinner | set | keywords:
+ patch stage: patch review pull_requests: + pull_request18093 |
| 2020-03-02 10:50:53 | vstinner | set | messages: + msg363151 |
| 2020-03-02 09:53:06 | vstinner | set | messages: + msg363146 |
| 2020-03-02 09:52:22 | vstinner | create | |