Issue25554
Created on 2015-11-05 08:27 by joente, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Files
| File name | Uploaded | Description |
|---|---|---|
| fix_mem_sre_parse.patch | joente, 2015-11-05 08:27 | patched sre_parse.py |
| fix_mem_sre_parse_2.patch | serhiy.storchaka, 2015-11-05 10:31 | |
| Messages (4) | |||
|---|---|---|---|
| msg254092 - (view) | Author: Jeroen van der Heijden (joente) * | Date: 2015-11-05 08:27 | |
When compiling a regular expression with groups (subpatterns),
circular references are created.
Here is an example to illustrate the problem:
>>> import gc
>>> import re
>>> gc.disable() # disable garbage collector
>>> gc.collect() # make sure we start with 0
0
>>> re.compile('(a|b)') # compile something with groups
re.compile('(a|b)')
>>> gc.collect() # collects x objects depending on the compiled string
11
To fix the issue a weakref object for p is used.
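
The cycle described above can be sketched outside of `sre_parse`. This is only an illustration: `ParserState`, `SubPattern`, `build` and `garbage_after` are hypothetical names, not the real classes or the contents of the patch. It assumes the cycle arises because the parser state keeps a list of subpatterns while each subpattern points back at the state, and it shows how a weak back-reference lets plain reference counting free everything.

```python
import gc
import weakref


class ParserState:
    """Hypothetical stand-in for the parser state that records groups."""

    def __init__(self):
        self.subpatterns = []


class SubPattern:
    """Hypothetical stand-in for a parsed group pointing back at its state."""

    def __init__(self, state, use_weakref):
        # A weak proxy does not keep the state alive, so no cycle is formed.
        self.state = weakref.proxy(state) if use_weakref else state


def build(use_weakref):
    state = ParserState()
    sub = SubPattern(state, use_weakref)
    # state -> sub here; with a strong back-reference also sub -> state.
    state.subpatterns.append(sub)


def garbage_after(use_weakref):
    """Return how many unreachable objects the collector finds after build()."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        gc.collect()  # start from a clean slate
        build(use_weakref)
        return gc.collect()
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print("strong back-reference:", garbage_after(False))
    print("weak back-reference:  ", garbage_after(True))
```

With the strong back-reference the collector has to find and break the cycle; with the proxy the objects are released as soon as `build` returns. The exact counts depend on the interpreter version.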
|
|||
| msg254099 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-11-05 10:31 | |
Thank you for your report and patch, Jeroen. Indeed, there is a regression, and your patch fixes it. But I don't like the idea of using weakref. For now sre_parse has very few dependencies, but weakref depends on collections, which in turn depends on a number of modules. For now importing weakref works, but it is too easy to create a dependency loop in the future. Here is an alternative patch that gets rid of the references altogether. The subpatterns list was added in the patch for issue9179 and is an implementation detail. We can replace it with a list of subpattern widths. |
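
The idea of storing widths instead of subpattern objects can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual patch: `ParserState`, `SubPattern`, `open_group`, `close_group`, `get_width`, `parse_one_group` and `cyclic_garbage_for` are hypothetical names, and a subpattern here is just a literal string whose minimum and maximum width equal its length.

```python
import gc


class ParserState:
    """Hypothetical parser state that remembers group widths only."""

    def __init__(self):
        # Index 0 is a placeholder so that group ids start at 1.
        self.group_widths = [None]

    def open_group(self):
        gid = len(self.group_widths)
        self.group_widths.append(None)
        return gid

    def close_group(self, gid, sub):
        # Keep the (min, max) width tuple, not the subpattern itself,
        # so the state never references an object that references it back.
        self.group_widths[gid] = sub.get_width()


class SubPattern:
    """Hypothetical subpattern that matches one literal string."""

    def __init__(self, state, literal):
        self.state = state  # strong reference, harmless without a back edge
        self.literal = literal

    def get_width(self):
        n = len(self.literal)
        return (n, n)


def parse_one_group(literal):
    state = ParserState()
    gid = state.open_group()
    sub = SubPattern(state, literal)
    state.close_group(gid, sub)
    return state.group_widths


def cyclic_garbage_for(literal):
    """Return how many unreachable objects the collector finds afterwards."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        gc.collect()
        parse_one_group(literal)
        return gc.collect()
    finally:
        if was_enabled:
            gc.enable()
```

Because the state holds only tuples of integers, the subpattern can still point at the state without forming a loop, and no extra module such as weakref is needed.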
|||
| msg254114 - (view) | Author: Jeroen van der Heijden (joente) * | Date: 2015-11-05 15:13 | |
Thanks Serhiy, I totally agree with your solution. Using a list of subpattern widths is definitely better than using weakref. |
|||
| msg254115 - (view) | Author: Roundup Robot (python-dev) | Date: 2015-11-05 15:52 | |
New changeset 7f4fca8f13a2 by Serhiy Storchaka in branch '3.5':
Issue #25554: Got rid of circular references in regular expression parsing.
https://hg.python.org/cpython/rev/7f4fca8f13a2
New changeset 8621727dd9f7 by Serhiy Storchaka in branch 'default':
Issue #25554: Got rid of circular references in regular expression parsing.
https://hg.python.org/cpython/rev/8621727dd9f7 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:23 | admin | set | github: 69740 |
| 2015-11-05 16:43:26 | serhiy.storchaka | set | status: open -> closed; stage: patch review -> resolved; resolution: fixed; versions: - Python 2.7, Python 3.4 |
| 2015-11-05 15:52:03 | python-dev | set | nosy: + python-dev; messages: + msg254115 |
| 2015-11-05 15:13:59 | joente | set | messages: + msg254114 |
| 2015-11-05 10:31:24 | serhiy.storchaka | set | files: + fix_mem_sre_parse_2.patch; assignee: serhiy.storchaka; messages: + msg254099 |
| 2015-11-05 08:27:44 | joente | create | |