Issue25873
Created on 2015-12-15 19:17 by serhiy.storchaka, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| etree_iter.patch | serhiy.storchaka, 2015-12-15 19:17 | review | ||
| etree_iter2.patch | serhiy.storchaka, 2015-12-16 02:03 | review | ||
| Messages (6) | |||
|---|---|---|---|
| msg256472 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-12-15 19:17 | |
Proposed patch refactors and optimize the code for iterating ElementTree. Refactoring and minor optimization makes the code about 30 lines smaller and 2-3% faster. Using a continuous array instead of linked list makes the code up to 40% faster (lxml benchmark was used to measurement). In addition the patch fixes possible issue with owning references. |
|||
| msg256495 - (view) | Author: Martin Panter (martin.panter) * | Date: 2015-12-16 00:47 | |
Looks technically correct as far as my knowledge of the malloc routines goes. What was the problem with references that you fixed? Maybe with parent_stack_push_new() failure?
The main reference counting bug that sticks out to me is with the text and tail elements, because element_get_text() etc return a borrowed reference. It might be good to fix this while you are at it:
>>> element = Element("tag")
>>> class Text:
... def __bool__(self):
... element.text = "changed"
... return True
...
>>> element.text = Text()
>>> i = element.itertext()
>>> next(i)
Segmentation fault (core dumped)
[Exit 139]
|
|||
| msg256502 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-12-16 02:02 | |
Good catch Martin. I missed this case. Following patch fixes it. Other bugs look similar. They presumably can be triggered when remove the element itself during executing the __next__() method. I have no a reproducer though. I don't know wherever it is worth to fix these bugs in maintained releases. |
|||
| msg256507 - (view) | Author: Martin Panter (martin.panter) * | Date: 2015-12-16 03:40 | |
The new changes look good. I see these kind of reference counting bugs as fairly low priority. I suspect there are plenty more in this module, at least Issue 24103 and Issue 24104. |
|||
| msg256674 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-12-18 12:07 | |
Thank you for your review Martin. I have opened a separate issue25902 for refcount bugs in iteration. Its patch is purposed for all versions, while this issue is for developer version only. |
|||
| msg256795 - (view) | Author: Roundup Robot (python-dev) | Date: 2015-12-21 10:44 | |
New changeset 5a5d5268afd5 by Serhiy Storchaka in branch 'default': Issue #25873: Optimized iterating ElementTree. https://hg.python.org/cpython/rev/5a5d5268afd5 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:25 | admin | set | github: 70061 |
| 2015-12-21 11:51:17 | serhiy.storchaka | set | status: open -> closed assignee: serhiy.storchaka resolution: fixed stage: patch review -> resolved |
| 2015-12-21 10:44:27 | python-dev | set | nosy:
+ python-dev messages: + msg256795 |
| 2015-12-18 12:07:02 | serhiy.storchaka | set | messages: + msg256674 |
| 2015-12-16 03:40:55 | martin.panter | set | messages: + msg256507 |
| 2015-12-16 02:03:45 | serhiy.storchaka | set | files: + etree_iter2.patch |
| 2015-12-16 02:03:29 | serhiy.storchaka | set | files: - etree_iter2.patch |
| 2015-12-16 02:02:52 | serhiy.storchaka | set | files:
+ etree_iter2.patch messages: + msg256502 |
| 2015-12-16 00:47:37 | martin.panter | set | nosy:
+ martin.panter messages: + msg256495 components: + XML |
| 2015-12-15 19:17:03 | serhiy.storchaka | create | |