Issue31499
Created on 2017-09-17 17:27 by scoder, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| bug1.py | vstinner, 2017-09-18 07:00 | |||
| bug2.py | vstinner, 2017-09-18 07:00 | |||
| test_issue31499.diff | serhiy.storchaka, 2017-09-18 09:37 | |||
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 3641 | merged | vstinner, 2017-09-18 08:59 | |
| PR 3645 | merged | python-dev, 2017-09-18 12:29 | |
| Messages (17) | |||
|---|---|---|---|
| msg302375 - (view) | Author: Stefan Behnel (scoder) * | Date: 2017-09-17 17:27 | |
I'm seeing crashes in the latest Py3.7 when I run this test (taken from lxml's compatibility test suite):
etree = xml.etree.ElementTree
def test_feed_parser_error_position(self):
ParseError = etree.ParseError
parser = XMLParser()
try:
parser.close()
except ParseError:
e = sys.exc_info()[1]
self.assertNotEqual(None, e.code)
self.assertNotEqual(0, e.code)
self.assertTrue(isinstance(e.position, tuple))
self.assertTrue(e.position >= (0, 0))
It crashes in expat/xmlparse.c, line 1464:
1459 for (;;) {
1460 BINDING *b = bindings;
1461 if (!b)
1462 break;
1463 bindings = b->nextTagBinding;
1464 FREE(b->uri); // <<<<<<<<<<<<<<< crashes here
1465 FREE(b);
1466 }
1467 }
Probably related to the new expat version (issue 31170).
|
|||
| msg302383 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-09-17 19:18 | |
I can't reproduce a crash. Could you please provide a Python script? |
|||
| msg302384 - (view) | Author: Stefan Behnel (scoder) * | Date: 2017-09-17 20:06 | |
Sorry, wrong line number. Was using an installed Py3.7, not a fresh build. However, my crashing installed version is from September 1st, *before* the expat update, which was apparently on September 5th. With a clean debug build, I get a reproducible crash during shutdown, now with the correct line number: Program received signal SIGSEGV, Segmentation fault. PyExpat_XML_ParserFree (parser=0xd01a20) at .../Modules/expat/xmlparse.c:1487 1487 tagList = tagList->parent; (gdb) p taglist No symbol "taglist" in current context. (gdb) list 1482 break; 1483 tagList = freeTagList; 1484 freeTagList = NULL; 1485 } 1486 p = tagList; 1487 tagList = tagList->parent; 1488 FREE(p->buf); 1489 destroyBindings(p->bindings, parser); 1490 FREE(p); 1491 } (gdb) bt #0 PyExpat_XML_ParserFree (parser=0xd01a20) at Modules/expat/xmlparse.c:1487 #1 0x00007fffeec83517 in xmlparser_gc_clear (self=0x7fffeeeb75d8) at Modules/_elementtree.c:3414 #2 xmlparser_dealloc (self=0x7fffeeeb75d8) at Modules/_elementtree.c:3435 #3 0x000000000041fd40 in frame_dealloc (f=0xd017c8) at Objects/frameobject.c:428 #4 0x000000000044f5e5 in tb_dealloc (tb=0x7ffff632d218) at Python/traceback.c:56 #5 0x000000000049c59d in BaseException_clear (self=0x7ffff6567dd8) at Objects/exceptions.c:78 #6 SyntaxError_clear (self=0x7ffff6567dd8) at Objects/exceptions.c:1381 #7 0x0000000000458104 in delete_garbage (old=<optimized out>, collectable=<optimized out>) at Modules/gcmodule.c:759 #8 collect (generation=generation@entry=2, n_collected=n_collected@entry=0x7fffffffd6c8, n_uncollectable=n_uncollectable@entry=0x7fffffffd6d0, nofail=nofail@entry=0) at Modules/gcmodule.c:911 #9 0x0000000000458d2b in collect_with_callback (generation=2) at Modules/gcmodule.c:1020 #10 PyGC_Collect () at Modules/gcmodule.c:1507 #11 0x0000000000439aaf in Py_FinalizeEx () at Python/pylifecycle.c:1002 #12 0x000000000043b3d8 in Py_Exit (sts=sts@entry=0) at Python/pylifecycle.c:1953 #13 0x000000000043f4e8 in handle_system_exit () at Python/pythonrun.c:608 #14 0x000000000043f94d in handle_system_exit () at Python/pythonrun.c:635 #15 PyErr_PrintEx (set_sys_last_vars=set_sys_last_vars@entry=1) at Python/pythonrun.c:618 #16 0x000000000043fc4a in PyErr_Print () at Python/pythonrun.c:514 #17 0x0000000000441475 in PyRun_SimpleFileExFlags (fp=fp@entry=0x9dc050, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7fffffffd8e0) at Python/pythonrun.c:407 #18 0x0000000000441633 in PyRun_AnyFileExFlags (fp=fp@entry=0x9dc050, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7fffffffd8e0) at Python/pythonrun.c:83 #19 0x000000000045666c in run_file (p_cf=0x7fffffffd8e0, filename=0x9973d0 L"test.py", fp=0x9dc050) at Modules/main.c:341 #20 Py_Main (argc=argc@entry=6, argv=argv@entry=0x996020) at Modules/main.c:895 #21 0x000000000041fa2b in main (argc=6, argv=<optimized out>) at ./Programs/python.c:102 I get it in lxml's test suite runs, will try to reduce it to a test script. |
|||
| msg302385 - (view) | Author: Stefan Behnel (scoder) * | Date: 2017-09-17 20:28 | |
Minimal reproducer seems to be this:
----------
import xml.etree.ElementTree as etree
def test():
parser = etree.XMLParser()
try:
parser.close()
except etree.ParseError as exc:
e = exc # must keep local reference!
test()
----------
The master gitrev I tested is 132a7d7cdbc7cb89fa1c1f4e8192241c3d68f549
Since this happens during GC and finalisation, it's quite possible that it's not a bug in ET per se. It might just be a prematurely cleared reference in some tp_clear(), or something like that. (Changing ticket title appropriately.)
|
|||
| msg302411 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-09-18 07:00 | |
I confirm the crash using attached bug1.py (first example, completed) and bug2.py (second example). |
|||
| msg302412 - (view) | Author: Stefan Behnel (scoder) * | Date: 2017-09-18 07:13 | |
Thanks for confirming, Victor. I hadn't realised that the first update of expat was already back in June. That means it's not ruled out yet as a source of this crash. Bisecting is probably a good idea. |
|||
| msg302414 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-09-18 07:28 | |
The bug is at this line:
Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414
static int
xmlparser_gc_clear(XMLParserObject *self)
{
EXPAT(ParserFree)(self->parser); // <--- HERE
...
}
This function calls XML_ParserFree() twice on the same parser object. The first call is fine and frees the memory. Since we now use Python memory allocators, XML_ParserFree() fills the freed memory with 0xDB byte pattern (when Python is in debug mode).
The second XML_ParserFree() call uses freed memory (filled with 0xDB in debug mode).
Call 1: a GC collection
Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414
(gdb) up
#1 0x0000000000446636 in delete_garbage (collectable=0x7fffffffd9a0, old=0x9b8f90 <_PyRuntime+432>) at Modules/gcmodule.c:759
(gdb) up
#2 0x0000000000446ade in collect (generation=2, n_collected=0x7fffffffda30, n_uncollectable=0x7fffffffda28, nofail=0) at Modules/gcmodule.c:911
(gdb) cont
Continuing.
Call 2: xmlparser_dealloc()
Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414
(gdb) up
#1 0x00007ffff0038cb8 in xmlparser_dealloc (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3435
IMHO it's an obvious bug in Python. The question is more why/how the code didn't crash before? :-)
|
|||
| msg302415 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-09-18 07:32 | |
Python 2.7 is not affected because it doesn't implement tp_clear (it doesn't have xmlparser_gc_clear()), only xmlparser_dealloc() calls EXPAT(ParserFree)(self->parser). I'm unable to reproduce the bug in Python 3.5 nor 3.6. bug2.py creates a reference cycle the "except etree.ParseError as exc: e = exc # must keep local reference!" which requires to trigger a garbage collection to clear the "parser" variable. |
|||
| msg302416 - (view) | Author: Stefan Behnel (scoder) * | Date: 2017-09-18 07:34 | |
> The question is more why/how the code didn't crash before? :-) Typical case of a Schroedinbug. |
|||
| msg302426 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-09-18 08:45 | |
>> The question is more why/how the code didn't crash before? :-) > > Typical case of a Schroedinbug. I don't believe in the chaos :-) I ran a "git bisect" since January 1st, 2017. Attached bug2.py started to crash since the following commit related to bpo-29049. --- 5a625d0aa6a6d9ec6574ee8344b41d63dcb9897e is the first bad commit commit 5a625d0aa6a6d9ec6574ee8344b41d63dcb9897e Author: INADA Naoki <songofacandy@gmail.com> Date: Sat Dec 24 20:19:08 2016 +0900 Issue #29049: Call _PyObject_GC_TRACK() lazily when calling Python function. Calling function is up to 5% faster. --- This change is also correct. It's more that the change showed a bug which was hidden before. It's just that now the garbage collector breaks the reference cycle differently since frames tracked differently by the GC. |
|||
| msg302430 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-09-18 09:00 | |
Attached PR 3641 fixes bug1.py and bug2.py crashes. Sadly, I failed to write a reliable unit test using bug2.py. The bug requires to trigger the garbage collector in a specific order which depends on how frames are tracked by the GC... |
|||
| msg302432 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-09-18 09:20 | |
def test_issue31499(self):
def test():
...
test()
test.support.gc_collect()
|
|||
| msg302434 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-09-18 09:21 | |
Serhiy Storchaka: I tried your pattern, but failed to write a reliable unit test. Can you please write a full patch / test? |
|||
| msg302441 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-09-18 09:49 | |
test_issue31499.diff: Oh great, it works (to reproduce the crash). I modified your test and included it in my PR, I added you as a co-author of my PR ;-) |
|||
| msg302447 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-09-18 12:29 | |
New changeset e727d41ffcd91b21ce82026ec8c8381d34a16209 by Victor Stinner in branch 'master': bpo-31499, xml.etree: Fix xmlparser_gc_clear() crash (#3641) https://github.com/python/cpython/commit/e727d41ffcd91b21ce82026ec8c8381d34a16209 |
|||
| msg302449 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-09-18 12:48 | |
New changeset 8afd7ab12d7f8915b549cf04af384b495ec73d22 by Victor Stinner (Miss Islington (bot)) in branch '3.6': [3.6] bpo-31499, xml.etree: Fix xmlparser_gc_clear() crash (GH-3641) (#3645) https://github.com/python/cpython/commit/8afd7ab12d7f8915b549cf04af384b495ec73d22 |
|||
| msg302451 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-09-18 13:25 | |
The bug is now fixed in Python 3.6 and master. Thanks for the bug report and analysis, Stefan Behnel! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:52 | admin | set | github: 75680 |
| 2017-09-18 13:25:23 | vstinner | set | status: open -> closed resolution: fixed messages: + msg302451 stage: patch review -> resolved |
| 2017-09-18 12:48:26 | vstinner | set | messages: + msg302449 |
| 2017-09-18 12:29:51 | python-dev | set | pull_requests: + pull_request3639 |
| 2017-09-18 12:29:40 | vstinner | set | messages: + msg302447 |
| 2017-09-18 09:49:01 | vstinner | set | messages: + msg302441 |
| 2017-09-18 09:37:05 | serhiy.storchaka | set | files: + test_issue31499.diff |
| 2017-09-18 09:21:28 | vstinner | set | messages: + msg302434 |
| 2017-09-18 09:20:38 | serhiy.storchaka | set | messages: + msg302432 |
| 2017-09-18 09:00:54 | vstinner | set | messages:
+ msg302430 versions: + Python 3.6 |
| 2017-09-18 08:59:06 | vstinner | set | keywords:
+ patch stage: patch review pull_requests: + pull_request3636 |
| 2017-09-18 08:45:05 | vstinner | set | messages: + msg302426 |
| 2017-09-18 07:34:00 | scoder | set | messages: + msg302416 |
| 2017-09-18 07:32:39 | vstinner | set | messages: + msg302415 |
| 2017-09-18 07:28:28 | vstinner | set | messages: + msg302414 |
| 2017-09-18 07:13:52 | scoder | set | messages: + msg302412 |
| 2017-09-18 07:00:17 | vstinner | set | files: + bug2.py |
| 2017-09-18 07:00:12 | vstinner | set | files:
+ bug1.py messages: + msg302411 |
| 2017-09-17 20:28:16 | scoder | set | messages:
+ msg302385 title: ElementTree crash with new expat -> ElementTree crash while cleaning up ParseError |
| 2017-09-17 20:06:12 | scoder | set | messages: + msg302384 |
| 2017-09-17 19:18:39 | serhiy.storchaka | set | messages: + msg302383 |
| 2017-09-17 18:50:07 | jkloth | set | nosy:
+ jkloth |
| 2017-09-17 17:27:08 | scoder | create | |