Created on 2020-06-10 11:25 by pablogsal, last changed 2021-02-15 15:23 by vstinner. This issue is now closed.
As stated in PEP 617, the old parser will be removed in Python 3.10
New changeset 9727694f08cad4b019d2939224e3416312b1c0e1 by Lysandros Nikolaou in branch 'master': bpo-40939: Generate keyword.py using the new parser (GH-20800) https://github.com/python/cpython/commit/9727694f08cad4b019d2939224e3416312b1c0e1
>>> __new_parser__
File "<stdin>", line 1
__new_parser__
^
SyntaxError: You found it!
"new", "ex" or "ng" are not really future proof names. Can we rename the keyword to "__peg_parser__"?
> Can we rename the keyword to "__peg_parser__"?
I guess we could just remove this, as soon as the old parser is out. We were only using this to differentiate between the two parsers, when we were testing enabling/disabling the old one. I could get a PR ready to be merged after GH-20768 is there.
> I guess we could just remove this, as soon as the old parser is out. We were only using this to differentiate between the two parsers, when we were testing enabling/disabling the old one. I could get a PR ready to be merged after GH-20768 is there.
I would personally like to keep the easter egg, but I assume it is better to rename it to "__peg_parser__".
New changeset 961edf7979ca34d6fe104a1cce005aa8cac35821 by Miss Islington (bot) in branch '3.9': bpo-40939: Generate keyword.py using the new parser (GH-20800) https://github.com/python/cpython/commit/961edf7979ca34d6fe104a1cce005aa8cac35821
> I would personally like to keep the easter egg, but I assume it is better to rename it to "__peg_parser__".
Ok then! On it.
I didn't ask to remove the easter egg. I'm just asking to avoid the "new" name. In my experience, each time that a "new" thing happens, later we have to use "new extended", "new_v2" or a worse name :-)
Oh, if the name changes, please change it in 3.9 as well.
Look at these amazing names of the 5 flavors of functions parsing a string:
    PyParser_ParseString()
    PyParser_ParseStringFlags()
    PyParser_ParseStringFlagsFilename()
    PyParser_ParseStringFlagsFilenameEx() <= public!
    PyParser_ParseStringObject()
Same for parsing a file:
    PyParser_ParseFile()
    PyParser_ParseFileFlags()
    PyParser_ParseFileFlagsEx() <= public!
    PyParser_ParseFileObject()
Or PyRun functions:
    PyRun_String()
    PyRun_AnyFile()
    PyRun_AnyFileEx() <= public!
    PyRun_AnyFileFlags()
    PyRun_SimpleString()
    PyRun_SimpleFile()
    PyRun_SimpleFileEx() <= public!
    PyRun_InteractiveOne()
    PyRun_InteractiveLoop()
    PyRun_File()
    PyRun_FileEx() <= public!
    PyRun_FileFlags()
ceval.c:
    PyEval_EvalCode()
    PyEval_EvalCodeEx() <= public!
    _PyEval_EvalCodeWithName()
    _PyEval_EvalCode()
I cannot count the number of "Ex" functions that we have :-)
    Py_Finalize() -> Py_FinalizeEx() <= public!
    PyErr_Print() -> PyErr_PrintEx() <= public!
    PySys_SetArgv() -> PySys_SetArgvEx() <= public!
    PyErr_WarnEx() <= public!
    _PyBytes_FormatEx()
    _PyDict_MergeEx()
    _Py_DecodeLocaleEx(), _Py_EncodeLocaleEx()
    struct PyMemAllocatorEx
    _Py_DecodeUTF8Ex(), _Py_EncodeUTF8Ex()
    etc.
Honestly I see no reason to keep that easter egg. Can we remove it please?
> New changeset 961edf7979ca34d6fe104a1cce005aa8cac35821 by Miss Islington (bot) in branch '3.9':
> bpo-40939: Generate keyword.py using the new parser (GH-20800)
This change broke this buildbot: AMD64 Arch Linux VintageParser 3.9: https://buildbot.python.org/all/#/builders/765/builds/67
1 test failed: test_keyword
Ok, let's remove it. Lysandros can you modify PR 20802 to remove it?
> Lysandros can you modify PR 20802 to remove it?
Yup!
> This change broke this buildbot:
GH-20802 fixes this breakage.
New changeset bcd7deed9118e365c1225de2a2e1a81bf988c6ab by Lysandros Nikolaou in branch 'master': bpo-40939: Remove PEG parser easter egg (__new_parser__) (#20802) https://github.com/python/cpython/commit/bcd7deed9118e365c1225de2a2e1a81bf988c6ab
New changeset 1ed83adb0e95305af858bd41af531e487f54fee7 by Pablo Galindo in branch 'master': bpo-40939: Remove the old parser (GH-20768) https://github.com/python/cpython/commit/1ed83adb0e95305af858bd41af531e487f54fee7
A few remaining references to the good old times of the old parser:
Programs/_testembed.c:488: putenv("PYTHONOLDPARSER=1");
Programs/_testembed.c:676: putenv("PYTHONOLDPARSER=1");
Tools/scripts/run_tests.py:29: if 'PYTHONOLDPARSER' not in os.environ:
> A few remaining references to the good old times of the old parser:
Thanks Victor, opened https://github.com/python/cpython/pull/20815 to address those.
New changeset 436b648910c27baf8164a6d46d746d36d8a93478 by Pablo Galindo in branch 'master': bpo-40939: Remove some extra references to PYTHONOLDPARSER (GH-20815) https://github.com/python/cpython/commit/436b648910c27baf8164a6d46d746d36d8a93478
New changeset 3782497cc22e70b41e32ac09cb06d3948074d8a7 by Pablo Galindo in branch '3.9': [3.9] bpo-40939: Fix test_keyword for the old parser (GH-20814) https://github.com/python/cpython/commit/3782497cc22e70b41e32ac09cb06d3948074d8a7
New changeset 756180b4bfa09bb77394a2b3754d331181d4f28c by Pablo Galindo in branch 'master': bpo-40939: Clean and adapt the peg_generator directory after deleting the old parser (GH-20822) https://github.com/python/cpython/commit/756180b4bfa09bb77394a2b3754d331181d4f28c
Shouldn't the following files be deleted too?
    Include/bitset.h
    Include/grammar.h
    Include/graminit.h
    Include/parsetok.h
    Include/node.h
    Python/graminit.c
    Parser/node.c
Also declarations:
    PyNode_Compile in Include/compile.h
    PyParser_SimpleParse* in Include/pythonrun.h
And the PyParser_ASTFrom* API needs new implementations.
I'm currently testing a commit that removes all these files on my fork, before I push it upstream. A question that I'm not 100% sure about is if we can already remove the symbol module. I guess it's okay since it got deprecated in 3.9 (bpo-40759) and the old parser is also out, but just to make sure.
You can delete symbol.py -- it has no use now that the old parser is gone. We should probably also update the regeneration targets in the Makefile. (At least review them.)
New changeset 314858e2763e76e77029ea0b691d749c32939087 by Lysandros Nikolaou in branch 'master': bpo-40939: Remove the old parser (Part 2) (GH-21005) https://github.com/python/cpython/commit/314858e2763e76e77029ea0b691d749c32939087
New changeset d301d9473e9a9b78d6e6678e9fe5ef66d46084e1 by Lysandros Nikolaou in branch '3.9': [3.9] bpo-40939: Deprecate the PyParser_SimpleParse* functions (GH-21012) https://github.com/python/cpython/commit/d301d9473e9a9b78d6e6678e9fe5ef66d46084e1
> bpo-40939: Remove the old parser (Part 2) (GH-21005)
This change removes PyNode_Compile(), which is part of the public C API. Would you mind also deprecating it, as you did for the PyParser_xxx() functions?
New changeset 564cd187677ae8d1488c4d8ae649aea34ebbde07 by Lysandros Nikolaou in branch 'master': bpo-40939: Rename PyPegen* functions to PyParser* (GH-21016) https://github.com/python/cpython/commit/564cd187677ae8d1488c4d8ae649aea34ebbde07
Thanks Victor for the catch! I've opened a PR that deprecates PyNode_Compile in 3.9.
New changeset 8ae5e8ec8147e6434454e66953c25848b848711a by Lysandros Nikolaou in branch '3.9': [3.9] bpo-40939: Deprecate PyNode_Compile (GH-21036) https://github.com/python/cpython/commit/8ae5e8ec8147e6434454e66953c25848b848711a
New changeset 8d02f91dc6139a13b6efa9bd5a5b4bdd7ddcc29d by Ned Deily in branch 'master': bpo-40939: run autoreconf to fix configure{,.ac} disparity (GH-21152) https://github.com/python/cpython/commit/8d02f91dc6139a13b6efa9bd5a5b4bdd7ddcc29d
There are some difficulties with removing Grammar/Grammar, since it is used to generate the full grammar in the reference docs (Doc/reference/grammar.rst). Producing a similar grammar from the PEG grammar is currently painful because our PEG grammar contains a number of "invalid_*" rules that just exist to be able to produce more useful error messages. See https://github.com/we-like-parsers/cpython/issues/135.
The continued existence of Grammar/Grammar has confused at least one person, see issue41362.
Grepping for Grammar/Grammar in the docs, there are a few other occurrences:
library/token.rst:14:of the parse tree (terminal tokens). Refer to the file :file:`Grammar/Grammar`
library/parser.rst:43::file:`Grammar/Grammar` in the standard Python distribution. The parse trees
library/symbol.rst:15:names. Refer to the file :file:`Grammar/Grammar` in the Python distribution for
reference/grammar.rst:7:.. literalinclude:: ../../Grammar/Grammar
See also https://github.com/python/cpython/pull/19969 (Pablo's attempt at replacing the grammar in the reference docs with something derived from Grammar/python.gram).
New changeset 72cabb2aa636272e608285f5a6ba83b62be9be4e by Pablo Galindo in branch 'master': bpo-40939: Use the new grammar for the grammar specification documentation (GH-19969) https://github.com/python/cpython/commit/72cabb2aa636272e608285f5a6ba83b62be9be4e
The old parser is completely gone from the 3.10 branch. Closing.
New changeset e6b2d93f0c3891827f609ecac1ced21e1626ed0a by Guido van Rossum in branch '3.9': [3.9] bpo-40939: Use the new grammar for the grammar specification documentation (GH-19969) (#21641) https://github.com/python/cpython/commit/e6b2d93f0c3891827f609ecac1ced21e1626ed0a
New changeset b3fbff7289176ba1a322e6899c3d4a04880ed5a7 by Lysandros Nikolaou in branch 'master': bpo-40939: Remove even more references to the old parser (GH-21642) https://github.com/python/cpython/commit/b3fbff7289176ba1a322e6899c3d4a04880ed5a7
New changeset 5a8364780b7e881385f6fabcf072d599e80f51b8 by Terry Jan Reedy in branch 'master': bpo-41808: Add What's New 3.9 entry missing from master (#22294) https://github.com/python/cpython/commit/5a8364780b7e881385f6fabcf072d599e80f51b8
I am reopening the issue. Would you mind explicitly listing the functions removed from the C API in What's New in Python 3.10? https://docs.python.org/dev/whatsnew/3.10.html#id4
I'm talking about commit 1ed83adb0e95305af858bd41af531e487f54fee7.
For example, the unbound project no longer builds with Python 3.10 because PyParser_SimpleParseFile() has been removed: https://bugzilla.redhat.com/show_bug.cgi?id=1889726
There is no mention of the PyParser_SimpleParseFile() removal in What's New in Python 3.10. There is only a mention that it's being deprecated in What's New in Python 3.9.
> There is no mention of PyParser_SimpleParseFile() removal in What's New in Python 3.10.
By the way, is there a replacement for this function? The unbound project uses it to display a SyntaxError when PyRun_SimpleFile() fails.
Petr Menšík asked: "Could we instead modify PyRun_SimpleFile call to produce just one exception, then print it to stderr once and once into the log? (...) But it seems PyRun_SimpleFile does not throw Exception. Can you recommend variant or flags, which would make it raise an Exception, which log_py_err() would then to log file? After commenting out PyParser_SimpleParseFile it reports None, so it did not already raise an exception."
https://bugzilla.redhat.com/show_bug.cgi?id=1889726#c3
unbound used the removed function PyParser_SimpleParseFile() in pythonmod/pythonmod.c. Extract of unbound-1.12.0.tar.gz:
    if (PyRun_SimpleFile(script_py, pe->fname) < 0) {
        log_err("pythonmod: can't parse Python script %s", pe->fname);
        /* print the error to logs too, run it again */
        fseek(script_py, 0, SEEK_SET);
        /* we don't run the file, like this, because then side-effects
         *    s = PyRun_File(script_py, pe->fname, Py_file_input,
         *        PyModule_GetDict(PyImport_AddModule("__main__")), pe->dict);
         * could happen (again). Instead we parse the file again to get
         * the error string in the logs, for when the daemon has stderr
         * removed. SimpleFile run already printed to stderr, for then
         * this is called from unbound-checkconf or unbound -dd the user
         * has a nice formatted error.
         */
        /* ignore the NULL return of _node, it is NULL due to the parse failure
         * that we are expecting */
        (void)PyParser_SimpleParseFile(script_py, pe->fname, Py_file_input);
        log_py_err();
        PyGILState_Release(gil);
        fclose(script_py);
        return 0;
    }
Honestly that code seems poorly thought out. If running it returns -1, an exception was presumably reported, but not necessarily SyntaxError -- so parsing it may not produce an error at all. The functionality needed is in PyRun_InteractiveOneObjectEx(), but that is not public. :-(
> By the way, is there a replacement for this function? The unbound project uses it to display a SyntaxError when PyRun_SimpleFile() fails.
There is no replacement for the function, because it returned CST nodes and those no longer exist.
Could the removal of the parser module be documented in https://docs.python.org/3.10/whatsnew/3.10.html please?
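For readers hit by the removal of the parser module: the thread states that there is no replacement that returns CST nodes. Code that only needs some tree representation, or only needs to compile the source, can often be ported to the ast module instead. The sketch below is a minimal illustration of that route, not a drop-in replacement; the sample source string and the "<example>" filename are made up for the example.

```python
import ast

# Hypothetical input; any Python source text would do.
source = "x = 1 + 2\n"

# ast.parse() returns an abstract syntax tree (not the concrete syntax
# tree that the removed parser module used to expose).
tree = ast.parse(source, filename="<example>", mode="exec")
assign = tree.body[0]
print(type(assign).__name__)        # Assign
print(type(assign.value).__name__)  # BinOp

# The AST can be compiled and executed like source text.
code = compile(tree, "<example>", "exec")
namespace = {}
exec(code, namespace)
print(namespace["x"])               # 3
```

Tools that relied on exact tokens, comments or whitespace from the CST cannot be ported this way, since the AST does not keep that information.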
Please also document the removal of the node.h header file. The removal of this file broke the build of the two following packages.
mod_wsgi: https://bugzilla.redhat.com/show_bug.cgi?id=1898158
In file included from src/server/mod_wsgi.c:22:
src/server/wsgi_python.h:44:10: fatal error: node.h: No such file or directory
   44 | #include "node.h"
      |          ^~~~~~~~
compilation terminated.
kdevelop-python: https://bugzilla.redhat.com/show_bug.cgi?id=1898116
In file included from /builddir/build/BUILD/kdev-python-5.6.0/parser/astbuilder.cpp:31:
/builddir/build/BUILD/kdev-python-5.6.0/parser/python_header.h:33:10: fatal error: node.h: No such file or directory
   33 | #include "node.h"
      |          ^~~~~~~~
compilation terminated.
I will submit a PR today
Pablo, have you already started on this? I didn't see your comment earlier and I've got a PR ready.
> Pablo, have you already started on this? I didn't see your comment earlier and I've got a PR ready.
Yeah, but don't worry: submit your PR and I will review it :)
New changeset c26d5916d68c47a20dd941f9e89afdaf85b2711e by Lysandros Nikolaou in branch 'master': bpo-40939: Document removal of the old parser in 3.10 whatsnew (GH-23321) https://github.com/python/cpython/commit/c26d5916d68c47a20dd941f9e89afdaf85b2711e
Victor, Miro, both removal of the parser module and of all the C API functions are now documented in the 3.10 whatsnew document. Do you feel that this is enough for us to close this issue again?
Thanks. I feel like the What's new document should teach people what to do when they are hit by the removals. The removals are documented, but the developers who are affected have no clue what to do. What do you think? (Sorry I wasn't able to provide this feedback before the PR was merged.)
> The removals are documented, but the developers who are affected have no clue what to do. What do you think?
Here it is difficult to recommend a canonical path because, as I mentioned, there is no replacement for these functions: they returned CST nodes, and those no longer exist.
We should check how the 3 mentioned projects (mod_wsgi, kdevelop-python, unbound) use the removed functions, in order to suggest a similar replacement. I understood that there is no drop-in replacement.
unbound does not use it to get the parsed Python code as CST; it uses PyParser_SimpleParseFile() just to display an error message to stderr. I understand that PyParser_ASTFromFileObject() + PyErr_Print() could be used.
But PyParser_ASTFromFileObject() is low-level: it requires passing an arena object. Maybe the *intent* here is to call compile() and display the error message? Pseudo-code:
---
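fp = fopen(filename, "r");
bytes = readall(fp);  /* pseudo-helper: read the whole file into a bytes object */
/* PyEval_GetBuiltins() returns a borrowed dict, so look compile() up as an item */
PyObject *builtins = PyEval_GetBuiltins();
PyObject *compile = PyDict_GetItemString(builtins, "compile");  /* borrowed */
/* compile(source, filename, mode) needs all three arguments */
obj = PyObject_CallFunction(compile, "Oss", bytes, filename, "exec");
Py_DECREF(bytes);
if (!obj) {
    PyErr_Print();
}
else {
    Py_DECREF(obj);
}
fclose(fp);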
---
This code is non-trivial :-( Should we provide a *new* C function doing that?
Input: filename
Output: code object
Or maybe I just missed an existing function :-)
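The intent described above (compile the file without running it, and report any syntax error) can be sketched at the Python level as follows. This is an illustration under assumptions rather than a proposed API: the function name syntax_error_report is hypothetical, and returning the formatted message (instead of printing it) is a choice made here so that a caller could send it to a log file.

```python
import traceback


def syntax_error_report(filename):
    """Compile `filename` without executing it.

    Returns None if the file compiles, otherwise the formatted error
    message as a string. (Hypothetical helper, for illustration only.)
    """
    with open(filename, "rb") as fp:
        source = fp.read()
    try:
        # compile() only parses and compiles; no code from the file runs,
        # so there are no side effects from the script itself.
        compile(source, filename, "exec")
    except (SyntaxError, ValueError) as exc:
        # ValueError covers sources with null bytes on older versions.
        return "".join(traceback.format_exception_only(type(exc), exc))
    return None
```

A caller could write the returned string both to stderr and to its own log, which is roughly what the unbound code was trying to achieve by parsing the file a second time.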
> This code is non-trivial :-( Should we provide a *new* C function doing that?
We could discuss adding a new C function, but IMHO that code is not especially horrible or unreadable. I agree it could be simpler, though.
Hi All, It seems this patch removes some functions provided by the Stable ABI (PEP 384), most notably Py_CompileString. Was this the intention? If not, is there still a chance to reintroduce it before the release?
> It seems this patch removes some functions provided by the Stable ABI (PEP 384), most notably Py_CompileString. Was this the intention? If not, is there still a chance to reintroduce it before the release?
The removal of the functions is intentional and was approved by PEP 617.
But the Py_CompileString() function was not removed, it's still in the master branch (future Python 3.10). Why do you think that it has been removed?
> But Py_CompileString() function was not removed, it's still in the master branch (future Python 3.10). Why do you think that it has been removed?
Thank you. It looked that way because of the removed block of lines in the commit 1ed83adb0e95305af858bd41af531e487f54fee7 (pythonrun.c). We were also getting a missing symbol error. We'll check again to be sure.
Attached is a sample program which works on 3.9 but fails linking with 3.10.0a2.
The .so is missing the symbol:
igor@LAPTOP:~/py_limited_api_example$ nm /home/igor/lib/libpython3.9.so | grep Py_CompileString
0000000000212720 T Py_CompileString
000000000020fe30 T Py_CompileStringExFlags
0000000000212730 T Py_CompileStringFlags
000000000020fd40 T Py_CompileStringObject
igor@LAPTOP:~/py_limited_api_example$ nm /home/igor/lib/libpython3.10.so | grep Py_CompileString
0000000000201a40 T Py_CompileStringExFlags
0000000000201980 T Py_CompileStringObject
Please stop breaking the Stable ABI :/
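The comparison done by hand with nm above can be scripted. The sketch below parses two nm listings and reports the symbols exported by the first but missing from the second. The two listings are copied from the message above; the helper name exported_symbols is hypothetical, and the parsing assumes the three-column "address type name" layout shown there.

```python
def exported_symbols(nm_output):
    """Return the names that nm marks with type 'T' (defined, global text)."""
    symbols = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Expect: <address> <type> <name>
        if len(parts) == 3 and parts[1] == "T":
            symbols.add(parts[2])
    return symbols


# nm output for libpython3.9.so, as quoted in the report above.
NM_39 = """\
0000000000212720 T Py_CompileString
000000000020fe30 T Py_CompileStringExFlags
0000000000212730 T Py_CompileStringFlags
000000000020fd40 T Py_CompileStringObject
"""

# nm output for libpython3.10.so (3.10.0a2), as quoted in the report above.
NM_310 = """\
0000000000201a40 T Py_CompileStringExFlags
0000000000201980 T Py_CompileStringObject
"""

missing = sorted(exported_symbols(NM_39) - exported_symbols(NM_310))
print(missing)  # ['Py_CompileString', 'Py_CompileStringFlags']
```

So, going by the quoted listings, two symbols disappeared between the two builds, not only Py_CompileString.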
Hm, I wonder if there's a typo here in pythonrun.c:
/* For use in Py_LIMITED_API */
#undef Py_CompileString
PyObject *
PyCompileString(const char *str, const char *filename, int start)
{
return Py_CompileStringFlags(str, filename, start, NULL);
}
Shouldn't that function be named Py_CompileString (i.e. Py_ instead of Py)?
This seems to be old code, but there's normally a macro Py_CompileString() that translates to Py_CompileStringFlags() in pythonrun.h:
#ifdef Py_LIMITED_API
PyAPI_FUNC(PyObject *) Py_CompileString(const char *, const char *, int);
#else
#define Py_CompileString(str, p, s) Py_CompileStringExFlags(str, p, s, NULL, -1)
#define Py_CompileStringFlags(str, p, s, f) Py_CompileStringExFlags(str, p, s, f, -1)
.
.
.
It seems that commit 1ed83adb0e95305af858bd41af531e487f54fee7 deleted some functions that were not correctly covered by redirection macros. I have opened PR 23606 to restore those as they were.
New changeset 46bd5ed94cf3d5e03f45eecf9afea1659980c8bf by Pablo Galindo in branch 'master': bpo-40939: Restore some stable API functions incorrectly deleted (GH-23606) https://github.com/python/cpython/commit/46bd5ed94cf3d5e03f45eecf9afea1659980c8bf
Thanks for the fix! I'd like to submit a test to avoid this and similar issues in future. Are there any guidelines for this? Sorry if this is a wrong place to ask.
I am currently working on a test that checks that the stable API symbols are correctly exported. Unfortunately there is no officially maintained list of those symbols, so it is taking a while.
> I am currently working on a test that checks that the stable API symbols are correctly exported.
Thank you very much! For added motivation, the 3.8.0 release was unusable thanks to issue37633, which was somewhat similar (also in pythonrun).
For reference, I've attached a list of symbols we currently use (a few more we have to import dynamically since they're not in the stable ABI, but we'd like to keep that list as short as possible).
> Unfortunately there is no officially maintained list of those symbols, so it is taking a while
On Windows there is PC/python3dll.c.
> On Windows there is PC/python3dll.c.
Even that is severely out of date, unfortunately. There are many functions that were removed in 3.8 and 3.7 that are still listed there.
Opened https://bugs.python.org/issue42545 for that
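As a rough idea of what such an export check could look like from Python, the sketch below asks ctypes whether each name can be resolved in the running interpreter. It is only an illustration: the helper name missing_exports is hypothetical, the list of names has to come from somewhere (the thread notes there is no official one), and this is not the test that was eventually written for the issue above.

```python
import ctypes


def missing_exports(names, lib=None):
    """Return the names from `names` that cannot be resolved in `lib`.

    `lib` defaults to ctypes.pythonapi, i.e. the running interpreter.
    ctypes raises AttributeError for unknown symbols, so hasattr() works.
    """
    if lib is None:
        lib = ctypes.pythonapi
    return [name for name in names if not hasattr(lib, name)]


if __name__ == "__main__":
    # Names taken from the report earlier in this thread.
    print(missing_exports(["Py_CompileString", "Py_CompileStringFlags"]))
```

An empty list means every requested symbol was found; on a build affected by the regression discussed here, the missing names would be listed instead.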
FYI the unbound project was fixed by calling Py_CompileString() on Python 3.9 and newer: https://github.com/NLnetLabs/unbound/commit/e0d426ebb10653a78bf5c4053198f6ac19bfcd3e