Issue26581
Created on 2016-03-17 12:00 by serhiy.storchaka, last changed 2016-03-20 21:52 by serhiy.storchaka. This issue is now closed.
| Files | |||
|---|---|---|
| File name | Uploaded | Description |
| tokenize_double_coding.patch | serhiy.storchaka, 2016-03-17 12:00 | |
| Messages (8) | |||
|---|---|---|---|
| msg261909 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2016-03-17 12:00 | |
When a Python source file contains two coding cookies on different lines, the first wins. When it contains two coding cookies on the same line, the last wins. PEP 263 was rather vague about this case; it has now been clarified (22490711c870). The first coding cookie should always win. The proposed patch fixes the Python tokenizer, the tokenize module, and other places. The tests are taken from issue25643. |
|||
| msg262051 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2016-03-19 15:26 | |
I just tested with Emacs, and it looks like when different codings are specified on two different lines, the first coding wins, but when different codings are specified on the same line, the last coding wins. Therefore the current CPython behavior may be correct, and the regular expression in PEP 263 should be changed to use greedy repetition. |
|||
| msg262052 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2016-03-19 16:12 | |
Do you have write permission to the PEP? Just update it. |
|||
| msg262053 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2016-03-19 16:37 | |
Yes, I have. But I was not sure which behavior should be correct in Python. On one hand, always choosing the first declaration (on the same or on different lines) looks more consistent. On the other hand, the current behavior has been in CPython since the initial implementation of PEP 263 in issue526840, and it matches the Emacs behavior (if I understand it correctly). I can update the regular expression, but maybe this obscure corner case needs a verbal explanation. |
|||
| msg262054 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2016-03-19 16:48 | |
Right. Please go ahead with both. I am fine with defining the current behavior correct. --Guido (mobile) |
|||
| msg262089 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2016-03-20 21:29 | |
Ah, I made a mistake! In 2.7 the first coding on the same line wins, and that behavior was there from the start. The regression was unintentionally introduced in issue18470. Thus *there is* a bug in Python 3. PEP 263 doesn't need more changes, but the Python tokenizer and related tools do. Sorry for the confusion. |
|||
| msg262090 - (view) | Author: Roundup Robot (python-dev) | Date: 2016-03-20 21:30 | |
New changeset 23a7481eafd4 by Serhiy Storchaka in branch 'default': Issues #25643, #26581: Added new tests for detecting Python source code encoding. https://hg.python.org/cpython/rev/23a7481eafd4 |
|||
| msg262092 - (view) | Author: Roundup Robot (python-dev) | Date: 2016-03-20 21:51 | |
New changeset 1c44cea2ea8f by Serhiy Storchaka in branch '3.5': Issue #26581: Use the first coding cookie on a line, not the last one. https://hg.python.org/cpython/rev/1c44cea2ea8f New changeset 8506d127d482 by Serhiy Storchaka in branch '2.7': Issue #26581: Use the first coding cookie on a line, not the last one. https://hg.python.org/cpython/rev/8506d127d482 New changeset e86cd4a872b8 by Serhiy Storchaka in branch 'default': Issue #26581: Use the first coding cookie on a line, not the last one. https://hg.python.org/cpython/rev/e86cd4a872b8 |
|||
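The difference between "first cookie wins" and "last cookie wins" on a single line comes down to non-greedy versus greedy repetition in the cookie pattern. The sketch below is only an illustration, not the code from the patch: `FIRST_WINS` follows the expression as PEP 263 states it, `LAST_WINS` is a hypothetical greedy variant written here for comparison, and the helper names `cookie` and `detected` are made up for this example. The `tokenize.detect_encoding` calls assume an interpreter that already includes the fix from this issue.

```python
import io
import re
import tokenize

# Pattern as stated in PEP 263: the non-greedy ".*?" stops at the first cookie.
FIRST_WINS = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")
# Hypothetical greedy variant (comparison only): ".*" runs to the last cookie.
LAST_WINS = re.compile(r"^[ \t\f]*#.*coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def cookie(pattern, line):
    """Return the encoding name captured by `pattern`, or None if no match."""
    match = pattern.match(line)
    return match.group(1) if match else None


same_line = "# coding: latin-1 coding: utf-8"
first = cookie(FIRST_WINS, same_line)  # 'latin-1'
last = cookie(LAST_WINS, same_line)    # 'utf-8'


def detected(source_bytes):
    """Ask the tokenize module which encoding it picks for the given source."""
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# With the fix applied, the first cookie wins in both layouts.
# tokenize reports latin-1 under its normalized name 'iso-8859-1'.
same_line_enc = detected(b"# coding: latin-1 coding: utf-8\n")
two_lines_enc = detected(b"# coding: latin-1\n# coding: utf-8\n")
```

On an interpreter without the fix, `same_line_enc` would be expected to come out as the last cookie instead, which is the regression described above.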
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2016-03-20 21:52:08 | serhiy.storchaka | set | status: open -> closed assignee: serhiy.storchaka resolution: fixed stage: patch review -> resolved |
| 2016-03-20 21:51:22 | python-dev | set | messages: + msg262092 |
| 2016-03-20 21:30:28 | python-dev | set | nosy: + python-dev messages: + msg262090 |
| 2016-03-20 21:29:54 | serhiy.storchaka | set | messages: + msg262089 |
| 2016-03-19 16:48:13 | gvanrossum | set | messages: + msg262054 |
| 2016-03-19 16:37:37 | serhiy.storchaka | set | messages: + msg262053 |
| 2016-03-19 16:12:07 | gvanrossum | set | messages: + msg262052 |
| 2016-03-19 15:26:20 | serhiy.storchaka | set | nosy: + lemburg, loewis messages: + msg262051 |
| 2016-03-17 12:04:22 | serhiy.storchaka | link | issue25643 dependencies |
| 2016-03-17 12:00:36 | serhiy.storchaka | create | |