Issue30157
Created on 2017-04-25 01:51 by jcdavis1983, last changed 2018-02-09 22:03 by serhiy.storchaka. This issue is now closed.
| Pull Requests | ||
|---|---|---|
| URL | Status | Linked |
| PR 1273 | closed | jcdavis1983, 2017-04-25 01:51 |
| PR 5601 | merged | serhiy.storchaka, 2018-02-09 17:09 |
| PR 5602 | merged | miss-islington, 2018-02-09 18:02 |
| PR 5603 | merged | serhiy.storchaka, 2018-02-09 18:11 |
| PR 5604 | merged | serhiy.storchaka, 2018-02-09 18:16 |
| Messages (13) | |||
|---|---|---|---|
| msg292249 | Author: Jake Davis (jcdavis1983) | Date: 2017-04-25 01:51 | |
Line 220 of Lib/csv.py has an extra `>` in the first group: r'(?P<delim>>[^\w\n"\']) |
|||
| msg292254 | Author: STINNER Victor (vstinner) | Date: 2017-04-25 08:39 | |
What is the consequence of this change? Does it change the syntax of the parser? Which kind of format wasn't parsed correctly? |
|||
| msg292267 | Author: Matthew Barnett (mrabarnett) | Date: 2017-04-25 16:02 | |
There are 4 patterns. They try to determine the delimiter and quote by looking for matches. Each pattern supposedly covers one of 4 cases:
1. Delimiter, quote, value, quote, delimiter.
2. Start of line/text, quote, value, quote, delimiter.
3. Delimiter, quote, value, quote, end of line/text.
4. Start of line/text, quote, value, quote, end of line/text.
On that basis, case 3 looks wrong because the pattern for the delimiter is:
>[^\w\n"\']
instead of the expected:
[^\w\n"\']
Looks like a bug to me.
|
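The effect of the stray `>` can be reproduced in isolation. The sketch below is a minimal illustration, not the code in Lib/csv.py: only the two delimiter groups are taken from this report, while the simplified tail of the pattern, the `re.MULTILINE` flag, the helper `find_delims`, and the sample strings are assumptions made for the demonstration.

```python
import re

# Delimiter groups as quoted in this issue: reported (stray '>') vs. expected.
BUGGY_DELIM = r'(?P<delim>>[^\w\n"\'])'
FIXED_DELIM = r'(?P<delim>[^\w\n"\'])'

# Assumed, simplified tail for case 3: quote, value, same quote, end of line.
CASE3_TAIL = r'(?P<quote>["\']).*?(?P=quote)$'


def find_delims(delim_group, text):
    """Return the delimiter captured by each case-3 match in text."""
    pattern = re.compile(delim_group + CASE3_TAIL, re.MULTILINE)
    return [m.group('delim') for m in pattern.finditer(text)]


plain = 'a,"b"\nc,"d"\n'   # quoted value at the end of each line
with_gt = 'a>,"b"\n'       # same shape, with a literal '>' before the comma

print(find_delims(BUGGY_DELIM, plain))    # []
print(find_delims(FIXED_DELIM, plain))    # [',', ',']
print(find_delims(BUGGY_DELIM, with_gt))  # ['>,']
print(find_delims(FIXED_DELIM, with_gt))  # [',']
```

With the extra `>`, the group only matches when a literal `>` precedes the delimiter, and the captured "delimiter" is then two characters long, so ordinary input of this shape is never recognised by this pattern.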
|||
| msg292282 | Author: STINNER Victor (vstinner) | Date: 2017-04-25 22:41 | |
Can you please try to write a unit test to guard against regressions? Or at least give an example? |
|||
| msg292290 | Author: R. David Murray (r.david.murray) | Date: 2017-04-26 01:02 | |
If it is a bug, that indicates there is at least one missing unit test :) Maybe the OP will contribute a test. |
|||
| msg292294 | Author: Jake Davis (jcdavis1983) | Date: 2017-04-26 02:59 | |
Will do! I will try to get a regression-proof test into test_csv.py in the next 24 hours. Essentially, I will make sure that the sniffer returns a positive match for each of the patterns that the regex is intended to hit. |
|||
| msg292434 | Author: Jake Davis (jcdavis1983) | Date: 2017-04-27 12:43 | |
I've added some unit tests for Sniffer._guess_quote_and_delimiter(); they should prevent regression. |
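A regression test of the kind described might look like the sketch below. It is not the test added in PR 1273 or PR 5601: the class name, method name, and sample data are hypothetical, and it calls the private `_guess_quote_and_delimiter()` helper with an assumed `(data, delimiters)` signature and return order, which may differ between Python versions. On an interpreter that predates the fix, the assertions would be expected to fail.

```python
import csv
import unittest


class SnifferCase3Test(unittest.TestCase):
    """Hypothetical check for case 3: delimiter, quote, value, quote, end of line."""

    def test_quoted_value_at_end_of_line(self):
        # The quoted field appears only at the end of each line.
        data = 'a,"b"\nc,"d"\n'
        sniffer = csv.Sniffer()
        # Private helper; assumed to return
        # (quotechar, doublequote, delimiter, skipinitialspace).
        result = sniffer._guess_quote_and_delimiter(data, None)
        quotechar, _doublequote, delimiter, _skipinitialspace = result
        self.assertEqual(quotechar, '"')
        self.assertEqual(delimiter, ',')
```

Saved in a test module, this can be run with `python -m unittest`.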
|||
| msg311898 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2018-02-09 17:11 | |
Since the original author didn't respond for a long time, I have recreated PR 1273 as PR 5601. |
|||
| msg311902 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2018-02-09 18:00 | |
New changeset 2411292ba8155327125d8a1da8a4c9fa003d5909 by Serhiy Storchaka in branch 'master': bpo-30157: Fix csv.Sniffer.sniff() regex pattern. (GH-5601) https://github.com/python/cpython/commit/2411292ba8155327125d8a1da8a4c9fa003d5909 |
|||
| msg311908 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2018-02-09 22:00 | |
New changeset 2ef69a1d45de8aa41c45d32d9ee1ff227bb1a566 by Serhiy Storchaka (Miss Islington (bot)) in branch '3.7': bpo-30157: Fix csv.Sniffer.sniff() regex pattern. (GH-5601) (GH-5602) https://github.com/python/cpython/commit/2ef69a1d45de8aa41c45d32d9ee1ff227bb1a566 |
|||
| msg311909 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2018-02-09 22:01 | |
New changeset 504f19145ca5738162d6a720fa45b364ac8c0384 by Serhiy Storchaka in branch '3.6': [3.6] bpo-30157: Fix csv.Sniffer.sniff() regex pattern. (GH-5601) (GH-5603) https://github.com/python/cpython/commit/504f19145ca5738162d6a720fa45b364ac8c0384 |
|||
| msg311910 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2018-02-09 22:02 | |
New changeset e7197936c987bdf31b6b7b1dab275d1a762e03b3 by Serhiy Storchaka in branch '2.7': [2.7] bpo-30157: Fix csv.Sniffer.sniff() regex pattern. (GH-5601) (GH-5604) https://github.com/python/cpython/commit/e7197936c987bdf31b6b7b1dab275d1a762e03b3 |
|||
| msg311911 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2018-02-09 22:03 | |
Thank you for your contribution, Jake! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2018-02-09 22:03:21 | serhiy.storchaka | set | status: open -> closed resolution: fixed messages: + msg311911 stage: patch review -> resolved |
| 2018-02-09 22:02:07 | serhiy.storchaka | set | messages: + msg311910 |
| 2018-02-09 22:01:42 | serhiy.storchaka | set | messages: + msg311909 |
| 2018-02-09 22:00:56 | serhiy.storchaka | set | messages: + msg311908 |
| 2018-02-09 18:16:06 | serhiy.storchaka | set | pull_requests: + pull_request5416 |
| 2018-02-09 18:11:25 | serhiy.storchaka | set | pull_requests: + pull_request5415 |
| 2018-02-09 18:02:01 | miss-islington | set | pull_requests: + pull_request5414 |
| 2018-02-09 18:00:51 | serhiy.storchaka | set | messages: + msg311902 |
| 2018-02-09 17:11:49 | serhiy.storchaka | set | versions: + Python 3.8, - Python 3.3, Python 3.4, Python 3.5 |
| 2018-02-09 17:11:35 | serhiy.storchaka | set | nosy: + serhiy.storchaka messages: + msg311898 |
| 2018-02-09 17:09:59 | serhiy.storchaka | set | keywords: + patch stage: patch review pull_requests: + pull_request5413 |
| 2017-04-27 12:43:09 | jcdavis1983 | set | messages: + msg292434 |
| 2017-04-26 02:59:51 | jcdavis1983 | set | messages: + msg292294 |
| 2017-04-26 01:02:30 | r.david.murray | set | nosy: + r.david.murray messages: + msg292290 |
| 2017-04-25 22:41:07 | vstinner | set | messages: + msg292282 |
| 2017-04-25 16:02:43 | mrabarnett | set | nosy: + mrabarnett messages: + msg292267 |
| 2017-04-25 08:39:01 | vstinner | set | nosy: + vstinner messages: + msg292254 |
| 2017-04-25 03:25:04 | louielu | set | title: csn.Sniffer.sniff() regex error -> csv.Sniffer.sniff() regex error |
| 2017-04-25 01:51:06 | jcdavis1983 | create | |