Issue32255
Created on 2017-12-08 14:43 by licht-t, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 4769 | merged | licht-t, 2017-12-09 15:18 | |
| PR 4810 | merged | python-dev, 2017-12-12 09:57 | |
| Messages (12) | |||
|---|---|---|---|
| msg307851 - (view) | Author: Licht Takeuchi (licht-t) * | Date: 2017-12-08 14:43 | |
Inconsistent behavior while writing a single-column CSV.
I have a patch and am waiting for the CLA response.
# Case 1
## Input
```
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp)
w.writerow([''])
w.writerow(['1'])
fp.close()
```
## Output
```
""
1
```
# Case 2
## Input
```
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp)
w.writerow(['1'])
w.writerow([''])
fp.close()
```
## Output
```
1
```
|
|||
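The two cases above can also be compared without touching the filesystem. The following is a minimal sketch, assuming `io.StringIO` as a stand-in for the file; the helper name `render` is hypothetical and not part of the original report. It only checks whether the line written for `['']` depends on its position, and the actual text printed depends on the Python version in use.

```python
import csv
import io


def render(rows):
    """Write rows with csv.writer into an in-memory buffer and return the raw text."""
    buf = io.StringIO(newline='')
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


case1 = render([[''], ['1']])  # empty field written on the first line
case2 = render([['1'], ['']])  # empty field written on the second line

print(repr(case1))
print(repr(case2))

# The line produced for [''] should not depend on where it appears in the file.
consistent = case1.splitlines()[0] == case2.splitlines()[1]
print('consistent:', consistent)
```

On an affected interpreter the last line is expected to report `False`; once the behavior is consistent it reports `True`.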
| msg307939 - (view) | Author: Nitish (nitishch) * | Date: 2017-12-10 03:18 | |
Which scenario do you think is the wrong behaviour in this case: the first or the second? I don't know much about the csv module, but I thought quoting all empty lines was a deliberate choice, and hence considered the second scenario buggy. But your pull request seems to fix the first case. Am I missing something here? |
|||
| msg307940 - (view) | Author: Licht Takeuchi (licht-t) * | Date: 2017-12-10 05:06 | |
I think the first one is buggy, for two reasons.
1. Both are valid CSV, so the double quoting is unnecessary. Some other applications, e.g. Excel, do not use the double quoting.
Also, the current implementation quotes only if the string is '' and it is written on the first line.
2. '' is not quoted in the two-column case.
## Input:
```
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp, dialect=None)
w.writerow(['', ''])
w.writerow(['3', 'a'])
fp.close()
```
## Output:
```
,
3,a
```
These seem inconsistent and the quoting is unnecessary in this case.
# References
http://www.ietf.org/rfc/rfc4180.txt
|
|||
| msg307941 - (view) | Author: Licht Takeuchi (licht-t) * | Date: 2017-12-10 05:15 | |
The current implementation does not quote in most cases. In other words, a patch that quotes every '' would be a breaking change (note that some applications do not use quoting). |
|||
| msg307984 - (view) | Author: R. David Murray (r.david.murray) * | Date: 2017-12-10 20:29 | |
The second case is indeed the bug, as can be seen by running the examples against python2.7. It looks like this was probably broken by 7901b48a1f89 from issue 23171. |
|||
| msg307986 - (view) | Author: R. David Murray (r.david.murray) * | Date: 2017-12-10 20:31 | |
Serhiy, since it was your patch that probably introduced this bug, can you take a look? Obviously it isn't a very high priority bug, since no one has reported a problem (even this issue isn't reporting the change in behavior as a *problem* :) |
|||
| msg307997 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-12-10 22:25 | |
To restore the 3.4 behavior, the single empty field must be quoted. This makes it possible to distinguish a 1-element row with a single empty field from an empty row. |
|||
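The distinction described above can be illustrated from the reading side. This is a minimal sketch, not taken from the patch: the helper name `round_trip` is hypothetical, and the round-trip check only holds on interpreters where a single empty field is written quoted.

```python
import csv
import io

# Reading: a blank line parses as an empty row, while a quoted empty
# string parses as a row containing one empty field.
parsed = list(csv.reader(['', '""', '1']))
print(parsed)


def round_trip(rows):
    """Write rows to an in-memory buffer, then read them back with csv.reader."""
    buf = io.StringIO(newline='')
    csv.writer(buf).writerows(rows)
    buf.seek(0)
    return list(csv.reader(buf))


rows = [['1'], [''], []]
# True only if [''] and [] are written differently, so the reader can
# tell them apart.
print(round_trip(rows) == rows)
```

If `['']` were written as a bare line terminator, it would come back as `[]` and the comparison would fail, which is the ambiguity the quoting avoids.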
| msg308009 - (view) | Author: Licht Takeuchi (licht-t) * | Date: 2017-12-11 00:20 | |
Thanks for your investigation! Would you mind if I create a new patch? |
|||
| msg308050 - (view) | Author: Licht Takeuchi (licht-t) * | Date: 2017-12-11 15:05 | |
The PR has now been updated to follow the Python 2.7 behavior! |
|||
| msg308102 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-12-12 09:57 | |
New changeset 2001900b0c02a397d8cf1d776a7cc7fcb2a463e3 by Serhiy Storchaka (Licht Takeuchi) in branch 'master': bpo-32255: Always quote a single empty field when write into a CSV file. (#4769) https://github.com/python/cpython/commit/2001900b0c02a397d8cf1d776a7cc7fcb2a463e3 |
|||
| msg308103 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-12-12 09:58 | |
Thank you for your contribution Licht! |
|||
| msg308109 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-12-12 10:56 | |
New changeset ce5a3cd9b15c9379753aefabd696bff11495cbbb by Serhiy Storchaka (Miss Islington (bot)) in branch '3.6': bpo-32255: Always quote a single empty field when write into a CSV file. (GH-4769) (#4810) https://github.com/python/cpython/commit/ce5a3cd9b15c9379753aefabd696bff11495cbbb |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:55 | admin | set | github: 76436 |
| 2017-12-12 10:56:58 | serhiy.storchaka | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
| 2017-12-12 10:56:43 | serhiy.storchaka | set | messages: + msg308109 |
| 2017-12-12 09:58:03 | serhiy.storchaka | set | messages: + msg308103 |
| 2017-12-12 09:57:18 | python-dev | set | stage: needs patch -> patch review pull_requests: + pull_request4705 |
| 2017-12-12 09:57:09 | serhiy.storchaka | set | messages: + msg308102 |
| 2017-12-11 15:05:12 | licht-t | set | messages: + msg308050 |
| 2017-12-11 00:20:24 | licht-t | set | messages: + msg308009 |
| 2017-12-10 22:25:42 | serhiy.storchaka | set | messages: + msg307997 |
| 2017-12-10 20:31:03 | r.david.murray | set | nosy: + serhiy.storchaka messages: + msg307986 |
| 2017-12-10 20:29:10 | r.david.murray | set | versions: - Python 2.7, Python 3.4, Python 3.5, Python 3.8 nosy: + r.david.murray messages: + msg307984 components: + Library (Lib), - IO |
| 2017-12-10 05:15:33 | licht-t | set | messages: + msg307941 |
| 2017-12-10 05:06:24 | licht-t | set | messages: + msg307940 |
| 2017-12-10 03:18:20 | nitishch | set | messages: + msg307939 |
| 2017-12-09 15:18:18 | licht-t | set | keywords: + patch stage: patch review pull_requests: + pull_request4672 |
| 2017-12-08 18:50:14 | nitishch | set | nosy: + nitishch |
| 2017-12-08 14:43:54 | licht-t | create | |