Issue31800
Created on 2017-10-16 22:51 by mariocj89, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 4015 | merged | mariocj89, 2017-10-16 22:54 | |
| Messages (9) | |||
|---|---|---|---|
| msg304486 - (view) | Author: Mario Corchero (mariocj89) * | Date: 2017-10-16 22:51 | |
Currently, datetime.strptime does not support parsing utc offsets that include a colon. "+0000" is parsed without issues whilst it fails with "+00:00".
"+NN:NN" is not only ISO8601 valid but also the way the offset is presented to the user when using .isoformat on a datetime with a timezone/offset.
This leads users to reach for external libraries like dateutil or iso8601 just to parse the datetime strings that "datetime" itself produces.
Even if a long-term goal would be to provide a way to parse any isoformatted string, this issue only aims to address the problem that %z parsing presents. That alone unblocks users who need to parse datetime objects serialized with isoformat.
With this change, the following will just work:
>>> import datetime as dt
>>> iso_fmt = '%Y-%m-%dT%H:%M:%S%z'
>>> d = dt.datetime.strptime('2004-01-01T10:10:10+05:00', iso_fmt)
*'2004-01-01T10:10:10+05:00' is a sample string generated via datetime.isoformat()
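A minimal round-trip sketch of the behaviour described above, assuming Python 3.7 or later (where this change landed). The variable names are illustrative only.

```python
import datetime as dt

iso_fmt = '%Y-%m-%dT%H:%M:%S%z'

# Build an aware datetime with a +05:00 offset and serialize it.
tz = dt.timezone(dt.timedelta(hours=5))
original = dt.datetime(2004, 1, 1, 10, 10, 10, tzinfo=tz)
text = original.isoformat()  # the offset is rendered with a colon

# Parse the string back; this relies on %z accepting the colon (assumed 3.7+).
parsed = dt.datetime.strptime(text, iso_fmt)
round_trips = parsed == original and parsed.utcoffset() == original.utcoffset()
print(text, round_trips)
```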
Other options, like adding a new %:z directive, were proposed, but having just %z seems much simpler for the user.
Note: There have already been conversations about adding support to datetime for parsing any ISO-formatted string. This is a more simplistic approach. We might be able to get to that situation after this patch, but this aims just to unblock us.
Related:
http://www.loc.gov/standards/datetime/iso-tc154-wg5_n0039_iso_wd_8601-2_2016-02-16.pdf
https://mail.python.org/pipermail/python-ideas/2014-March/027018.html
https://bugs.python.org/issue15873
|
|||
| msg304489 - (view) | Author: Martin Panter (martin.panter) * | Date: 2017-10-17 04:12 | |
FWIW it looks like “strptime” in glibc, and Open and Free BSD support parsing this and even more formats (RFC 822 and RFC 3339; includes “Z”, U.S. time zones, ±HH). Also, there is Issue 24954 for adding “%:z” like Gnu “date”. |
|||
| msg304490 - (view) | Author: Martin Panter (martin.panter) * | Date: 2017-10-17 04:15 | |
Sorry, I meant Net BSD not Free BSD |
|||
| msg304510 - (view) | Author: Mario Corchero (mariocj89) * | Date: 2017-10-17 14:07 | |
Yep, http://man7.org/linux/man-pages/man3/strptime.3.html does support it, even if it might look asymmetrical. Example:
struct tm tm;
char buf[255];
memset(&tm, 0, sizeof(struct tm));
strptime("+00:00", "%z", &tm);
strftime(buf, sizeof(buf), "%z", &tm);
puts(buf); // Will print +0000
exit(EXIT_SUCCESS);
Martin, do you want me to "clean up" the PR, add docs, a news entry, etc.? |
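The same asymmetry can be sketched in Python, assuming a version where %z accepts the colon when parsing (3.7+). The date in the sample string is arbitrary.

```python
import datetime as dt

# Parse an offset written with a colon.
parsed = dt.datetime.strptime("2017-10-17 +00:00", "%Y-%m-%d %z")

# Format it back: strftime's %z emits the offset without a colon.
formatted = parsed.strftime("%z")
print(formatted)
```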
|||
| msg304620 - (view) | Author: Paul Ganssle (p-ganssle) * | Date: 2017-10-19 13:56 | |
This seems very useful to me. I very frequently advise people *against* using dateutil.parser (despite my conflict of interest as maintainer of dateutil) for well-known formats, but the problem frequently comes up of, "what should I do when I have a date created by isoformat()?", to which there's no clean, satisfying answer other than, "use dateutil.parser even though you know the format."
I think the strptime page that Mario linked to is evidence that the %z directive is *intended* to match against -HH:MM, and so that might be the most "standard" solution. That said, I somewhat prefer the granularity of the GNU date extensions %z, %:z and %::z, since this allows downstream users to be stricter about what they are willing to accept.
I think either approach is defensible, but that *something* should be done soon, preferably for the 3.7 release. |
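To illustrate the strictness point: with a single %z that accepts both forms, a caller who wants only the colon form has to check for it separately. A rough sketch, assuming Python 3.7+; `parse_colon_offset` and its regex are hypothetical helpers, not part of the datetime module.

```python
import datetime as dt
import re

# Hypothetical check: the string must end in an offset written as +HH:MM or -HH:MM.
_COLON_OFFSET = re.compile(r"[+-]\d{2}:\d{2}$")

def parse_colon_offset(text, fmt="%Y-%m-%dT%H:%M:%S%z"):
    """Parse text with fmt, rejecting offsets that are not in +HH:MM form."""
    if not _COLON_OFFSET.search(text):
        raise ValueError(f"offset must use the +HH:MM form: {text!r}")
    return dt.datetime.strptime(text, fmt)

print(parse_colon_offset("2004-01-01T10:10:10+05:00"))
```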
|||
| msg304644 - (view) | Author: Mario Corchero (mariocj89) * | Date: 2017-10-19 22:15 | |
As a note, it seems support for the ":" was added to glibc in 2015: http://code.metager.de/source/xref/gnu/glibc/time/strptime_l.c#765 (commit e952e1df). Before that, it basically just ignored the minutes. |
|||
| msg304645 - (view) | Author: Mario Corchero (mariocj89) * | Date: 2017-10-19 22:24 | |
I have a patch to add 'Z' support as well, if we are interested in matching what glibc does (it supports 'Z' too). |
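For reference, the Python 3.7 documentation for %z states that a literal 'Z' is treated the same as '+00:00' when parsing. A short sketch under that assumption, with an arbitrary sample timestamp:

```python
import datetime as dt

fmt = "%Y-%m-%dT%H:%M:%S%z"

# Both spellings of a zero offset should yield the same aware datetime.
with_z = dt.datetime.strptime("2017-10-19T22:24:09Z", fmt)
with_offset = dt.datetime.strptime("2017-10-19T22:24:09+00:00", fmt)
same = with_z == with_offset and with_z.utcoffset() == dt.timedelta(0)
print(same)
```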
|||
| msg304836 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2017-10-23 19:49 | |
Note that #5288 relaxed the whole number of minutes restriction on UTC offsets. Since the goal is to be able to parse the output of .isoformat(), I think %z should accept sub-minute offsets. |
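A sketch of what a sub-minute round trip looks like, assuming Python 3.7+ (where sub-minute offsets are allowed and %z accepts a seconds component). The offset value is made up for illustration.

```python
import datetime as dt

# An offset that is not a whole number of minutes.
tz = dt.timezone(dt.timedelta(hours=5, minutes=30, seconds=15))
original = dt.datetime(2017, 10, 23, 19, 49, 11, tzinfo=tz)

text = original.isoformat()  # the offset includes the seconds component
parsed = dt.datetime.strptime(text, "%Y-%m-%dT%H:%M:%S%z")
print(text, parsed.utcoffset())
```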
|||
| msg305017 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2017-10-26 00:35 | |
New changeset 32318930da70ff03320ec50813b843e7db6fbc2e by Alexander Belopolsky (Mario Corchero) in branch 'master': Closes bpo-31800: Support for colon when parsing time offsets (#4015) https://github.com/python/cpython/commit/32318930da70ff03320ec50813b843e7db6fbc2e |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:53 | admin | set | github: 75981 |
| 2017-10-26 00:35:43 | belopolsky | set | status: open -> closed resolution: fixed messages: + msg305017 stage: patch review -> resolved |
| 2017-10-26 00:10:21 | belopolsky | link | issue24954 dependencies |
| 2017-10-23 19:49:11 | belopolsky | set | nosy: + belopolsky messages: + msg304836 |
| 2017-10-19 22:24:09 | mariocj89 | set | messages: + msg304645 |
| 2017-10-19 22:15:10 | mariocj89 | set | messages: + msg304644 |
| 2017-10-19 13:56:11 | p-ganssle | set | nosy: + p-ganssle messages: + msg304620 |
| 2017-10-17 14:07:22 | mariocj89 | set | messages: + msg304510 |
| 2017-10-17 04:15:00 | martin.panter | set | messages: + msg304490 |
| 2017-10-17 04:12:30 | martin.panter | set | nosy: + martin.panter messages: + msg304489 |
| 2017-10-17 01:37:07 | pablogsal | set | nosy: + pablogsal |
| 2017-10-16 22:54:18 | mariocj89 | set | keywords: + patch stage: patch review pull_requests: + pull_request3989 |
| 2017-10-16 22:51:49 | mariocj89 | create | |