
Issue 22721: pprint output for sets and dicts is not stable

Created on 2014-10-24 17:00 by serhiy.storchaka, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (16)
msg229943 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2014-10-24 17:00
pprint() sorts the contents of sets and dicts to produce stable output that doesn't depend on the iteration order of the set or dict, which depends not only on the values of the elements but also on the history of the set or dict.
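
As a small illustration of the stable ordering described above (a sketch assuming a recent CPython, where pprint sorts dict keys by default; the variable names are only illustrative):

```python
import pprint

# Two equal dicts built in different insertion orders (different "history").
d1 = {"b": 1, "a": 2}
d2 = {"a": 2, "b": 1}

assert d1 == d2
assert list(d1) != list(d2)  # iteration order follows insertion order

# pprint sorts the keys, so the formatted text is the same for both.
s1 = pprint.pformat(d1)
s2 = pprint.pformat(d2)
print(s1)
print(s2)
```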

But in some cases the output differs for equal sets or dicts that differ only in their history.

>>> import pprint
>>> class A:  # string 'A' < 'int'
...     def __lt__(self, other): return False
...     def __gt__(self, other): return self != other
...     def __le__(self, other): return self == other
...     def __ge__(self, other): return True
...     def __eq__(self, other): return self is other
...     def __ne__(self, other): return self is not other
...     def __hash__(self): return 1  # == hash(1)
... 
>>> a = A()
>>> sorted([1, a])
[1, <__main__.A object at 0xb700c64c>]
>>> sorted([a, 1])
[1, <__main__.A object at 0xb700c64c>]
>>> # set
>>> pprint.pprint({1, a})
{<__main__.A object at 0xb700c64c>, 1}
>>> pprint.pprint({a, 1})
{1, <__main__.A object at 0xb700c64c>}
>>> # dict
>>> pprint.pprint({1: 1, a: 1})
{1: 1, <__main__.A object at 0xb700c64c>: 1}
>>> pprint.pprint({a: 1, 1: 1})
{<__main__.A object at 0xb700c64c>: 1, 1: 1}

This happens because _safe_key's __lt__() calls the __lt__() method of its left argument and doesn't use the special methods of its right argument. a.__lt__(1) succeeds, but (1).__lt__(a) fails (it returns NotImplemented).

I think that instead of `self.obj.__lt__(other.obj)` there should be `self.obj < other.obj`. Or maybe call other.obj.__gt__(self.obj) if the result of self.obj.__lt__(other.obj) is NotImplemented.

_safe_key was introduced in issue3976.
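
The difference between the two approaches can be compared with a minimal sketch. DirectKey and OperatorKey are hypothetical stand-ins, not the actual pprint source, and the fallback tuple uses the type name rather than str(type(obj)) so that the result does not depend on the module name.

```python
class A:
    """Object from the report: it claims to be greater than anything else."""
    def __lt__(self, other): return False
    def __gt__(self, other): return self is not other
    def __eq__(self, other): return self is other
    def __hash__(self): return 1


def _fallback(obj):
    # Simplified tie-breaker for this sketch: type name, then identity.
    return (type(obj).__name__, id(obj))


class DirectKey:
    """Hypothetical key that calls only the left operand's __lt__()."""
    def __init__(self, obj):
        self.obj = obj

    def __lt__(self, other):
        try:
            rv = self.obj.__lt__(other.obj)
        except TypeError:
            rv = NotImplemented
        if rv is NotImplemented:
            rv = _fallback(self.obj) < _fallback(other.obj)
        return rv


class OperatorKey:
    """Hypothetical key that uses the < operator, so reflected methods run."""
    def __init__(self, obj):
        self.obj = obj

    def __lt__(self, other):
        try:
            return self.obj < other.obj
        except TypeError:
            return _fallback(self.obj) < _fallback(other.obj)


a = A()
direct = (sorted([1, a], key=DirectKey), sorted([a, 1], key=DirectKey))
by_operator = (sorted([1, a], key=OperatorKey), sorted([a, 1], key=OperatorKey))
print(direct[0] == direct[1])            # False: result depends on input order
print(by_operator[0] == by_operator[1])  # True: 1 always sorts before a
```

With the operator, `1 < a` falls through to a.__gt__(1) once int.__lt__() returns NotImplemented, so both input orders agree.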
msg229971 - (view) Author: Antoine Pitrou (pitrou) * Date: 2014-10-25 00:47
Hmm... is it important?
msg229980 - (view) Author: Fred Drake (fdrake) Date: 2014-10-25 04:42
Stability in output order from pprint is very useful in doctests (yes, some people write documentation that they test).

I think fixing any output stability issues would be very worthwhile.
msg229993 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2014-10-25 10:45
> Hmm... is it important?

No more than sorting pprint output at all. This looks like a low-priority issue to me, but the fix looks pretty easy. Here is a patch. I hope Raymond will review it; maybe I missed some details.
msg229999 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2014-10-25 13:41
And here is an alternative patch in case the first patch is not correct. It is more complicated and, I suppose, less efficient in the common case.
msg230161 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * Date: 2014-10-28 17:49
What if [some flavor of] pprint sorted items not by value, but by their repr() string?
It's probably faster than any other algorithm, and guaranteed to produce consistent results.

Or use this idea only for ambiguous cases?
msg230162 - (view) Author: Fred Drake (fdrake) Date: 2014-10-28 17:57
Sorting by the repr sounds good, but if some dict keys or set members are strings containing single-quotes, the primary sort will be on the type of quote used for the repr, which would be surprising and significantly less useful.
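
The quote effect mentioned above is easy to reproduce. This is only an illustration with made-up sample strings:

```python
words = ["banana", "zebra's", "apple"]

# repr() switches to double quotes when the string contains a single quote,
# and '"' (0x22) sorts before "'" (0x27).
by_repr = sorted(words, key=repr)
by_value = sorted(words)

print(by_repr)   # ["zebra's", 'apple', 'banana']
print(by_value)  # ['apple', 'banana', "zebra's"]
```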
msg230671 - (view) Author: Raymond Hettinger (rhettinger) * Date: 2014-11-05 08:55
> the primary sort will be on the type of quote used for the repr,
> which would be surprising and significantly less useful.

How about:  repr(obj).strip("'\"") ?

Overall, the idea of using repr() in some fashion is appealing because it sorts on what the user actually sees.
msg230691 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2014-11-05 16:03
> How about:  repr(obj).strip("'\"") ?

A string can start or end with quotes. And a string repr can be part of the
repr of another type (e.g. a short list).
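
Both objections can be seen in a short sketch. stripped_repr is a hypothetical helper name wrapping the expression quoted above:

```python
def stripped_repr(obj):
    # The key expression suggested in the quoted message.
    return repr(obj).strip("'\"")

# A string that itself starts with a quote loses that quote as well,
# so two different strings collapse to the same key.
key_quoted = stripped_repr("'x")
key_plain = stripped_repr("x")
print(key_quoted, key_plain)  # x x

# Inside a list the quotes are not at the ends, so strip() leaves them alone
# and the quote style still drives the order.
lists = [["a"], ["b'"]]
ordered = sorted(lists, key=stripped_repr)
print(ordered)  # [["b'"], ['a']]
```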
msg230696 - (view) Author: Antoine Pitrou (pitrou) * Date: 2014-11-05 17:05
I think it'd be nice if the solution kept the current order when all keys are orderable (which is a very common case). So IMO repr() should only be used as a fallback when the object comparison fails.
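
One possible reading of this suggestion, as a rough sketch (display_order is a hypothetical helper, not what pprint does):

```python
def display_order(items):
    """Hypothetical helper: natural order if possible, repr() order otherwise."""
    items = list(items)
    try:
        return sorted(items)
    except TypeError:
        # Mixed, unorderable types: fall back to what the user would see.
        return sorted(items, key=repr)

print(display_order({3, 1, 2}))  # [1, 2, 3]
print(display_order([1, "a"]))   # ['a', 1]
```

The second result puts 'a' first because its repr starts with a quote character, which sorts before the digit 1.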
msg232072 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2014-12-03 07:36
My question to Raymond is: should we use the "<" operator or the special methods __lt__ and __gt__ (this is the difference between the alternative patches)?

The use of repr instead of id is a different issue.
msg234877 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2015-01-28 09:20
Ping.
msg239313 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2015-03-26 07:32
Ping.
msg240172 - (view) Author: Fred Drake (fdrake) Date: 2015-04-06 19:28
Sorry for the delay.  pprint_safe_key.patch looks good to me.
msg240174 - (view) Author: Roundup Robot (python-dev) Date: 2015-04-06 19:53
New changeset c8815035116b by Serhiy Storchaka in branch 'default':
Issue #22721: An order of multiline pprint output of set or dict containing
https://hg.python.org/cpython/rev/c8815035116b
msg240175 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2015-04-06 19:54
Thank you for your review, Fred.

History
Date                 User               Action  Args
2022-04-11 14:58:09  admin              set     github: 66910
2015-04-06 19:54:20  serhiy.storchaka   set     status: open -> closed; resolution: fixed; messages: + msg240175; stage: patch review -> resolved
2015-04-06 19:53:08  python-dev         set     nosy: + python-dev; messages: + msg240174
2015-04-06 19:28:00  fdrake             set     messages: + msg240172
2015-03-26 07:32:59  serhiy.storchaka   set     messages: + msg239313
2015-01-28 09:20:29  serhiy.storchaka   set     messages: + msg234877
2014-12-03 07:36:23  serhiy.storchaka   set     messages: + msg232072
2014-11-05 17:05:44  pitrou             set     messages: + msg230696
2014-11-05 16:03:01  serhiy.storchaka   set     messages: + msg230691
2014-11-05 08:55:46  rhettinger         set     messages: + msg230671
2014-11-01 08:17:44  rhettinger         set     assignee: rhettinger -> fdrake
2014-11-01 07:11:18  rhettinger         set     assignee: rhettinger; versions: - Python 3.4
2014-10-28 17:57:40  fdrake             set     messages: + msg230162
2014-10-28 17:49:03  amaury.forgeotdarc set     nosy: + amaury.forgeotdarc; messages: + msg230161
2014-10-25 13:41:01  serhiy.storchaka   set     files: + pprint_safe_key_alt.patch; messages: + msg229999
2014-10-25 10:45:22  serhiy.storchaka   set     files: + pprint_safe_key.patch; keywords: + patch; messages: + msg229993; stage: patch review
2014-10-25 04:42:51  fdrake             set     messages: + msg229980
2014-10-25 00:47:09  pitrou             set     nosy: + pitrou; messages: + msg229971
2014-10-24 17:00:49  serhiy.storchaka   create