Issue28071
Created on 2016-09-10 22:22 by terry.reedy, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| set_diff_early_out.diff | rhettinger, 2016-09-11 19:53 | review | ||
| Messages (8) | |||
|---|---|---|---|
| msg275707 - (view) | Author: Terry J. Reedy (terry.reedy) * | Date: 2016-09-10 22:22 | |
Proposal from SO: in the iteration loop for 'myset.difference(iterable)', add equivalent of "if not myset: break'. https://stackoverflow.com/questions/39378043/why-does-pythons-set-difference-method-take-time-with-an-empty-set In the toy example for testing that this is currently not so, myset starts empty, but it was noted by the OP that a more realistic example would be that myset becomes empty after deletions. Postulated reasons why not to do this (as opposed to why it has not been done yet ;-): 1) averaged over all uses, the time saved not iterating would be less than the time spent testing emptyness. 2) an implicit guarantee to complete the iteration for possible side-effects. One answer notes that myset.difference(anotherset) is special-cased and faster than the equivalent non-set iterable. I searched the tracker for 'empty set difference' and got no hits. If I remember, I will post any disposition of this issue back to SO. |
|||
| msg275835 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2016-09-11 19:24 | |
This is reasonable, cheap, and not hard to do. I'll whip-up a patch shortly. |
|||
| msg275841 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2016-09-11 19:53 | |
New timings look nice:
$ py -m timeit -s "r = range(10 ** 4); s = set()" "s.difference(r)"
10000000 loops, best of 3: 0.104 usec per loop
$ py -m timeit -s "r = set(range(10 ** 4)); s = set()" "s.difference(r)"
10000000 loops, best of 3: 0.105 usec per loop
$ py -m timeit -s "r = range(10 ** 4); s = set()" "s.difference_update(r)"
10000000 loops, best of 3: 0.0659 usec per loop
$ py -m timeit -s "r = set(range(10 ** 4)); s = set()" "s.difference_update(r)"
10000000 loops, best of 3: 0.0684 usec per loop
|
|||
| msg275962 - (view) | Author: Roundup Robot (python-dev) | Date: 2016-09-12 05:02 | |
New changeset fa0af1a6344d by Raymond Hettinger in branch 'default': Issue #28071: Add early-out for differencing from an empty set. https://hg.python.org/cpython/rev/fa0af1a6344d |
|||
| msg275965 - (view) | Author: Terry J. Reedy (terry.reedy) * | Date: 2016-09-12 05:15 | |
The patch covers sets that start empty, but not sets that become empty during iteration. Are you thinking about or rejecting the latter? (And should this then be closed?) |
|||
| msg275967 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2016-09-12 05:43 | |
I only want a single quick check upfront. It doesn't make much sense to put this in the inner loop. |
|||
| msg306973 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-11-25 20:35 | |
This optimization caused a behavior change: Python 3.5: >>> set().difference([[]]) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unhashable type: 'list' Python 3.6: >>> set().difference([[]]) set() |
|||
| msg345185 - (view) | Author: Karthikeyan Singaravelan (xtreak) * | Date: 2019-06-11 05:50 | |
Behavior change in msg306973 is tracked with issue37219 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:36 | admin | set | github: 72258 |
| 2019-06-11 05:50:18 | xtreak | set | nosy:
+ xtreak messages: + msg345185 |
| 2017-11-25 20:35:11 | serhiy.storchaka | set | messages: + msg306973 |
| 2016-09-12 05:43:41 | rhettinger | set | status: open -> closed resolution: fixed messages: + msg275967 |
| 2016-09-12 05:31:37 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka |
| 2016-09-12 05:15:15 | terry.reedy | set | status: closed -> open resolution: fixed -> (no value) messages: + msg275965 |
| 2016-09-12 05:04:55 | rhettinger | set | status: open -> closed resolution: fixed |
| 2016-09-12 05:02:38 | python-dev | set | nosy:
+ python-dev messages: + msg275962 |
| 2016-09-11 19:53:14 | rhettinger | set | files:
+ set_diff_early_out.diff keywords: + patch messages: + msg275841 stage: needs patch -> patch review |
| 2016-09-11 19:24:08 | rhettinger | set | messages: + msg275835 |
| 2016-09-11 19:19:53 | rhettinger | set | assignee: rhettinger |
| 2016-09-10 22:22:04 | terry.reedy | create | |