Issue38045
Created on 2019-09-06 12:18 by Antony.Lee, last changed 2020-05-14 11:47 by Antony.Lee. This issue is now closed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 16291 | closed | python-dev, 2019-09-19 17:53 | |
| PR 16483 | merged | hongweipeng, 2019-09-30 04:25 | |
| Messages (5) | |||
|---|---|---|---|
| msg351244 - (view) | Author: Antony Lee (Antony.Lee) * | Date: 2019-09-06 12:18 | |
Consider the following example
from enum import Flag
F = Flag("F", list("abcdefghijklm"))
for idx in range(2**len(F) - 1):
F(idx)
creating all possible combos of 13 flags, so 8192 instances (yes, I know the instances are cached later, but the initial cost is still significant). Locally, this takes over 4.5s to execute; profiling shows that ~30% of that time is spent executing enum._is_power_of_two(!):
def _power_of_two(value):
if value < 1:
return False
return value == 2 ** _high_bit(value)
Indeed, replacing the implementation of _power_of_two by
import math
_powers_of_two = {
2 ** i for i in range(math.ceil(math.log2(sys.maxsize)) + 1)}
def _power_of_two(value):
return (value in _powers_of_two if value <= sys.maxsize
else value == 2 ** _high_bit(value))
shaves off ~30% of the runtime (obviously this won't help for huge (>sys.maxsize) flag values, but these are rarer I would think).
An even better fix, I think, would be for Flag to cache `flags_to_check` (updating it at the same time as `_value2member_map_`) instead of recomputing it each time in _decompose
flags_to_check = [
(m, v)
for v, m in list(flag._value2member_map_.items())
if m.name is not None or _power_of_two(v)
]
as this actually needlessly introduces quadratic complexity when iterating over all possible combos (because the size of _value2member_map_ is growing). (Also, it's not actually clear to me how one can end up with unnamed power-of-two instances in _value2member_map?)
|
|||
| msg351246 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2019-09-06 13:08 | |
The fast method to check if the value is a power of two:
not value & (value - 1)
|
|||
| msg351265 - (view) | Author: Antony Lee (Antony.Lee) * | Date: 2019-09-06 18:08 | |
Actually, microoptimizing _power_of_two is a red herring and the problem is the quadratic complexity of _decompose (_value2member_map_ becomes bigger and bigger at each iteration). Replacing the implementation of _decompose by the following fixes the performance issue (the original snippet executes in ~150ms, which may be even more optimizable but already a 30x improvement :)), and still passes test_enum.py (from cpython 3.7.4):
def _decompose(flag, value):
"""Extract all members from the value."""
# _decompose is only called if the value is not named
not_covered = value
members = []
for member in flag:
member_value = member.value
if member_value and member_value & value == member_value:
members.append(member)
not_covered &= ~member_value
if issubclass(flag, IntFlag) and value > 0:
while not_covered:
new_value = 2 ** _high_bit(not_covered)
if new_value not in flag._value2member_map_:
# construct singleton pseudo-members
pseudo_member = int.__new__(flag, value)
pseudo_member._name_ = None
pseudo_member._value_ = value
flag._value2member_map_.setdefault(value, pseudo_member)
members.append(flag(new_value))
not_covered &= ~new_value
if not members and value in flag._value2member_map_:
members.append(flag._value2member_map_[value])
members.sort(key=lambda m: m._value_, reverse=True)
if len(members) > 1 and members[0].value == value:
# we have the breakdown, don't need the value member itself
members.pop(0)
return members, not_covered
Checking for issubclass(flag, IntFlag) is a bit ugly but that's because Flag and IntFlag really need two separate implementations of _decompose -- the former wants to error out on uncovered values, the latter should create them on-the-fly.
|
|||
| msg357535 - (view) | Author: Ethan Furman (ethan.furman) * | Date: 2019-11-26 22:36 | |
New changeset 0b41a922f95f62b620d5a197b9f2ed8e4bb70730 by Ethan Furman (HongWeipeng) in branch 'master': bpo-38045: Improve the performance of _decompose() in enum.py (GH-16483) https://github.com/python/cpython/commit/0b41a922f95f62b620d5a197b9f2ed8e4bb70730 |
|||
| msg368831 - (view) | Author: Antony Lee (Antony.Lee) * | Date: 2020-05-14 11:47 | |
Now fixed, I believe. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2020-05-14 11:47:45 | Antony.Lee | set | status: open -> closed resolution: fixed messages: + msg368831 stage: patch review -> resolved |
| 2019-11-26 22:36:06 | ethan.furman | set | messages: + msg357535 |
| 2019-09-30 04:25:33 | hongweipeng | set | pull_requests: + pull_request16068 |
| 2019-09-21 14:12:22 | xtreak | set | nosy:
+ ethan.furman |
| 2019-09-19 17:53:12 | python-dev | set | keywords:
+ patch stage: patch review pull_requests: + pull_request15876 |
| 2019-09-06 18:08:17 | Antony.Lee | set | messages: + msg351265 |
| 2019-09-06 13:20:41 | vstinner | set | title: Flag instance creation is slow -> enum.Flag instance creation is slow |
| 2019-09-06 13:08:46 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages: + msg351246 |
| 2019-09-06 12:18:00 | Antony.Lee | create | |