Issue23488
Created on 2015-02-20 10:18 by rhettinger, last changed 2015-05-25 22:03 by rhettinger. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| random_uint32.patch | serhiy.storchaka, 2015-02-23 08:14 | review | ||
| random_uint32_2.patch | serhiy.storchaka, 2015-03-27 07:58 | review | ||
| Messages (19) | |||
|---|---|---|---|
| msg236262 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2015-02-20 10:18 | |
The Modules/_randommodule.c implements the 32-bit version of the MersenneTwister and its struct uses (unsigned long) for each of the 624 elements of the state vector. On a 32-bit build, the unsigned longs are 4 bytes. However, on a 64-bit build, they are 8 bytes each eventhough only the bottom 32-bits are used. This causes the random object to be twice as big as necessary. sys.getsizeof(_random.Random()) reports 5016 bytes. This wastes memory, grinds the cache, and slows performance. The (unsigned long) declaration should probably be replaced with (uint32_t). |
|||
| msg236263 - (view) | Author: STINNER Victor (vstinner) * | Date: 2015-02-20 10:33 | |
> The (unsigned long) declaration should probably be replaced with (uint32_t) Would it be possible to benchmark this change, to ensure that it doesn't kill performances? A quick micro-benchmark using timeit should be enough ;) I agree with the change, I already noticed the unused bits long time ago, when I took at look at the Mersenne Twister implementation. |
|||
| msg236264 - (view) | Author: STINNER Victor (vstinner) * | Date: 2015-02-20 10:33 | |
Oh, by the way, using 32 bits unsigned integers would avoid all the "& 0xffffffff" everywhere. |
|||
| msg236265 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2015-02-20 10:42 | |
Yes, I noticed this when reimplementing the random module in Numba. *Theoretically*, I think you need "long" to ensure ints are at least 32 bits. But in practice, I think CPython already needs 32-bit C ints. (note Numpy also uses C longs internally) > Would it be possible to benchmark this change, to ensure that it doesn't kill performances? There is no way it can kill performance. |
|||
| msg236306 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-02-20 16:32 | |
Here is a patch. It also optimizes getrandbit() and seed() as was originally proposed in issue16496. |
|||
| msg236428 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2015-02-23 07:51 | |
> Here is a patch. Where?! BYW, we might want to use PY_UINT32_T rather than uint32_t directly. |
|||
| msg236429 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-02-23 08:14 | |
Oh, sorry, here is it. |
|||
| msg236430 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-02-23 08:29 | |
Some microbenchmark results on 32-bit Linux: $ ./python -m timeit -s "from random import getrandbits" -- "getrandbits(64)" Before: 1000000 loops, best of 3: 1.41 usec per loop After: 1000000 loops, best of 3: 1.34 usec per loop $ ./python -m timeit -s "from random import getrandbits" -- "getrandbits(2048)" Before: 100000 loops, best of 3: 5.84 usec per loop After: 100000 loops, best of 3: 5.61 usec per loop $ ./python -m timeit -s "from random import getrandbits" -- "getrandbits(65536)" Before: 10000 loops, best of 3: 145 usec per loop After: 10000 loops, best of 3: 137 usec per loop |
|||
| msg236432 - (view) | Author: STINNER Victor (vstinner) * | Date: 2015-02-23 09:50 | |
The patch looks good to me. For utint32_t, see my old issue #17884: "Try to reuse stdint.h types like int32_t". |
|||
| msg236839 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2015-02-27 21:01 | |
> see my old issue #17884 ... and you can also re-read my explanations in that issue about why simply using uint32_t and int32_t doesn't work! We need something like PY_UINT32_T (and co) for portability. The only part of #17884 that's still valid is that it may well make sense to insist that exact-width 32-bit and 64-bit signed and unsigned integer types are available when building Python. |
|||
| msg239379 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-03-27 07:58 | |
Updated patch addresses some Victor's comments. |
|||
| msg243051 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-05-13 07:19 | |
Ping. |
|||
| msg243052 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2015-05-13 07:57 | |
You should fix the comment as mentioned in the review, otherwise looks good to me. |
|||
| msg243060 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2015-05-13 09:08 | |
This is good to go. |
|||
| msg243061 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2015-05-13 09:09 | |
This would have gone quicker if the size bug-fix hadn't been commingled with the optimization. |
|||
| msg243075 - (view) | Author: Roundup Robot (python-dev) | Date: 2015-05-13 12:04 | |
New changeset 4b5461dcd190 by Serhiy Storchaka in branch 'default': Issue #23488: Random generator objects now consume 2x less memory on 64-bit. https://hg.python.org/cpython/rev/4b5461dcd190 |
|||
| msg243455 - (view) | Author: Roundup Robot (python-dev) | Date: 2015-05-18 04:47 | |
New changeset 16d0e3dda31c by Zachary Ware in branch 'default': Issue #23488: Fix a syntax error on big endian platforms. https://hg.python.org/cpython/rev/16d0e3dda31c |
|||
| msg243461 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-05-18 06:36 | |
Thanks Zachary for fixing this. |
|||
| msg244050 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2015-05-25 22:03 | |
Should this be backported? IMO, it is a bug. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2015-05-25 22:03:43 | rhettinger | set | messages: + msg244050 |
| 2015-05-18 06:36:53 | serhiy.storchaka | set | messages: + msg243461 |
| 2015-05-18 04:47:20 | python-dev | set | messages: + msg243455 |
| 2015-05-13 12:06:17 | serhiy.storchaka | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
| 2015-05-13 12:04:52 | python-dev | set | nosy:
+ python-dev messages: + msg243075 |
| 2015-05-13 09:09:49 | rhettinger | set | messages: + msg243061 |
| 2015-05-13 09:08:42 | rhettinger | set | messages: + msg243060 |
| 2015-05-13 07:57:39 | pitrou | set | assignee: rhettinger -> serhiy.storchaka messages: + msg243052 |
| 2015-05-13 07:19:09 | serhiy.storchaka | set | messages: + msg243051 |
| 2015-03-27 07:58:55 | serhiy.storchaka | set | files:
+ random_uint32_2.patch messages: + msg239379 |
| 2015-02-27 21:01:52 | mark.dickinson | set | messages: + msg236839 |
| 2015-02-23 09:50:48 | vstinner | set | messages: + msg236432 |
| 2015-02-23 08:29:34 | serhiy.storchaka | set | messages: + msg236430 |
| 2015-02-23 08:14:17 | serhiy.storchaka | set | files:
+ random_uint32.patch keywords: + patch messages: + msg236429 |
| 2015-02-23 07:51:58 | mark.dickinson | set | messages: + msg236428 |
| 2015-02-20 16:32:16 | serhiy.storchaka | set | messages:
+ msg236306 stage: patch review |
| 2015-02-20 10:42:33 | pitrou | set | versions: - Python 2.7, Python 3.4 |
| 2015-02-20 10:42:26 | pitrou | set | nosy:
+ pitrou, mark.dickinson messages: + msg236265 |
| 2015-02-20 10:33:59 | vstinner | set | messages: + msg236264 |
| 2015-02-20 10:33:31 | vstinner | set | nosy:
+ vstinner messages: + msg236263 |
| 2015-02-20 10:18:59 | rhettinger | set | components: + Extension Modules |
| 2015-02-20 10:18:42 | rhettinger | create | |