
Issue 34867: Add mode to disable small integer and interned string caches

Created on 2018-10-02 00:40 by steven.daprano, last changed 2022-04-11 14:59 by admin.

Messages (10) msg326838 - (view) Author: Steven D'Aprano (steven.daprano) * Date: 2018-10-02 00:40
Split off from #34850 by Guido's request.

To help catch incorrect use of `is` when `==` is intended, perhaps we should add an interpreter mode that disables the caches for small ints and interned strings.

Nathaniel called it "chaos mode" but I don't like the name as there is nothing chaotic about the lack of such caches, and it doesn't come close to chaos testing (e.g. Netflix's Chaos Monkey tool).
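
As a minimal sketch of the kind of mistake such a mode is meant to expose, the snippet below compares equal integers with both `==` and `is`. The helper name `same_object` and the sample values are illustrative only. The integers are built at run time so that the compiler cannot merge equal constants. On current CPython builds the small pair usually turns out to be the same object while the large pair usually does not, but the language guarantees neither result, so only the `==` column is reliable.

```python
def same_object(a, b):
    """Return a pair (equal, identical) for two values."""
    return a == b, a is b


# Build the ints at run time; literals in one code object may be merged.
small_a, small_b = int("100"), int("100")
large_a, large_b = int("100000"), int("100000")

# The first element of each pair is always True.
# The second element depends on the implementation's caching.
print("small:", same_object(small_a, small_b))
print("large:", same_object(large_a, large_b))
```

Code that uses `is` here works only by accident for the cached values, which is why disabling the cache would make the bug visible in tests.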
msg326843 - (view) Author: Ammar Askar (ammar2) * Date: 2018-10-02 01:07
Maybe something more akin to UndefinedBehaviorSanitizer? Since it's supposed to be catching implementation-specific quirks. It wouldn't really be sanitizing though, more just making the bugs more likely to appear.
msg326850 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2018-10-02 04:31
Adding a runtime option will hurt the performance of normal execution. And it is impossible to disable interning of strings completely, because some core code depends on it. I also have concerns about disabling caching of the empty string.

There are also other caches on different levels.
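
To illustrate what string interning means at the Python level, here is a small sketch using `sys.intern`. The variable names and the sample string are illustrative only. Two equal strings built at run time may or may not be the same object, but `sys.intern` returns one canonical object for equal strings, so identity holds for the interned results.

```python
import sys

parts = ["cache", "_", "demo"]
a = "".join(parts)  # built at run time
b = "".join(parts)  # equal value, possibly a distinct object

print(a == b)  # equality is always reliable
print(a is b)  # implementation-dependent; do not rely on it

# sys.intern maps equal strings to a single canonical object.
ia, ib = sys.intern(a), sys.intern(b)
print(ia is ib)
```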
msg326852 - (view) Author: Ammar Askar (ammar2) * Date: 2018-10-02 04:34
Serhiy, take a look at the linked ticket. The idea is that something like pytest or libregrtest will use this to bring underlying bugs to the surface. It isn't intended to be used in normal execution.
msg326853 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2018-10-02 04:38
I'm not worried about performance when the caches are disabled. The additional check will hurt performance in normal execution.
msg326854 - (view) Author: Ammar Askar (ammar2) * Date: 2018-10-02 04:40
Aah sorry, I misinterpreted what you meant. The original ticket proposes it as a compile time flag as well.
msg326860 - (view) Author: Raymond Hettinger (rhettinger) * Date: 2018-10-02 06:34
I don't think this option will be of any value.  For it to work, the code would need to have this particular bug, have test cases that triggered those bugs, and a user sophisticated enough to run the tests but unsophisticated enough to make beginner mistakes regarding when to use identity tests versus equality tests (something I teach on day one of beginner Python courses).

Before this goes further, I would like to see some evidence that it would actually catch a real bug in the wild.
msg327073 - (view) Author: Neil Schemenauer (nascheme) * Date: 2018-10-04 18:20
Wouldn't turning these off hurt performance a lot?  If so, I don't know if people would actually use such a mode.  Then it becomes pretty useless.  Could we combine this idea with the PYTHONDEVMODE flag?  If PYTHONDEVMODE is turned on, we could do a check like Serhiy suggests for inappropriate 'is' comparisons.  That seems more useful to me.
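
One way to picture a check for inappropriate 'is' comparisons is a static scan of the source. The sketch below is only an illustration built on the standard `ast` module and is not the check proposed in this thread. The function name `find_literal_is` and the sample source are hypothetical. It reports `is` / `is not` comparisons where either operand is an int, str, or bytes literal.

```python
import ast


def find_literal_is(source, filename="<string>"):
    """Return sorted (line, column) pairs of 'is' comparisons with a literal."""
    tree = ast.parse(source, filename=filename)
    findings = set()
    for node in ast.walk(tree):
        if not isinstance(node, ast.Compare):
            continue
        # A chained comparison a is b is c has operands [a, b, c].
        operands = [node.left, *node.comparators]
        for i, op in enumerate(node.ops):
            if not isinstance(op, (ast.Is, ast.IsNot)):
                continue
            for operand in (operands[i], operands[i + 1]):
                if (
                    isinstance(operand, ast.Constant)
                    and isinstance(operand.value, (int, str, bytes))
                    and not isinstance(operand.value, bool)  # skip True/False
                ):
                    findings.add((node.lineno, node.col_offset))
    return sorted(findings)


sample = (
    "def f(x, y):\n"
    "    if x is 1000:\n"
    "        return 'int'\n"
    "    if y is not 'done':\n"
    "        return 'str'\n"
    "    return x is None\n"
)

for line, col in find_literal_is(sample):
    print(f"line {line}, column {col}: 'is' used with a literal")
```

A scan like this only sees literals. It cannot detect `a is b` where both names happen to hold equal ints or strings, which is the case a cache-disabling mode would target.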
msg327093 - (view) Author: Gregory P. Smith (gregory.p.smith) * Date: 2018-10-04 22:26
The intent is to enable this only during testing / continuous integration.
msg330823 - (view) Author: Terry J. Reedy (terry.reedy) * Date: 2018-11-30 20:00
Steven, thank you for splitting this off for proper discussion.

To me, the base issue is that CPython is both the language reference implementation and, as yet, the main production implementation.  As the latter, it has unintended and unwanted bugs and intentional optimizations added for performance rather than language conformance.  Some of these, like caching, affect boolean results involving 'is' and id().  Problems arise when people confuse reference features with implementation features.

This issue proposes adding a mode that turns off certain optimization features.  There is another proposal to turn off other optimizations (again during code analysis and testing) that affect tracing results and sometimes coverage results based thereon, giving false negatives.  In either case, I see the result as a 'language reference' mode.  As Steven suggested, the result is in a sense less chaotic, not more.  A chaos mode for caching would randomly cache or not.

Multiple comments above contain 'bug'.  Given that the language leaves implementations to cache certain immutables -- or not -- the bug in code meant to be implementation independent is to depend on caching *either way*.  Turning caching off only catches the 'bug' of assuming caching, not the bug of assuming no caching.

From a math viewpoint, n is n for all n, so 'is' *is* the proper comparison for ints.  From this viewpoint, caching should be the default, and not caching most values of n, and so having to use '==' instead of 'is', is the practical time-space tradeoff.

Like Raymond, I currently think that this proposal lacks sufficient justification.
History
Date | User | Action | Args
2022-04-11 14:59:06 | admin | set | github: 79048
2018-11-30 20:00:55 | terry.reedy | set | type: enhancement; messages: + msg330823; nosy: + terry.reedy
2018-10-04 22:26:33 | gregory.p.smith | set | messages: + msg327093
2018-10-04 18:20:46 | nascheme | set | nosy: + nascheme; messages: + msg327073
2018-10-02 14:01:12 | jwilk | set | nosy: + jwilk
2018-10-02 06:34:31 | rhettinger | set | nosy: + rhettinger; messages: + msg326860
2018-10-02 04:40:08 | ammar2 | set | messages: + msg326854
2018-10-02 04:38:28 | serhiy.storchaka | set | messages: + msg326853
2018-10-02 04:34:59 | ammar2 | set | messages: + msg326852
2018-10-02 04:31:56 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg326850
2018-10-02 01:07:58 | ammar2 | set | nosy: + ammar2; messages: + msg326843
2018-10-02 00:40:12 | steven.daprano | create |