Issue38147
Created on 2019-09-12 19:34 by sir-sigurd, last changed 2022-04-11 14:59 by admin.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| disasm.txt | sir-sigurd, 2019-09-19 15:27 | |||
| Messages (6) | |||
|---|---|---|---|
| msg352228 - (view) | Author: Sergey Fedoseev (sir-sigurd) * | Date: 2019-09-12 19:34 | |
GCC (along with Clang and ICC) has __builtin_unreachable() and MSVC has __assume() builtins, that can be used to optimize out unreachable conditions. So we could add macro like this: #ifdef Py_DEBUG # define Py_ASSUME(cond) (assert(cond)) #else # if defined(_MSC_VER) # define Py_ASSUME(cond) (__assume(cond)) # elif defined(__GNUC__) # define Py_ASSUME(cond) (cond? (void)0: __builtin_unreachable()) # else # define Py_ASSUME(cond) ((void)0); # endif #endif Here's a pair of really simple examples showing how it can optimize code: https://godbolt.org/z/g9LYXF. Real world example. _PyLong_Copy() [1] calls _PyLong_New() [2]. _PyLong_New() checks the size, so that overflow does not occur. This check is redundant when _PyLong_New() is called from _PyLong_Copy(). We could add a function that bypass that check, but in LTO build PyObject_MALLOC() is inlined into _PyLong_New() and it also checks the size. Adding Py_ASSUME((size_t)size <= MAX_LONG_DIGITS) allows to bypass both checks. [1] https://github.com/python/cpython/blob/3a4f66707e824ef3a8384827590ebaa6ca463dc0/Objects/longobject.c#L287-L309 [2] https://github.com/python/cpython/blob/3a4f66707e824ef3a8384827590ebaa6ca463dc0/Objects/longobject.c#L264-L283 |
|||
| msg352792 - (view) | Author: STINNER Victor (vstinner) * | Date: 2019-09-19 13:43 | |
> Real world example. _PyLong_Copy() [1] calls _PyLong_New() [2]. _PyLong_New() checks the size, so that overflow does not occur. This check is redundant when _PyLong_New() is called from _PyLong_Copy(). We could add a function that bypass that check, but in LTO build PyObject_MALLOC() is inlined into _PyLong_New() and it also checks the size. Adding Py_ASSUME((size_t)size <= MAX_LONG_DIGITS) allows to bypass both checks.
This sounds like a bad usage of __builtin_unreachable().
_PyLong_New() must always check that size <= MAX_LONG_DIGITS, the check must not be optimized by the compiler.
__builtin_unreachable() must only be used if the code really be reached by design.
For example:
if (...) { Py_FatalError("oops)"; __builtin_unreachable() }
But it's a bad example, since Py_FatalError is decorated with the "noreturn" attribute, so the compiler should already know that Py_FatalError() never returns.
|
|||
| msg352793 - (view) | Author: STINNER Victor (vstinner) * | Date: 2019-09-19 13:44 | |
> This check is redundant when _PyLong_New() is called from _PyLong_Copy(). If you care of _PyLong_Copy() performance, you should somehow manually inline _PyLong_New() inside _PyLong_Copy(). |
|||
| msg352800 - (view) | Author: Sergey Fedoseev (sir-sigurd) * | Date: 2019-09-19 15:27 | |
> If you care of _PyLong_Copy() performance, you should somehow manually inline _PyLong_New() inside _PyLong_Copy(). It doesn't solve this: > We could add a function that bypass that check, but in LTO build PyObject_MALLOC() is inlined into _PyLong_New() and it also checks the size. Adding Py_ASSUME((size_t)size <= MAX_LONG_DIGITS) allows to bypass both checks. Here's example: https://github.com/sir-sigurd/cpython/commit/c8699d0c614a18d558216ae7d432107147c95c28. I attach some disassembly from this example compiled with LTO, to demonstrate how the proposed macro affects generated code. |
|||
| msg352803 - (view) | Author: STINNER Victor (vstinner) * | Date: 2019-09-19 15:50 | |
> Here's example: https://github.com/sir-sigurd/cpython/commit/c8699d0c614a18d558216ae7d432107147c95c28. "_Py_ASSUME((size_t)size <= MAX_LONG_DIGITS);" Typically, such code use assert() and is removed for release build. assert() is more for contract base programming: when the error "cannot" happen at runtime (it would be a programming error). For other cases, I prefer to always emit code to handle the error (the error can happen, for example, the function must check inputs), even in release build. |
|||
| msg353027 - (view) | Author: STINNER Victor (vstinner) * | Date: 2019-09-23 14:57 | |
See also bpo-38249. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:20 | admin | set | github: 82328 |
| 2019-09-23 14:57:12 | vstinner | set | messages: + msg353027 |
| 2019-09-19 15:51:35 | vstinner | set | title: add macro for __builtin_unreachable -> add Py_ASSUME() macro for __builtin_unreachable() |
| 2019-09-19 15:50:37 | vstinner | set | messages: + msg352803 |
| 2019-09-19 15:27:43 | sir-sigurd | set | files:
+ disasm.txt messages: + msg352800 |
| 2019-09-19 13:44:07 | vstinner | set | messages: + msg352793 |
| 2019-09-19 13:43:13 | vstinner | set | nosy:
+ vstinner messages: + msg352792 |
| 2019-09-12 19:34:23 | sir-sigurd | create | |