[proxy] web.archive.org← back | site home | direct (HTTPS) ↗ | proxy home | ◑ dark◐ light

Issue 25761: Improve unpickling errors handling

Created on 2015-11-29 10:59 by serhiy.storchaka, last changed 2016-09-08 08:07 by serhiy.storchaka. This issue is now closed.

Messages (7) msg255567 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2015-11-29 10:59
This issue is for better detecting and reporting errors in broken pickle data.
msg255569 - (view) Author: Roundup Robot (python-dev) Date: 2015-11-29 11:14
New changeset d47e5b162072 by Serhiy Storchaka in branch '3.4':
Issue #25761: Added more test cases for testing unpickling broken data.
https://hg.python.org/cpython/rev/d47e5b162072

New changeset c7e7d77ef8bf by Serhiy Storchaka in branch '2.7':
Issue #25761: Added more test cases for testing unpickling broken data.
https://hg.python.org/cpython/rev/c7e7d77ef8bf

New changeset 4897438543da by Serhiy Storchaka in branch '3.5':
Issue #25761: Added more test cases for testing unpickling broken data.
https://hg.python.org/cpython/rev/4897438543da

New changeset c852c7d8d681 by Serhiy Storchaka in branch 'default':
Issue #25761: Added more test cases for testing unpickling broken data.
https://hg.python.org/cpython/rev/c852c7d8d681
msg255571 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2015-11-29 12:19
Proposed patch allows unpickler to detect errors related to reading a stack behind a mark.

For now Python implementation just pops a sentinel used as a mark. This can cause TypeError, AttributeError or UnpicklingError besides IndexError:

>>> pickle._loads(b'}(NNs.')
Traceback (most recent call last):
  ...
TypeError: 'object' object does not support item assignment
>>> pickle._loads(b'](Na.')
Traceback (most recent call last):
  ...
AttributeError: 'object' object has no attribute 'append'

Or can silently expose the mark object:

>>> pickle._loads(b')(.')
<object object at 0xb71084b0>
>>> pickle._loads(b']](a.')
[<object object at 0xb71084d8>]

C implementation just ignores incorrect mark:

>>> pickle.loads(b'}(NNs.')
{None: None}
>>> pickle.loads(b'](Na.')
[None]
>>> pickle.loads(b')(.')
()
>>> pickle.loads(b']](a.')
[[]]

But in case of complex data this can cause errors later.

With the patch C implementation always raises UnpicklingError with relevant message and Python implementation always raises IndexError.

>>> pickle.loads(b'}(NNs.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'pickle' is not defined
>>> import pickle, pickletools
>>> pickle.loads(b'}(NNs.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.UnpicklingError: unexpected MARK found
>>> pickle.loads(b'](Na.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.UnpicklingError: unexpected MARK found
>>> pickle.loads(b')(.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.UnpicklingError: unexpected MARK found
>>> pickle.loads(b']](a.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.UnpicklingError: unexpected MARK found
>>> pickle._loads(b'}(NNs.')
Traceback (most recent call last):
  ...
IndexError: list index out of range
>>> pickle._loads(b'](Na.')
Traceback (most recent call last):
  ...
IndexError: list index out of range
>>> pickle._loads(b')(.')
Traceback (most recent call last):
  ...
IndexError: pop from empty list
>>> pickle._loads(b']](a.')
Traceback (most recent call last):
  ...
IndexError: pop from empty list
msg256028 - (view) Author: Roundup Robot (python-dev) Date: 2015-12-06 20:02
New changeset 5c670af0100f by Serhiy Storchaka in branch 'default':
Issue #25761: Improved detecting errors in broken pickle data.
https://hg.python.org/cpython/rev/5c670af0100f
msg256032 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * Date: 2015-12-06 21:36
When pickle stream is unexpectedly ended, different exceptions can be raised. EOFError("Ran out of input") is raised when the stream is unexpectedly ended without the STOP opcode. But it can be raised also when the data for the opcode is incomplete. Other possible exceptions are UnpicklingError, AttributeError and ValueError.

Examples:

>>> pickle.loads(b'L')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.UnpicklingError: pickle data was truncated
>>> pickle.loads(b'L10')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
EOFError: Ran out of input
>>> pickle.loads(b'L10L')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '10L'
>>> pickle.loads(b"S'abc'")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.UnpicklingError: the STRING opcode argument must be quoted
>>> pickle.loads(b'(cbuiltins\nlist')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: Can't get attribute 'lis' on <module 'builtins' (built-in)>

Following patch makes C implementation of unpickler always raise UnpicklingError("pickle data was truncated") if the data for the opcode is truncated (above examples). EOFError("Ran out of input") is raised when the stream is unexpectedly ended without the STOP opcode, as before.

I'm not sure, may be always raise UnpicklingError or EOFError? Or change error message?
msg256055 - (view) Author: Roundup Robot (python-dev) Date: 2015-12-07 09:32
New changeset 001514146c21 by Serhiy Storchaka in branch 'default':
Issue #25761: Fixed reference leak added in previous changeset (5c670af0100f).
https://hg.python.org/cpython/rev/001514146c21
msg274626 - (view) Author: Roundup Robot (python-dev) Date: 2016-09-06 20:55
New changeset 231f578dfd3d by Serhiy Storchaka in branch 'default':
Issue #25761: Improved error reporting about truncated pickle data in
https://hg.python.org/cpython/rev/231f578dfd3d
History Date User Action Args 2016-09-08 08:07:23serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved 2016-09-06 20:55:32python-devsetmessages: + msg274626 2015-12-07 09:32:41python-devsetmessages: + msg256055 2015-12-06 21:36:37serhiy.storchakasetfiles: + unpickling_eof_errors.patch

messages: + msg256032

2015-12-06 20:02:15python-devsetmessages: + msg256028 2015-11-29 19:09:45serhiy.storchakasetnosy: + pitrou, alexandre.vassalotti
2015-11-29 12:19:34serhiy.storchakasetfiles: + unpickling_mark_errors.patch
versions: + Python 3.6
messages: + msg255571

keywords: + patch
stage: patch review

2015-11-29 11:14:44python-devsetnosy: + python-dev
messages: + msg255569
2015-11-29 10:59:25serhiy.storchakacreate