As in the Python 3 series, the backported ssl.create_default_context()
API is granted a backwards compatibility exemption that permits the
protocol, options, cipher and other settings of the created SSL context to
be updated in maintenance releases to use higher default security settings.
This allows them to appropriately balance compatibility and security at the
time of the maintenance release, rather than at the time of the original
feature release.
This PEP does not grant any other exemptions to the usual backwards
compatibility policy for maintenance releases. Instead, by explicitly
encouraging the use of feature based checks, it is designed to make it easier
to write more secure cross-version compatible Python software, while still
limiting the risk of breaking currently working software when upgrading to
a new Python 2.7 maintenance release.
In all cases where this proposal allows new features to be backported to
the Python 2.7 release series, it is possible to write cross-version
compatible code that operates by “feature detection” (for example, checking
for particular attributes in a module), without needing to explicitly check
the Python version.
It is then up to library and framework code to provide an appropriate warning
and fallback behaviour if a desired feature is found to be missing. While
some especially security sensitive software MAY fail outright if a desired
security feature is unavailable, most software SHOULD instead emit a warning
and continue operating using a slightly degraded security configuration.
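
A minimal sketch of this warn-and-fall-back pattern follows. The helper name make_client_context() is hypothetical (it is not defined by this PEP or the standard library), and returning None on the legacy path is just one illustrative way to signal the degraded configuration to the caller:

```python
import ssl
import warnings


def make_client_context():
    """Return a default SSL context, or None if only legacy support exists.

    Hypothetical helper for illustration; not part of the ssl module.
    """
    # Feature detection: check for the attribute, not the Python version.
    if hasattr(ssl, "create_default_context"):
        return ssl.create_default_context()
    # Degraded path: warn, then let the caller use the legacy ssl APIs.
    warnings.warn(
        "ssl.create_default_context() is unavailable; "
        "continuing with legacy SSL settings",
        RuntimeWarning,
    )
    return None


context = make_client_context()
```

Especially security sensitive software could raise an exception in the fallback branch instead of emitting a warning.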
The backported APIs allow library and application code to perform the
following actions after detecting the presence of a relevant
network security related feature:
- explicitly opt in to more secure settings (to allow the use of enhanced
security features in older maintenance releases of Python with less
secure default behaviour)
- explicitly opt in to less secure settings (to allow the use of newer Python
feature releases in lower security environments)
- determine the default setting for the feature (this MAY require explicit
Python version checks to determine the Python feature release, but DOES
NOT require checking for a specific maintenance release)
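
The three actions above might look roughly like the following sketch. The function names describe_defaults(), harden() and relax() are hypothetical, and the code assumes an SSLContext object has already been obtained via feature detection:

```python
import ssl


def describe_defaults(context):
    """Report the settings a freshly created context started with."""
    return {
        "verify_mode": context.verify_mode,
        # check_hostname is itself feature-detected; None means "absent".
        "check_hostname": getattr(context, "check_hostname", None),
    }


def harden(context):
    """Explicitly opt in to certificate and hostname verification."""
    # Set verify_mode first: hostname checking requires verification.
    context.verify_mode = ssl.CERT_REQUIRED
    if hasattr(context, "check_hostname"):
        context.check_hostname = True
    return context


def relax(context):
    """Explicitly opt in to weaker settings for a low security environment."""
    # Disable hostname checking before dropping certificate verification.
    if hasattr(context, "check_hostname"):
        context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context
```

None of these helpers inspects sys.version_info; each one only checks whether the relevant attribute is present.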
Security related changes to other modules (such as higher level networking
libraries and data format processing libraries) will continue to be made
available as backports and new modules on the Python Package Index, as
independent distribution remains the preferred approach to handling
software that must continue to evolve to handle changing development
requirements independently of the Python 2 standard library. Refer to
the Motivation and Rationale section for a review of the characteristics
that make the secure networking infrastructure worthy of special
consideration.
Under this proposal, OpenSSL may be upgraded to more recent feature releases
in Python 2.7 maintenance releases. On Linux and most other POSIX systems,
the specific version of OpenSSL used already varies, as CPython dynamically
links to the system provided OpenSSL library by default.
For the Windows binary installers, the _ssl and _hashlib modules are
statically linked with OpenSSL and the associated symbols are not exported.
Marc-Andre Lemburg indicates that updating to newer OpenSSL releases in the
egenix-pyopenssl binaries has not resulted in any reported compatibility
issues [3].
The Mac OS X binary installers historically followed the same policy as
other POSIX installations and dynamically linked to the Apple provided
OpenSSL libraries. However, Apple has now ceased updating these
cross-platform libraries, instead requiring that even cross-platform
developers adopt Mac OS X specific interfaces to access up to date security
infrastructure on their platform. Accordingly, and independently of this
PEP, the Mac OS X binary installers were already going to be switched to
statically link newer versions of OpenSSL [4].
The creation of this PEP was prompted primarily by the aging SSL support in
the Python 2 series. As of March 2014, the Python 2.7 SSL module is
approaching four years of age, and the SSL support in the still popular
Python 2.6 release had its feature set locked six years ago.
These are simply too old to provide a foundation that can be recommended
in good conscience for secure networking software that operates over the
public internet, especially in an era where it is becoming quite clearly
evident that advanced persistent security threats are even more widespread
and more indiscriminate in their targeting than had previously been
understood. While they represented reasonable security infrastructure in
their time, the state of the art has moved on, and we need to investigate
mechanisms for effectively providing more up to date network security
infrastructure for users that, for whatever reason, are not currently in
a position to migrate to Python 3.
While the use of the system OpenSSL installation addresses many of these
concerns on Linux platforms, it doesn’t address all of them (in particular,
it is still difficult for software to explicitly require some higher level
security settings). The standard library support can be bypassed by using a
third party library like PyOpenSSL or Pycurl, but this still results in a
security problem, as these can be difficult dependencies to deploy, and many
users will remain unaware that they might want them. Rather than explaining
to potentially naive users how to obtain and use these libraries, it seems
better to just fix the included batteries.
In the case of the binary installers for Windows and Mac OS X that are
published on python.org, the version of OpenSSL used is entirely within
the control of the Python core development team, but is currently limited
to OpenSSL maintenance releases for the version initially shipped with the
corresponding Python feature release.
With increased popularity comes increased responsibility, and this proposal
aims to acknowledge the fact that Python’s popularity and adoption are at a
sufficiently high level that some of our design and policy decisions have
significant implications beyond the Python development community.
As one example, the Python 2 ssl module does not support the Server
Name Indication standard. While it is possible to obtain SNI support
by using the third party requests client library, actually doing so
currently requires using not only requests and its embedded dependencies,
but also half a dozen or more additional libraries. The lack of support
in the Python 2 series thus serves as an impediment to making effective
use of SNI on servers, as Python 2 clients will frequently fail to handle
it correctly.
Another more critical example is the lack of SSL hostname matching in the
Python 2 standard library - it is currently necessary to rely on a third
party library, such as requests or backports.ssl_match_hostname to
obtain that functionality in Python 2.
The Python 2 series also remains more vulnerable to remote timing attacks
on security sensitive comparisons than the Python 3 series, as it lacks a
standard library equivalent to the timing attack resistant
hmac.compare_digest() function. While appropriate secure comparison
functions can be implemented in third party extensions, many users don’t
even consider the issue and use ordinary equality comparisons instead.
A standard library solution doesn’t automatically fix that problem, but
it does make the barrier to resolution much lower once the problem is
pointed out.
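
As a rough illustration, code that verifies a message signature could prefer hmac.compare_digest() when it is present and otherwise warn and degrade to ordinary equality. The names digests_match(), sign() and verify() are hypothetical, and the fallback branch is deliberately the less secure behaviour, not a constant-time replacement:

```python
import hashlib
import hmac
import warnings

if hasattr(hmac, "compare_digest"):
    # Timing attack resistant comparison is available.
    digests_match = hmac.compare_digest
else:
    def digests_match(a, b):
        # Degraded fallback: ordinary equality may leak timing information.
        warnings.warn(
            "hmac.compare_digest() is unavailable; "
            "using a comparison that is not timing attack resistant",
            RuntimeWarning,
        )
        return a == b


def sign(key, message):
    """Return the HMAC-SHA256 digest of message under key (both bytes)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, signature):
    """Check a signature without using == on the secret digest."""
    return digests_match(sign(key, message), signature)
```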
Python 2.7 represents the only long term maintenance release the core
development team has provided, and it is natural that there will be things
that worked over a historically shorter maintenance lifespan that don’t work
over this longer support period. In the specific case of the problem
described in this PEP, the simplest available solution is to acknowledge
that long term maintenance of network security related modules requires
the ability to add new features, even while retaining backwards compatibility
for existing interfaces.
For those familiar with it, it is worth comparing the approach described in
this PEP with Red Hat’s handling of its long term open source support
commitments: it isn’t the RHEL 6.0 release itself that receives 10 years
worth of support, but the overall RHEL 6 series. The individual RHEL 6.x
point releases within the series then receive a wide variety of new
features, including security enhancements, all while meeting strict
backwards compatibility guarantees for existing software. The proposal
covered in this PEP brings our approach to long term maintenance more into
line with this precedent - we retain our strict backwards compatibility
requirements, but make an exception to the restriction against adding new
features.
To date, downstream redistributors have respected our upstream policy of
“no new features in Python maintenance releases”. This PEP explicitly
accepts that a more nuanced policy is appropriate in the case of network
security related features, and the specific change it describes is
deliberately designed such that it is potentially suitable for Red Hat
Enterprise Linux and its downstream derivatives.
The key requirement for a feature to be considered for inclusion in this
proposal was that it must have security implications beyond the specific
application that is written in Python and the system that application is
running on. Thus the focus on network security protocols, password storage
and related cryptographic infrastructure - Python is a popular choice for
the development of web services and clients, and thus the capabilities of
widely used Python versions have implications for the security design of
other services that may themselves be using newer versions of Python or
other development languages, but need to interoperate with clients or
servers written using older versions of Python.
The intent behind this requirement was to minimise any impact that the
introduction of this policy may have on the stability and compatibility of
maintenance releases, while still addressing some key security concerns
relating to the particular aspects of Python 2.7. It would be thoroughly
counterproductive if end users became as cautious about updating to new
Python 2.7 maintenance releases as they are about updating to new feature
releases within the same release series.
The ssl module changes are included in this proposal to bring the
Python 2 series up to date with the past 4 years of evolution in network
security standards, and make it easier for those standards to be broadly
adopted in both servers and clients. Similarly the hash algorithm
availability indicators in hashlib are included to make it easier for
applications to detect and employ appropriate hash definitions across both
Python 2 and 3.
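
For instance, an application could select a hash by consulting the availability indicators instead of the Python version. This is a sketch only: pick_hash() and its preference order are hypothetical, and the fallback to the older hashlib.algorithms attribute assumes a Python 2.7 release that predates the backport:

```python
import hashlib


def pick_hash(preferred=("sha512", "sha256", "sha1")):
    """Return the first preferred algorithm name that is guaranteed available."""
    guaranteed = getattr(hashlib, "algorithms_guaranteed", None)
    if guaranteed is None:
        # Older Python 2.7 releases expose hashlib.algorithms instead.
        guaranteed = getattr(hashlib, "algorithms", ())
    for name in preferred:
        if name in guaranteed:
            return name
    raise ValueError("none of the preferred hash algorithms is available")


hasher = hashlib.new(pick_hash())
hasher.update(b"example data")
```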
The hmac.compare_digest() and hashlib.pbkdf2_hmac() functions are included to
help lower the barriers to secure password storage and checking in Python 2
server applications.
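
A minimal sketch of how a server application might combine the two functions, assuming both are available. The helper names, the 16 byte salt and the iteration count are illustrative assumptions rather than recommendations made by this PEP:

```python
import hashlib
import hmac
import os

ITERATIONS = 100000  # illustrative value only; tune for the deployment


def hash_password(password, salt=None):
    """Derive a key from a bytes password; returns (salt, derived_key)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per stored password
    derived = hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS)
    return salt, derived


def check_password(password, salt, expected):
    """Re-derive the key and compare it in a timing attack resistant way."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)
```

A caller would store the salt and derived key returned by hash_password() and later pass both to check_password() along with the submitted password.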
The os.urandom() change has been included in this proposal to further
encourage users to leave the task of providing high quality random numbers
for cryptographic use cases to operating system vendors. The use of
insufficiently random numbers has the potential to compromise any
cryptographic system, and operating system developers have more tools
available to address that problem adequately than the typical Python
application runtime.
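
In practice this means security sensitive values should be drawn from os.urandom() and not from an application level generator. The helper new_token() below is a hypothetical example, and the default token length is an assumption:

```python
import binascii
import os


def new_token(nbytes=32):
    """Return a hex encoded token built from operating system randomness."""
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")


session_token = new_token()
```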
This alternative represents the status quo. Unfortunately, it has proven
to be unworkable in practice, as the backwards compatibility implications
mean that this is a non-trivial migration process for large applications
and integration projects. While the tools for migration have evolved to
a point where it is possible to migrate even large applications
opportunistically and incrementally (rather than all at once) by updating
code to run in the large common subset of Python 2 and Python 3, using the
most recent technology often isn’t a priority in commercial environments.
Previously, this was considered an acceptable harm, as while it was an
unfortunate problem for the affected developers to have to face, it was
seen as an issue between them and their management chain to make the case
for infrastructure modernisation, and this case would become naturally
more compelling as the Python 3 series evolved.
However, now that we’re fully aware of the impact the limitations of the
Python 2 standard library may be having on the evolution of internet
security standards, I no longer believe that it is reasonable to expect
platform and application developers to resolve all of the latent defects
in an application’s Unicode correctness solely in order to gain access to
the network security enhancements already available in Python 3.
While Ubuntu (and to some extent Debian as well) is committed to porting all
default system services and scripts to Python 3, and to removing Python 2
from its default distribution images (but not from its archives), this is
a mammoth task and won’t be completed for the Ubuntu 14.04 LTS release
(at least for the desktop image - it may be achieved for the mobile and
server images).
Fedora has even more work to do to migrate, and it will take a non-trivial
amount of time to migrate the relevant infrastructure components. While
Red Hat are also actively working to make it easier for users to use more
recent versions of Python on our stable platforms, it’s going to take time
for those efforts to start having an impact on end users’ choice of version,
and any such changes also don’t benefit the core platform infrastructure
that runs in the integrated system Python by necessity.
The OpenStack migration to Python 3 is also still in its infancy, and even
though that’s a project with an extensive and relatively robust automated
test suite, it’s still large enough that it is going to take quite some time
to migrate fully to a Python 2/3 compatible code base.
And that’s just three of the highest profile open source projects that
make heavy use of Python. Given the likely existence of large amounts of
legacy code that lacks the kind of automated regression test suite needed
to help support a migration from Python 2 to Python 3, there are likely to
be many cases where reimplementation (perhaps even in Python 3) proves
easier than migration. The key point of this PEP is that those situations
affect more people than just the developers and users of the affected
application: the existence of clients and servers with outdated network
security infrastructure becomes something that developers of secure
networked services need to take into account as part of their security
design, and that’s a problem that inhibits the adoption of better security
standards.
As Terry Reedy noted, if we try to persist with the status quo, the likely
outcome is that commercial redistributors will attempt to do something
like this on behalf of their customers anyway, but in a potentially
inconsistent and ad hoc manner. By drawing the scope definition process
into the upstream project we are in a better position to influence the
approach taken to address the situation and to help ensure some consistency
across redistributors.
The problem is real, so something needs to change, and this PEP describes
my preferred approach to addressing the situation.
With sufficient corporate support, it likely would be possible to create
and release Python 2.8 (it’s highly unlikely such a project would garner
enough interest to be achievable with only volunteers). However, this
wouldn’t actually solve the problem, as the aim is to provide a relatively
low impact way to incorporate enhanced security features into integrated
products and deployments that make use of Python 2.
Upgrading to a new Python feature release would mean both more work for the
core development team and a more disruptive update that most potential end
users would likely just skip entirely.
Attempting to create a Python 2.8 release would also bring in suggestions
to backport many additional features from Python 3 (such as tracemalloc
and the improved coroutine support), making the migration from Python 2.7
to this hypothetical 2.8 release even riskier and more disruptive.
This is not a recommended approach, as it would involve substantial
additional work for a result that is actually less effective in achieving
the original aim (which is to eliminate the current widespread use of the
aging network security infrastructure in the Python 2 series).
Furthermore, while I can’t make any commitments to actually addressing
this issue on Red Hat platforms, I can categorically rule out the idea
of a Python 2.8 being of any use to me in even attempting to get it
addressed.
While this initially appears to be an attractive and easier to manage
approach, it actually suffers from several significant problems.
Firstly, this is complex, low level, cross-platform code that integrates
with the underlying operating system across a variety of POSIX platforms
(including Mac OS X) and Windows. The CPython BuildBot fleet is already set
up to handle continuous integration in that context, but most of the
freely available continuous integration services just offer Linux, and
perhaps paid access to Windows. Those services work reasonably well for
software that largely runs on the abstraction layers offered by Python and
other dynamic languages, as well as the more comprehensive abstraction
offered by the JVM, but won’t suffice for the kind of code involved here.
The OpenSSL dependency for the network security support also qualifies as
the kind of “complex binary dependency” that isn’t yet handled well by the
pip based software distribution ecosystem. Relying on a third party
binary dependency also creates potential compatibility problems for pip
when running on other interpreters like PyPy.
Another practical problem with the idea is the fact that pip itself
relies on the ssl support in the standard library (with some additional
support from a bundled copy of requests, which in turn bundles
backports.ssl_match_hostname), and hence would require any replacement
module to also be bundled within pip. This wouldn’t pose any
insurmountable difficulties (it’s just another dependency to vendor), but
it would mean yet another copy of OpenSSL to keep up to date.
This approach also has the same flaw as all other “improve security by
renaming things” approaches: they completely miss the users who most need
help, and raise significant barriers against being able to encourage users
to do the right thing when their infrastructure supports it (since
“use this other module” is a much higher impact change than “turn on this
higher security setting”). Deprecating the aging SSL infrastructure in the
standard library in favour of an external module would be even more user
hostile than accepting the slightly increased risk of regressions associated
with upgrading it in place.
Last, but certainly not least, this approach suffers from the same problem
as the idea of doing a Python 2.8 release: likely not solving the actual
problem. Commercial redistributors of Python are set up to redistribute
Python, and a pre-existing set of additional packages. Getting new
packages added to the pre-existing set can be done, but means approaching
each and every redistributor and asking them to update their
repackaging process accordingly. By contrast, the approach described in
this PEP would require redistributors to deliberately opt out of the
security enhancements by deliberately downgrading the provided network
security infrastructure, which most of them are unlikely to do.
Earlier versions of this PEP included the concept of a 2.7-legacy-ssl
branch that preserved the exact feature set of the Python 2.7.6 network
security infrastructure.
In my opinion, anyone that actually wants this is almost certainly making a
mistake, and if they insist they really do want it in their specific
situation, they’re welcome to either make it themselves or arrange for a
downstream redistributor to make it for them.
If they are made publicly available, any such rebuilds should be referred to
as “Python 2.7 with Legacy SSL” to clearly distinguish them from the official
Python 2.7 releases that include more up to date network security
infrastructure.
After the first Python 2.7 maintenance release that implements this PEP, it
would also be appropriate to refer to Python 2.7.6 and earlier releases as
“Python 2.7 with Legacy SSL”.
Earlier versions of this PEP suggested synchronising the hmac,
hashlib and ssl modules entirely with their Python 3 counterparts.
This approach proved too vague to build a compelling case for the exception,
and has thus been replaced by the current more explicit proposal.
Earlier versions of this PEP suggested a general policy change related to
future Python 3 enhancements that impact the general security of the
internet.
That approach created unnecessary uncertainty, so it has been simplified to
propose backporting a specific concrete set of changes. Future feature
backport proposals can refer back to this PEP as precedent, but it will
still be necessary to make a specific case for each feature addition to
the Python 2.7 long-term support release.