Avoid reloading root certificates to improve concurrent performance #6667
…hen `verify=True`
We've been hit by the slowness of
Is there anything I can do to advance this PR? |
Given this breaks the behavior of the module for a whole class of users as it's written today, there's not much to do to advance it. Even still, I'm pretty sure SSLContext is not itself thread safe, but I need to find a reference for that, so loading it at the module level will likely cause issues |
Here's your quote: python/cpython#95031 (comment)
Can't speak for the reliability of that quote though. Given that a lot of time is spent in |
Here is a more reliable reference from the openssl guys themselves about EDIT: Here are the official docs for openssl. Unfortunately the python SSLContext docs don't mention thread safety at all. Found this in the Python 3.13.0a5 release docs:
Asked for official clarification on the thread-safety of SSLContext here. |
If you don't trust me, then please trust David Benjamin's statement on threading. He is the main lead behind BoringSSL and did a lot of TLS stuff in Chrome browser. |
@tiran Thank you very much for your detailed commentary!
@sigmavirus24 What exactly are you unhappy about with the current solution and what would an acceptable solution for you look like? Would you prefer something like introducing a new "ssl_context" parameter somehow that could be set by the library user and be set to something like |
See #2118 I recommend that you either use urllib3 directly or switch to httpx. Most of the secret sauce of requests is in urllib3. httpx has HTTP/2 support. |
Requests supports being run (with certifi) inside a zip file created by a tool like pants, pyinstaller, etc. The code here removes that support in loading the trust stores. |
Ah, I see that the PR was updated and moved the extraction. I think the last blocker is the context being "public" in how it is named. That will encourage people to modify it in a way we don't want to be supporting. People will attempt to modify it anyway but when things go wrong, it will be clear that they weren't intended to modify it. To be clear, I expect people to use this to work around libraries that aren't written correctly that leverage requests but do not allow people to specify a trust store or customize a session. Either way, if we rename the default I'm in favor of merging this. Unless @nateprewitt has objections. (And yes, the caveats around changing ciphers or loading other trust stores are what I recalled being a problem that has odd behavior, but even so, making this private will alleviate that likely spike in issues.) |
Thanks for the follow-up @sigmavirus24. I renamed the default context as requested, please let me know if you'd like any further changes. |
Just want a second set of eyes from @nateprewitt or @sethmlarson
Cool, this seems reasonable to me. I was curious if we'd be better off doing this per-Adapter instance instead of globally but it seems like that may not be a concern with Christian's response.
Thank you so much! This made a big improvement for my use case, which is load generation (https://github.com/locustio/locust), so lots of concurrent requests.Sessions. Especially on Windows, where Python 3.12 was completely broken for my users... Any chance 2.32 will be upon us soon? If not, a pre-release version would also be very helpful. |
We don't have solidified dates for 2.32.0 but I would wager it's sometime in mid-late June. As for a pre-release, time permitting I'll take a look at it but can't guarantee anything currently. |
@cyberw I remember doing some profiling on Windows as well and finding out that it was particularly infamous there, because the certificates were being reloaded not one, but two or three times per request. I don't recall the specific reason since I ran most of the experiments on Linux afterwards, but I think it had to do with the CA store being a folder instead of a bundle file. In any case, hopefully this mitigates the issue now that it's merged. If not, feel free to ping me and I'll try to follow up with something more specific for Windows. |
On Windows (3.12.2) the example (modified to run against github.com) takes ~1.5s without the fix and ~0.2s with it (which is actually faster than the verify=False version, at ~0.5s. Odd.) |
This patch has the side effect that it is no longer possible to pass in a custom ssl_context via the PoolManager.
This worked in 2.31.0 and stopped working, because custom 'ssl_context' is overwritten by default 'ssl_context' with this merge request. I think it would be good to support a custom ssl_context again, since otherwise it does not seem to be possible to change the seclevel settings of python requests anymore and that is required if you are working with legacy code and old certificates that can't easily be renewed. |
thanks, @mm-matthias i am using python 3.11.7. will do more testing with other versions |
…tation (#118597) Add thread-safety clarifications to the SSLContext documentation. Per the issue: This issue has also come up [here](psf/requests#6667) where the matter was clarified by @tiran in [this comment](psf/requests#6667): > `SSLContext` is designed to be shared and used for multiple connections. It is thread safe as long as you don't reconfigure it once it is used by a connection. Adding new certs to the internal trust store is fine, but changing ciphers, verification settings, or mTLS certs can lead to surprising behavior. The problem is unrelated to threads and can even occur in a single-threaded program.
…ocumentation (pythonGH-118597) Add thread-safety clarifications to the SSLContext documentation. Per the issue: This issue has also come up [here](psf/requests#6667) where the matter was clarified by @tiran in [this comment](psf/requests#6667): > `SSLContext` is designed to be shared and used for multiple connections. It is thread safe as long as you don't reconfigure it once it is used by a connection. Adding new certs to the internal trust store is fine, but changing ciphers, verification settings, or mTLS certs can lead to surprising behavior. The problem is unrelated to threads and can even occur in a single-threaded program. (cherry picked from commit 4f59f86) Co-authored-by: mm-matthias <[email protected]>
…documentation (GH-118597) (#120596) gh-118596: Add thread-safety clarifications to the SSLContext documentation (GH-118597) Add thread-safety clarifications to the SSLContext documentation. Per the issue: This issue has also come up [here](psf/requests#6667) where the matter was clarified by @tiran in [this comment](psf/requests#6667): > `SSLContext` is designed to be shared and used for multiple connections. It is thread safe as long as you don't reconfigure it once it is used by a connection. Adding new certs to the internal trust store is fine, but changing ciphers, verification settings, or mTLS certs can lead to surprising behavior. The problem is unrelated to threads and can even occur in a single-threaded program. (cherry picked from commit 4f59f86) Co-authored-by: mm-matthias <[email protected]>
…documentation (GH-118597) (#120596) gh-118596: Add thread-safety clarifications to the SSLContext documentation (GH-118597) Add thread-safety clarifications to the SSLContext documentation. Per the issue: This issue has also come up [here](psf/requests#6667) where the matter was clarified by @tiran in [this comment](psf/requests#6667): > `SSLContext` is designed to be shared and used for multiple connections. It is thread safe as long as you don't reconfigure it once it is used by a connection. Adding new certs to the internal trust store is fine, but changing ciphers, verification settings, or mTLS certs can lead to surprising behavior. The problem is unrelated to threads and can even occur in a single-threaded program. (cherry picked from commit 4f59f8638267aa64ad2daa0111d8b7fdc2499834) Co-authored-by: mm-matthias <[email protected]> CPython-sync-commit-latest: 7abfc92c8bb6dd75b7a82f6fadd919af6522406d
…documentation (GH-118597) (#120595) gh-118596: Add thread-safety clarifications to the SSLContext documentation (GH-118597) Add thread-safety clarifications to the SSLContext documentation. Per the issue: This issue has also come up [here](psf/requests#6667) where the matter was clarified by @tiran in [this comment](psf/requests#6667): > `SSLContext` is designed to be shared and used for multiple connections. It is thread safe as long as you don't reconfigure it once it is used by a connection. Adding new certs to the internal trust store is fine, but changing ciphers, verification settings, or mTLS certs can lead to surprising behavior. The problem is unrelated to threads and can even occur in a single-threaded program. (cherry picked from commit 4f59f86) Co-authored-by: mm-matthias <[email protected]>
Hi @mm-matthias I tried the latest cpython 3.12.4 and saw the same problem. So I wrote the Python script below to simulate my code in production. This script reproduces the problem quickly with requests 2.32.3, and there is no such problem with requests 2.31.0.
```python
import concurrent.futures
import random
import uuid
from threading import Thread
from time import time

import requests


def do_request():
    start = time()
    random_id = uuid.uuid4()
    delay = random.randint(1, 5)
    print("start {} delay {} seconds".format(random_id, delay))
    # Two endpoints with artificial delays; only the second one uses a client cert.
    endpoints = []
    endpoints.append('https://httpbin.org/delay/' + str(delay))
    delay = str(random.randint(1, 5)) + 's'
    endpoints.append('https://run.mocky.io/v3/0432e9f0-674f-45bd-9c18-628b861c2258?mocky-delay=' + str(delay))
    random.shuffle(endpoints)
    response = None
    for endpoint in endpoints:
        try:
            print("start {} delay {} seconds".format(random_id, endpoint))
            if 'run' in endpoint:
                # mTLS request: client certificate and key as a (cert, key) tuple
                cert = './client.crt', './client.key'
                response = requests.get(endpoint, timeout=random.randint(1, 5), cert=cert)
            else:
                response = requests.get(endpoint, timeout=random.randint(1, 5))
        except Exception as e:
            print(e)
    end = time()
    print("finished {} in {} seconds".format(random_id, end - start))
    return response


def measure():
    cnt = 20
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        futures = []
        for server in range(1, cnt):
            futures.append(executor.submit(do_request))
        # Wait for every submitted request to finish.
        for future in concurrent.futures.as_completed(futures):
            pass


# Repeat the measurement from 5 threads at a time, each with its own pool.
for i in range(1, 500):
    threads = [Thread(target=measure, args=()) for _ in range(5)]
    for t in threads: t.start()
    for t in threads: t.join()
```
@mingshuang I've run your script, but without the |
hi @mm-matthias sorry forgot to attach the certificate. the certificate can be generated with command |
…ocumentation (python#118597) Add thread-safety clarifications to the SSLContext documentation. Per the issue: This issue has also come up [here](psf/requests#6667) where the matter was clarified by @tiran in [this comment](psf/requests#6667): > `SSLContext` is designed to be shared and used for multiple connections. It is thread safe as long as you don't reconfigure it once it is used by a connection. Adding new certs to the internal trust store is fine, but changing ciphers, verification settings, or mTLS certs can lead to surprising behavior. The problem is unrelated to threads and can even occur in a single-threaded program.
Run nodes debug merge
Reproducing the problem
Let's consider the following script. It runs a bunch of concurrent requests against a URL, both with certificate verification enabled and disabled, and outputs the time it takes to do it in both cases.
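A minimal sketch of such a benchmark, not the script originally attached to this description, could look like the block below. It assumes the third-party `requests` package is installed, uses https://example.com as a placeholder URL, and picks 30 requests over 10 worker threads as illustrative values. `fetch_with_requests`, `time_concurrent` and `main` are hypothetical helper names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_with_requests(url, verify):
    """Issue one GET with the third-party `requests` package (assumed installed)."""
    import requests  # imported lazily so the timing helper has no hard dependency

    return requests.get(url, verify=verify, timeout=10).status_code


def time_concurrent(fetch, url, verify, n_requests=30, workers=10):
    """Run `n_requests` calls to `fetch` on a thread pool and time the whole batch."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fetch, url, verify) for _ in range(n_requests)]
        results = [future.result() for future in futures]
    return time.perf_counter() - start, results


def main(url="https://example.com"):  # placeholder target, not taken from the PR
    """Print the batch time with certificate verification on and off."""
    for verify in (True, False):
        elapsed, _ = time_concurrent(fetch_with_requests, url, verify)
        print(f"verify={verify}: {elapsed:.2f}s")
```

Calling `main()` prints one timing line per `verify` setting. Absolute numbers will vary with the network, the Python version and the OpenSSL build, so only the relative gap between the two lines is meaningful.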
What's the time difference between the two? It turns out it is highly dependent on your local configuration. On my local machine, with a relatively modern config (Python 3.12 + OpenSSL 3.0.2), the times are `~1.2s` for `verify=True` and `~0.5s` for `verify=False`.
It's a >100% difference, but we initially blamed it on cert verification taking some time. However, we observed even larger differences (>500%) in some environments, and decided to find out what was going on.
Problem description
Our main use case for requests is running lots of requests concurrently, and we spent some time bisecting this oddity to see if there was room for a performance optimization.
The issue is a bit more clear if you profile the concurrent executions. When verifying certs, these are the top 3 function calls by time spent in them:
Conversely, this is what the top 3 looks like without cert verification:
In the first case, a full 0.68 seconds are spent in the `load_verify_locations()` function of the `ssl` module, which configures a `SSLContext` object to use a set of CA certificates for validation. Inside it, there is a C FFI call to OpenSSL's `SSL_CTX_load_verify_locations()` which is known to be quite slow. This happens once per request (hence the `30` on the left).
We believe that, in some cases, there is even some blocking going on, either because each FFI call locks up the GIL or because of some thread safety mechanisms in OpenSSL itself. We also think that this is more or less pronounced depending on internal changes between OpenSSL's versions, hence the variability between environments.
When cert validation isn't needed, these calls are skipped which speeds up concurrent performance dramatically.
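A "top 3 by time spent" listing of this kind can be gathered with the standard library's `cProfile` and `pstats`. The helper below is only a sketch: `top_calls` is a hypothetical name, and it is not necessarily the tool used for the numbers quoted above. Note that `cProfile` records only the thread it runs in, so work done inside thread pool workers has to be profiled from within those workers or with a thread-aware profiler.

```python
import cProfile
import io
import pstats


def top_calls(func, *args, limit=3, **kwargs):
    """Profile one call to `func` and return a report of the `limit` most
    expensive functions, ordered by time spent inside each function itself."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # SortKey.TIME orders by "tottime", i.e. excluding time spent in callees.
    stats.sort_stats(pstats.SortKey.TIME).print_stats(limit)
    return buffer.getvalue()
```

For example, `print(top_calls(ssl.create_default_context))` (after `import ssl`) shows where the time goes when a single verifying context is built in the current thread.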
Submitted solution
It isn't possible to skip loading root CA certificates entirely, but it isn't necessary to do it on every request. More specifically, a call to `load_verify_locations()` happens when:
- A new `urllib3.connectionpool.HTTPSConnectionPool` is created.
- On connection, by urllib3's `ssl_wrap_socket()`, when the connection's `ca_certs` or `ca_cert_dir` attributes are set (see the relevant code).
The first case doesn't need to be addressed anymore after the latest addition of `_get_connection()`. Since it now passes down `pool_kwargs`, this allows urllib3 to use a cached pool with the same settings every time, instead of creating one per request.
The second one is addressed in this PR. If a verified connection is requested, `_urllib3_request_context()` already makes it so that a connection pool using a `SSLContext` with all relevant certificates loaded is always used. Hence, there is no need to trigger a call to `load_verify_locations()` again.
You can test against https://invalid.badssl.com to check that `verify=True` and `verify=False` still behave as expected and are now equally fast.
I'd like to mention that there have been a few changes in Requests since I started drafting this, and I'm not sure that setting `conn.ca_certs` or `conn.ca_certs = cert_loc` in `cert_verify()` is even still needed, since I think that the logic could be moved to `_urllib3_request_context()` and benefit from using a cached context in those cases too.
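The general idea of loading CA certificates once and reusing the resulting context can be illustrated with the standard library alone. This is a sketch of the technique, not the code in this PR: `shared_verified_context` and `context_for` are hypothetical names, and the system default trust store stands in for whatever CA bundle Requests would load.

```python
import functools
import ssl


@functools.lru_cache(maxsize=None)
def shared_verified_context():
    """Build a verifying SSLContext once; later calls return the same object.

    The context should be treated as read-only after first use: changing
    ciphers or verification settings on a shared context affects every user.
    """
    return ssl.create_default_context()  # loads the default CA certificates


def context_for(verify):
    """Return the shared context for verified connections, or a fresh
    non-verifying context that never touches the trust store."""
    if verify:
        return shared_verified_context()
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False  # must be disabled before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

With this shape, every verified connection shares one context and the certificate loading cost is paid a single time per process, while unverified connections skip it entirely.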