Support SSL Key Rotation in HTTP Server #13495
Signed-off-by: Keyun Tong <tongkeyun@gmail.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
@russellb @robertgshaw2-redhat could you please take a look? Unfortunately I have little background on this.
@@ -36,3 +36,4 @@ einops # Required for Qwen2-VL.
compressed-tensors == 0.9.1 # required for compressed-tensors
depyf==0.18.0 # required for profiling and debugging with compilation config
cloudpickle # allows pickling lambda functions in model_executor/models/registry.py
watchfiles # required for http server to monitor the updates of TLS files
do we want to pin a version?
The API being used is pretty standard, it should not be very sensitive to specific versions.
I would say the latest version is preferred here.
Sure.
What are some other examples of services that do dynamic reloading of configuration like this? I would normally expect a configuration rollout to include restarting services. Running in an environment like Kubernetes makes this fairly straightforward.
I think it's going to be more complicated to account for all possible cases. What if the files are just deleted? Should the service keep running with what it had previously loaded? Should it exit with an error?
My preference would be to leave it as-is, and leave it to the administrator (or service automation) to decide when the service should restart with new configuration.
@russellb For example, a thrift server also supports SSL key/cert rotation: https://github.com/facebook/fbthrift/blob/a3b88c21b4bf382d506922c2d874b21a7c06b821/thrift/lib/cpp2/server/ThriftServer.cpp#L1876-L1881
Thanks for the example. I've thought about this some more and I'm still not really comfortable with the feature. Automatic reloading based on files changing strikes me as very surprising behavior.
An alternative that some services use is to allow you to send them a SIGHUP signal to reload their configuration. This would typically be hidden behind something like systemd, so to an administrator it's […] Whether it was via […]
Another reason I'm more on the side of keeping this simple is that I don't expect the built-in SSL to be the production SSL endpoint in most cases. When running in Kubernetes, I'd expect a load balancer serving as ingress into the cluster would terminate SSL.
In other words, I'd prefer to keep this simple and defer the more complex and dynamic configuration management to systems outside of vllm.
I want to clarify one more thing. You should not interpret my comments as a rejection of the PR! I'm not a maintainer and don't have that authority. I'm just stating my gut reaction to the feature and certainly don't mind if the consensus after maintainer review goes toward accepting it!
@russellb thanks for reviewing and sharing your opinion. The file monitoring is limited to SSL key rotation at the moment. Generally, I feel people shouldn't confuse the expectations for model file loading with those for SSL key rotation. (Do we need to state this more clearly in the documentation?)
About relying on […]
cc: @WoosukKwon @simon-mo to hear your suggestions.
vllm/entrypoints/launcher.py
Outdated
    watch_ssl_cert_task = None
    if config.ssl_keyfile and config.ssl_certfile:
        watch_ssl_cert_task = loop.create_task(
            watch_files([config.ssl_keyfile, config.ssl_certfile],
                        update_ssl_cert_chain))

    watch_ssl_ca_task = None
    if config.ssl_ca_certs:
        watch_ssl_ca_task = loop.create_task(
            watch_files([config.ssl_ca_certs], update_ssl_ca))
I also feel explicitly sending SIGHUP has clearer semantics if a reverse proxy is not an option.
In addition, since SSL rotation is irrelevant to most users, I think we should isolate these changes in a dedicated ssl.py module instead of putting them directly in the top-level serve_http entry point.
I also imagine more complexity may be needed to handle edge cases in production.
IIUC, SIGHUP isn't how TW does the key rotation, which we could consider in the long term.
Good point on feature isolation, I can gate the feature and decouple it into a dedicated file.
What is TW?
It's an infrastructure we are trying to integrate with, where we don't have much control over how SSL files are delivered.
Just to be clear, I don't suggest that systemd should be required for this. On the vLLM side, it's handling the SIGHUP signal. There are different ways to send the signal to trigger a reload and […]
This is scoped well enough to handle common deployment scenarios. I do agree SIGHUP might be a better option if designed from scratch, but this PR offers isolated functionality for certain integrations.
I think there's at least a small race condition here, where either the server cert OR the CA cert will be updated, but not both (but both need to be updated). One or more connections could get handled in between updating each file.
It's a trivial change to use SIGHUP, FWIW.
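For illustration only, here is a minimal sketch of what a SIGHUP-triggered reload could look like. It is not code from this PR. `install_sighup_reload`, `reload_cert_chain`, and `reload_ca` are hypothetical names; the two callbacks are assumed to be synchronous wrappers around `ssl.SSLContext.load_cert_chain()` and `ssl.SSLContext.load_verify_locations()`. The sketch also assumes connections are handled on the same asyncio event loop, so that running both reloads in one callback, with no `await` in between, avoids the window where only one of the two has been updated.

```python
import asyncio
import os
import signal


def install_sighup_reload(loop, reload_cert_chain, reload_ca):
    """Register a SIGHUP handler that refreshes both TLS inputs together.

    Both callbacks are hypothetical and expected to be synchronous.
    """

    def _on_sighup():
        # No await between the two calls, so no other callback on this
        # event loop can run between the two updates.
        reload_cert_chain()
        reload_ca()

    # Unix only: add_signal_handler is not implemented on Windows.
    loop.add_signal_handler(signal.SIGHUP, _on_sighup)


async def demo():
    calls = []
    loop = asyncio.get_running_loop()
    install_sighup_reload(
        loop,
        lambda: calls.append("cert_chain"),
        lambda: calls.append("ca"),
    )
    os.kill(os.getpid(), signal.SIGHUP)  # same effect as `kill -HUP <pid>`
    await asyncio.sleep(0.1)  # give the loop a chance to run the handler
    loop.remove_signal_handler(signal.SIGHUP)
    return calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Whatever delivers the new files would then be responsible for sending the signal once both the certificate and the CA bundle are in place.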
@russellb Once we have a deployment environment that supports SIGHUP signaling when certs are updated, I think we can definitely extend the functionality here to support SIGHUP mode.
Signed-off-by: Louis Ulmer <ulmerlouis@gmail.com>
Some production setups require TLS key/cert rotation in the HTTP server. This change uses watchfiles to asynchronously monitor updates to the SSL key, cert, and CA files, and updates the SSLContext when changes are detected.
Test cmd
vllm serve /tmp/model -tp 1 --max_num_seqs 32 --ssl-keyfile ~/test_certs/server.key --ssl-certfile ~/test_certs/server.crt --ssl-ca-certs ~/test_certs/rootCA.crt --enable-ssl-refresh
touch ~/test_certs/server.key
Server output:
INFO 02-18 11:40:01 launcher.py:24] Watching files: ['/home/ktong/test_certs/server.key', '/home/ktong/test_certs/server.crt']
INFO 02-18 11:40:01 launcher.py:24] Watching files: ['/home/ktong/test_certs/rootCA.crt']
INFO: Application startup complete.
INFO 02-18 11:42:31 launcher.py:28] File change detected: modified - /home/ktong/test_certs/server.key
INFO 02-18 11:42:31 launcher.py:57] Reloading SSL certificate chain
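As a rough sketch of the watch-and-reload loop described above: the PR itself relies on watchfiles, but the version below swaps in plain `os.stat` polling so that it runs with the standard library alone. `watch_and_reload` and its `reload` callback are hypothetical names; in a real server the callback might wrap `ssl_context.load_cert_chain(certfile, keyfile)`. Skipping the reload while a file is missing is one possible answer to the "what if the files are deleted" question, not necessarily what the PR does.

```python
import asyncio
import os
import tempfile


def _mtime(path):
    """Return the file's modification time in ns, or None if it is missing."""
    try:
        return os.stat(path).st_mtime_ns
    except OSError:
        return None


async def watch_and_reload(paths, reload, interval=1.0):
    """Poll `paths`; call `reload()` once per pass in which any of them changed."""
    last = {p: _mtime(p) for p in paths}
    while True:
        await asyncio.sleep(interval)
        changed = False
        for p in paths:
            now = _mtime(p)
            if now is None:
                # File deleted or unreadable: do not trigger a reload.
                continue
            if now != last[p]:
                last[p] = now
                changed = True
        if changed:
            try:
                reload()
            except Exception as exc:
                # Assumption: a failed reload should not take the server down.
                print(f"TLS reload failed: {exc}")


async def demo():
    calls = []
    with tempfile.NamedTemporaryFile() as f:
        task = asyncio.create_task(
            watch_and_reload([f.name], lambda: calls.append("reload"), interval=0.05)
        )
        await asyncio.sleep(0.1)
        os.utime(f.name, ns=(1, 1))  # change the mtime, similar to `touch`
        await asyncio.sleep(0.2)
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass
    return calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Polling trades latency for simplicity; an event-based watcher such as watchfiles reacts sooner and avoids the periodic `stat` calls.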