[Hardware][TPU] Add check for no additional graph compilation during runtime #14710
LGTM, thanks!
da62779 to 63a6bd8
Per offline discussion, use env var |
87b0e54 to 5c43c67
Looks clean thanks!
0be5272 to 360003c
Signed-off-by: Siyuan Liu <lsiyuan@google.com>
360003c to a5496de
LGTM thanks!
…runtime (vllm-project#14710) Signed-off-by: Siyuan Liu <lsiyuan@google.com>
…runtime (vllm-project#14710) Signed-off-by: Siyuan Liu <lsiyuan@google.com> Signed-off-by: Louis Ulmer <ulmerlouis@gmail.com>
Record the number of cached compilation graphs after warm-up. After each execution step, check that the number of cached compilation graphs has not changed.
Use the environment variable `VLLM_XLA_CHECK_RECOMPILATION`, with a default value of `0`, to enable/disable the recompilation assertion.
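
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `RecompilationGuard`, `check_enabled_from_env`, and the injected `get_num_graphs` callable are hypothetical names, and parsing the variable as an integer flag is an assumption. In a real TPU run the callable would be whatever the XLA runtime exposes for counting cached compilation graphs.

```python
import os
from typing import Callable, Optional


def check_enabled_from_env() -> bool:
    # Assumption: the variable is an integer flag; "0" (the default) disables the check.
    return bool(int(os.environ.get("VLLM_XLA_CHECK_RECOMPILATION", "0")))


class RecompilationGuard:
    """Hypothetical helper: asserts the cached-graph count is unchanged after warm-up."""

    def __init__(self, get_num_graphs: Callable[[], int], enabled: bool) -> None:
        # get_num_graphs is injected so the sketch does not depend on a TPU runtime.
        self._get_num_graphs = get_num_graphs
        self._enabled = enabled
        self._baseline: Optional[int] = None

    def record_after_warmup(self) -> None:
        # Snapshot the number of cached compilation graphs once warm-up is done.
        self._baseline = self._get_num_graphs()

    def check_after_step(self) -> None:
        # Do nothing when disabled or when no baseline has been recorded yet.
        if not self._enabled or self._baseline is None:
            return
        current = self._get_num_graphs()
        assert current == self._baseline, (
            f"Recompilation after warm-up: expected {self._baseline} "
            f"cached graphs, found {current}"
        )
```

A runner would call `record_after_warmup()` once after its warm-up pass and `check_after_step()` after each execution step. Keeping the check behind the environment variable means normal runs are unaffected unless the assertion is explicitly enabled.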