[core] Add tags parameter to wake_up() #15500
Conversation
Signed-off-by: Eric <erictang000@gmail.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a limited subset of checks runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Also cc @youkaichao
Signed-off-by: Eric <erictang000@gmail.com>
vllm/executor/executor_base.py (outdated)
```diff
     logger.warning("Executor is not sleeping.")
     return
 time_before_wakeup = time.perf_counter()
-self.collective_rpc("wake_up")
+self.collective_rpc("wake_up", kwargs=dict(tags=tags))
 time_after_wakeup = time.perf_counter()
 self.is_sleeping = False
 logger.info("It took %.6f seconds to wake up.",
```
We should add the wake-up tags to the logging. We should also track how many tags are sleeping / woken up, and set `self.is_sleeping = False` only after all tags are woken up. Since you cannot access the allocator in the executor, I'm fine with hard-coding `sleeping_tags = ("weights", "kv_caches")` when we call `sleep()`.
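The bookkeeping the reviewer suggests could look roughly like the following sketch. This is not the actual vLLM executor code; the class and the hard-coded tag names just mirror the suggestion above, and `sleep()`/`wake_up()` are reduced to the tag-tracking logic only.

```python
# Sketch of tag-aware sleep state tracking, per the review suggestion:
# is_sleeping is only cleared once every sleeping tag has been woken up.
class ExecutorSketch:
    def __init__(self) -> None:
        self.is_sleeping = False
        self.sleeping_tags: set[str] = set()

    def sleep(self) -> None:
        # Hard-coded, since the executor cannot query the allocator.
        self.sleeping_tags = {"weights", "kv_caches"}
        self.is_sleeping = True

    def wake_up(self, tags=None) -> None:
        # No tags means "wake everything", preserving the old behavior.
        woken = set(self.sleeping_tags) if tags is None else set(tags)
        self.sleeping_tags -= woken
        if not self.sleeping_tags:
            self.is_sleeping = False


executor = ExecutorSketch()
executor.sleep()
executor.wake_up(tags=["weights"])
assert executor.is_sleeping          # kv_caches still asleep
executor.wake_up(tags=["kv_caches"])
assert not executor.is_sleeping      # all tags woken up
```

With this shape, the wake-up log line can also include the `woken` set, as requested.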
added!
```python
@create_new_process_for_each_test()
@pytest.mark.parametrize("model, use_v1", [("meta-llama/Llama-3.2-1B", True),
                                           ("meta-llama/Llama-3.2-1B", False)])
def test_end_to_end_with_tags(monkeypatch: pytest.MonkeyPatch, model: str,
```
IMO it's too heavy to test both with and without tags in the CI. Let's remove this test and only test it in the API server then.
Moved the test logic under the existing `test_end_to_end` to avoid the reinitialization for now, if that's better? Can also delete it entirely, but it's probably good to keep a check that memory utilization looks correct with the `wake_up("weights")` call, since that's the core motivation for this PR.
Signed-off-by: Eric <erictang000@gmail.com>
This pull request has merge conflicts that must be resolved before it can be merged.
…up_tags Signed-off-by: Eric <erictang000@gmail.com>
LGTM. Just nits. Leave to @youkaichao
Signed-off-by: Eric <erictang000@gmail.com>
This pull request has merge conflicts that must be resolved before it can be merged.
LGTM in general, thanks for adding the functionality! One comment: please use `query_params` instead of `json`, to keep consistent with the rest of the code. In addition, update the tests to use `params`, as is done in #14373.
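The server-side change the reviewer asks for amounts to reading the tags from the URL query string rather than a JSON body. A small stand-in sketch (the real endpoint would use the web framework's `query_params` accessor; the function name here is illustrative, and `parse_qs` from the standard library plays the role of the framework's parser):

```python
from typing import Optional
from urllib.parse import parse_qs


def tags_from_query_string(query: str) -> Optional[list[str]]:
    """Collect repeated ?tags=... entries from a raw query string.

    No entries returns None, meaning "wake everything", which keeps the
    default wake_up() behavior unchanged.
    """
    values = parse_qs(query).get("tags")
    return values if values else None


# e.g. POST /wake_up?tags=weights wakes only the weights
assert tags_from_query_string("tags=weights") == ["weights"]
# repeated parameters accumulate into a list
assert tags_from_query_string("tags=weights&tags=kv_cache") == ["weights", "kv_cache"]
# no tags at all -> wake everything
assert tags_from_query_string("") is None
```

On the client/test side, this matches passing `params={"tags": [...]}` to the HTTP client instead of `json=...`, as the comment suggests.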
Signed-off-by: Eric <erictang000@gmail.com>
…up_tags Signed-off-by: Eric <erictang000@gmail.com>
fixed!
Please fix the merge conflict
This pull request has merge conflicts that must be resolved before it can be merged.
…up_tags Signed-off-by: Eric <erictang000@gmail.com>
fixed!
Signed-off-by: Eric <erictang000@gmail.com>
Signed-off-by: Eric <erictang000@gmail.com> Signed-off-by: xinyuxiao <xinyuxiao2024@gmail.com>
Signed-off-by: Eric <erictang000@gmail.com> Signed-off-by: Louis Ulmer <ulmerlouis@gmail.com>
Signed-off-by: Eric <erictang000@gmail.com>
This is a memory optimization method implemented based on this [fix](vllm-project/vllm#15500). I just successfully ran a 72B model on 8*H800 cards. Before the fix, I would encounter an OOM issue. Please note that this fix is only effective for vLLM >= 0.8.3.
Addresses #15254
Adds an optional `tags` parameter to all calls to `wake_up()` (for both online and offline mode). The previous behavior of calling `wake_up()` with no arguments remains unchanged (it reallocates both weights and kv_cache together), but the user now has the option to call `wake_up(tags=["weights"])` followed by `wake_up(tags=["kv_cache"])`, in order to support better weight updating for RLHF (more details in #15254).
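To see why staging the wake-up helps, consider peak GPU memory during an RLHF weight update. The toy model below is purely illustrative (the sizes are made-up numbers, not vLLM measurements): waking only the weights leaves headroom to stage the updated weights before the KV cache is reallocated, whereas a single `wake_up()` puts the update buffer on top of both.

```python
def peak_memory(staged: bool,
                weights: int = 40,
                kv_cache: int = 30,
                update_buffer: int = 20) -> int:
    """Toy peak-memory model of an RLHF weight update (arbitrary units)."""
    if staged:
        # wake_up(tags=["weights"]) -> copy in new weights -> free the
        # update buffer -> wake_up(tags=["kv_cache"]).
        # Peak is whichever phase is larger; the buffer and the KV cache
        # never coexist.
        return max(weights + update_buffer, weights + kv_cache)
    # A single wake_up() reallocates weights and KV cache together,
    # so the update buffer lands on top of both at once.
    return weights + kv_cache + update_buffer


assert peak_memory(staged=True) < peak_memory(staged=False)
```

With these example sizes the staged path peaks at 70 units versus 90 for the unstaged path, which is the kind of headroom the OOM reports above are about.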
Also fixes tests/entrypoints/openai/test_sleep.