Clear runner state of previous executions before warm starts #1905

Merged
nicktrn merged 10 commits into main from fix/runner-clean-slate
Apr 9, 2025

Conversation

nicktrn
Collaborator

@nicktrn nicktrn commented Apr 9, 2025

The main change here is clearing prior state, i.e. the previous run execution and its completion, before proceeding to the warm start phase. This also adds excessive debug logs (we'll have to disable these at some point).

Summary by CodeRabbit

  • New Features

    • Introduced a configurable wait period before process suspension for customizable control.
    • Enabled explicit process suspension with refined error reporting.
  • Refactor

    • Enhanced logging and diagnostic messaging for clearer tracking of process status and execution states.
    • Updated execution controls to support flexible handling of run attempts, including warm start behavior.


changeset-bot bot commented Apr 9, 2025

⚠️ No Changeset found

Latest commit: 74cca5a

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Contributor

coderabbitai bot commented Apr 9, 2025

Walkthrough

This pull request introduces several enhancements across multiple modules. The changes improve logging in the run controller by incorporating environment variables and detailed contextual data. In addition, the task execution flow now supports process suspension with refined error handling, including a new SuspendedProcessError. The updates also allow for a warm start option during task execution through modified method signatures in both the task run process and task executor.

Changes

| File(s) | Change Summary |
|---|---|
| packages/cli-v3/src/entryPoints/managed-run-controller.ts | Added new environment variable TRIGGER_PRE_SUSPEND_WAIT_MS (default: 200ms). Enhanced logging: replaced console logs with sendDebugLog including additional context. Refined snapshot handling and updated method signatures for better run attempt management. |
| packages/cli-v3/src/executions/taskRunProcess.ts | Introduced a new private property _isBeingSuspended to track suspension state. Updated execute method to accept an optional isWarmStart flag. Added a public suspend method that triggers process suspension using a SIGKILL signal and adjusts error handling in the exit flow. |
| packages/core/src/v3/errors.ts | Added a new error class SuspendedProcessError extending the native Error class, designed to represent a suspended process state. |
| packages/core/src/v3/workers/taskExecutor.ts | Modified the execute method signature to take an optional isWarmStart parameter, updating the logic that sets the warm start state during task execution. |

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Controller as ManagedRunController
  participant Executor as TaskExecutor
  participant Process as TaskRunProcess

  Client->>Controller: startAndExecuteRunAttempt(...)
  Controller->>Controller: Log run start details (env & context)
  Controller->>Executor: execute(..., isWarmStart?)
  Executor->>Process: execute(..., isWarmStart?)
  alt Process triggers suspension
      Process->>Process: Set _isBeingSuspended = true
      Process->>Process: kill("SIGKILL")
      Process->>Controller: Return SuspendedProcessError
      Controller->>Client: Log suspension event with detailed context
  else Normal execution
      Process->>Controller: Return successful execution outcome
      Controller->>Client: Log execution events and status updates
  end

Suggested reviewers

  • ericallam

Poem

I'm a little rabbit, hopping on code paths so bright,
With logs that sing and variables that shine in the night.
I suspend and I execute with a twitch of my ear,
My code runs smooth, never a bug to fear.
Hoppy changes abound—let's celebrate with delight!
🥕🐇


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8382274 and 74cca5a.

📒 Files selected for processing (4)
  • packages/cli-v3/src/entryPoints/managed-run-controller.ts (35 hunks)
  • packages/cli-v3/src/executions/taskRunProcess.ts (6 hunks)
  • packages/core/src/v3/errors.ts (1 hunks)
  • packages/core/src/v3/workers/taskExecutor.ts (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
packages/core/src/v3/workers/taskExecutor.ts (3)
packages/core/src/v3/timeout/api.ts (1)
  • signal (27-29)
packages/core/src/v3/timeout/usageTimeoutManager.ts (1)
  • signal (12-14)
packages/core/src/v3/taskContext/index.ts (1)
  • isWarmStart (34-36)
packages/cli-v3/src/entryPoints/managed-run-controller.ts (2)
packages/core/src/v3/workers/taskExecutor.ts (3)
  • execution (997-1120)
  • execution (1247-1263)
  • result (1123-1170)
packages/core/src/v3/errors.ts (1)
  • SuspendedProcessError (500-506)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: units / 🧪 Unit Tests
  • GitHub Check: typecheck / typecheck
🔇 Additional comments (39)
packages/core/src/v3/errors.ts (1)

500-506: Add new SuspendedProcessError class
This explicit error type helps represent a suspended process state more clearly.
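
As a rough illustration of why a dedicated error type helps, the sketch below shows a class extending the native Error and a caller that tells it apart with instanceof. The default message, the `name` assignment, and the `describeFailure` helper are assumptions made for this example; the actual class in packages/core/src/v3/errors.ts may differ.

```typescript
// Hypothetical sketch; the real SuspendedProcessError may carry other fields or messages.
class SuspendedProcessError extends Error {
  constructor(message: string = "Process was suspended") {
    super(message);
    this.name = "SuspendedProcessError";
    // Keeps instanceof reliable when compiling to older JavaScript targets.
    Object.setPrototypeOf(this, SuspendedProcessError.prototype);
  }
}

// Hypothetical helper: classify an unknown thrown value.
function describeFailure(error: unknown): string {
  if (error instanceof SuspendedProcessError) {
    return "suspended";
  }
  return error instanceof Error ? "failed" : "unknown";
}
```

With a distinct class, a caller can branch on suspension without parsing error messages.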

packages/core/src/v3/workers/taskExecutor.ts (2)

92-93: Extended method signature to accept isWarmStart
No issues found. This optional parameter cleanly integrates into the existing function signature.


106-106: Propagate isWarmStart in task context
Good fallback logic with isWarmStart ?? this._isWarmStart; no concerns here.
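
The fallback quoted above relies on the nullish coalescing operator, which only falls back when the left side is null or undefined. A minimal sketch, assuming a stand-in class named `ExecutorSketch` (not the real TaskExecutor) with an instance-level default:

```typescript
// Hypothetical stand-in; only the `??` fallback mirrors the review comment.
class ExecutorSketch {
  private readonly _isWarmStart: boolean | undefined;

  constructor(isWarmStart: boolean | undefined) {
    this._isWarmStart = isWarmStart;
  }

  // An explicit argument wins, including an explicit `false`;
  // only `undefined` or `null` falls back to the instance default.
  resolveWarmStart(isWarmStart?: boolean): boolean | undefined {
    return isWarmStart ?? this._isWarmStart;
  }
}
```

Had the fallback used `||` instead, an explicit `false` would be overridden by a `true` instance default, which is presumably why `??` is the better fit here.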

packages/cli-v3/src/executions/taskRunProcess.ts (5)

32-32: Import SuspendedProcessError
No issues here; consistent with updated error handling.


77-77: Introduce _isBeingSuspended field
Sets default to false and enables clear tracking of the suspension state.


218-221: Extend execute(...) signature with isWarmStart
This addition is consistent with the warm start logic across the codebase.


355-359: Reject with SuspendedProcessError if _isBeingSuspended is set
Ensures the exit flow rejects with the correct error type, reflecting the process's suspension.


440-443: New suspend() method
This sets _isBeingSuspended and kills the process with SIGKILL to trigger suspension handling.
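
The interplay between the flag, the SIGKILL, and the exit flow could look roughly like the sketch below. The `Killable` interface, the `TaskRunProcessSketch` class, and the `errorForExit` method are hypothetical names for illustration; the real TaskRunProcess wires this into its child process exit handler and may differ in detail.

```typescript
// Minimal local error type so the sketch is self-contained.
class SuspendedProcessError extends Error {}

// Hypothetical minimal shape of the child handle; Node's ChildProcess has a compatible kill().
interface Killable {
  kill(signal: string): boolean;
}

class TaskRunProcessSketch {
  private _isBeingSuspended = false;
  private readonly child: Killable;

  constructor(child: Killable) {
    this.child = child;
  }

  // Mark the process as being suspended first, then kill it,
  // so the exit path can tell a suspension from a crash.
  suspend(): void {
    this._isBeingSuspended = true;
    this.child.kill("SIGKILL");
  }

  // Assumed to be called from the exit handler to pick the rejection error.
  errorForExit(code: number | null, signal: string | null): Error {
    if (this._isBeingSuspended) {
      return new SuspendedProcessError("Process was suspended");
    }
    return new Error(`Process exited with code ${code} and signal ${signal}`);
  }
}
```

Setting the flag before sending the signal matters: the exit event fires after the kill, so the flag is already in place when the exit path chooses an error.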

packages/cli-v3/src/entryPoints/managed-run-controller.ts (31)

11-11: Import SuspendedProcessError
No issues; aligns with the new error handling strategy.


59-59: Add TRIGGER_PRE_SUSPEND_WAIT_MS
Provides a configurable wait time before suspending. This is a useful addition.


174-185: Initialize controller log properties
Populating these properties for debug logs is a good practice.


186-188: Set heartbeat/snapshot intervals
No issues found; it clearly sets default intervals.


190-191: Initialize metadataClient
Allows dynamic environment overrides. Looks good.


208-211: Log snapshot poll skipping
Helps clarify execution flow when no run is present.


231-236: Log snapshot poll failure
Useful for diagnosing poll issues.


256-260: Log polling error
No concerns here; improves visibility into failures.


267-269: Log skipping heartbeat
Makes sense when run or snapshot ID is missing.


274-276: Log heartbeat started
Further clarity on lifecycle events, no issues.


305-309: Log SIGTERM
Provides helpful debugging information for graceful shutdown.


358-361: Log snapshot not changed
No concerns; ensures we skip redundant updates.


401-405: Log skipping exit from run phase
No issues; good for clarifying logic branches.


411-415: Log same run
Prevents unnecessary transition if the run ID matches.


420-423: Log exiting run phase
Clearly indicates the transition out of a run phase.


482-484: Log handleSnapshotChange locked
No issues; short-circuits concurrent snapshot updates.


492-499: Log missing snapshot
Straightforward debug log for clarity.


515-518: Log snapshot not changed
Good for diagnosing whether updates to snapshots are needed.


563-569: Log failed to cancel attempt
No concerns; helps diagnose cancellation issues.


578-581: Log run finished
Indicates the run's terminal state.


584-587: Suspend run if there's an active execution
Ensures clearing prior state before finishing.


624-624: Wait using TRIGGER_PRE_SUSPEND_WAIT_MS
Implements a pause before suspension; this is a sensible approach.
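
For readers unfamiliar with the pattern, a configurable pre-suspend wait can be sketched as an environment lookup with a default plus a promise-based sleep. The `readWaitMs` and `sleep` helpers are hypothetical; only the variable name and the 200ms default come from the change summary, and the controller's actual parsing may differ.

```typescript
// Hypothetical helper: read a millisecond value from an env-like record,
// falling back to the default when the value is missing or invalid.
function readWaitMs(
  env: Record<string, string | undefined>,
  fallbackMs: number = 200
): number {
  const raw = env["TRIGGER_PRE_SUSPEND_WAIT_MS"];
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallbackMs;
}

// Promise-based sleep built on setTimeout.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

Inside an async function, the wait would then be `await sleep(readWaitMs(process.env))`, after which the controller can re-check whether the snapshot changed before suspending.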


628-634: Log snapshot changed after threshold
No issues; clarifies concurrency timing.


640-647: Log missing run ID or snapshot ID after suspension
No concerns; handles edge case.


656-663: Log failed to suspend run
Helps track suspension failures.


690-694: Log run is suspending
Acknowledges the suspension request.


705-706: Suspend run
Triggers the new SuspendedProcessError approach.


711-714: Log run pending execution
No issues; a minor but clear logging addition.


887-888: Add isWarmStart param
No concerns; consistent with PR objective of clearing state between warm starts.


897-898: Skip lock check for immediate retry
Helps concurrency by bypassing the lock under certain conditions.


1003-1004: Suspend run logic on SuspendedProcessError
Allows the system to handle suspended runs gracefully.


@nicktrn nicktrn merged commit 98a3cfb into main Apr 9, 2025
12 checks passed
@matt-aitken matt-aitken deleted the fix/runner-clean-slate branch April 9, 2025 14:53
2 participants