planning: Automation testing strategy for Jan #4907

david-menloai opened this issue Apr 14, 2025 · 1 comment

Problem Statement:

Currently, creating automated UI tests involves a significant amount of manual effort and technical expertise. The typical process includes:

  • Manually Identifying Element Locators: Finding stable CSS selectors, XPaths, IDs, etc., for each UI element the test interacts with. This is often brittle and time-consuming.
  • Coding User Interactions: Writing code (e.g., using Selenium, Playwright, Cypress) to simulate user actions like clicking buttons, typing text, and selecting dropdowns, based on the identified locators.
  • Structuring Tests: Grouping individual actions into logical test steps and combining steps to form complete test scenarios.
  • Maintenance: Updating locators and scripts whenever the UI changes.

This multi-step, code-intensive process acts as a bottleneck: it slows down test creation, increases maintenance overhead, and requires specialized automation engineers. The sketch below illustrates what this looks like in practice.
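To make the problem concrete, here is what a hand-written Playwright test for the hardware-usage scenario might look like today. The selectors and URL are invented for illustration (they are not Jan's real DOM attributes), but they show the locator-heavy, brittle code every scenario currently requires:

```typescript
import { test, expect } from '@playwright/test';

test('view hardware usage', async ({ page }) => {
  await page.goto('http://localhost:3000'); // hypothetical dev-server URL

  // Hard-coded selectors like these break as soon as a class name,
  // label, or layout detail changes -- exactly the maintenance burden
  // described above.
  await page.locator('div.footer button.system-monitor-toggle').click();
  await expect(page.locator('span.cpu-usage-value')).toBeVisible();
  await expect(page.locator('span.memory-usage-value')).toBeVisible();
});
```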

Proposed Solution:

We propose the development of an AI Test Agent designed to drastically simplify UI test automation. The core idea is to shift the low-level implementation details (locator finding, action simulation, step coding) from the human user to an AI.

The intended workflow would be:

  1. User Input: The user provides high-level test scenarios written in a Behavior-Driven Development (BDD) format (e.g., Gherkin syntax: Given/When/Then). Example:

```gherkin
Scenario: View Jan hardware usage
  Given I open the Jan application
  When I click the 'System Monitor' button on the bottom right
  Then I should see current hardware usage for CPU and Memory
```

  2. AI Test Agent Processing: The AI agent takes the BDD scenario as input and performs the following (a minimal sketch of this loop follows the list):
  • Parses BDD Steps: Understands the intent described in each Given/When/Then step.
  • Interprets Actions: Maps natural-language descriptions (e.g., "click the 'System Monitor' button") to corresponding UI interactions.
  • Dynamic Element Location: Intelligently identifies the target UI elements (e.g., the button labeled "System Monitor", the labels "CPU" and "Memory") at runtime, potentially using a combination of visual AI and DOM analysis, reducing reliance on brittle selectors.
  • Executes Actions: Uses an underlying browser automation driver (Playwright) to perform the identified actions on the application under test.
  • Performs Validations: Checks for the expected outcomes described in the "Then" steps.
  3. Output: The agent executes the test and reports the results (pass/fail) for the BDD scenario.
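As a sketch of that loop: the agent parses each step, hands the step plus a compact page snapshot to a model, and executes the structured action the model returns via Playwright. Everything below is illustrative rather than a committed design: the `AgentAction` schema and `askModel` helper are invented placeholders; only the Playwright calls are real API.

```typescript
import { Page } from 'playwright';

// Hypothetical action schema the model would be asked to emit per step.
type AgentAction =
  | { kind: 'click'; target: string }
  | { kind: 'type'; target: string; value: string }
  | { kind: 'assertVisible'; target: string };

// Placeholder for the LLM call: given a BDD step and a page snapshot,
// return one structured action. The real prompt/model is out of scope.
async function askModel(step: string, snapshot: string): Promise<AgentAction> {
  throw new Error('LLM integration not implemented in this sketch');
}

async function runScenario(page: Page, steps: string[]): Promise<void> {
  for (const step of steps) {
    // Full page HTML can exceed the model's context window on large
    // DOMs (see the comment further down), so a trimmed snapshot is safer.
    const snapshot = await page.content();
    const action = await askModel(step, snapshot);

    switch (action.kind) {
      case 'click':
        // Locate by visible text rather than brittle CSS/XPath.
        await page.getByText(action.target, { exact: false }).first().click();
        break;
      case 'type':
        await page.getByLabel(action.target).fill(action.value);
        break;
      case 'assertVisible':
        if (!(await page.getByText(action.target).first().isVisible())) {
          throw new Error(`Step failed: expected to see "${action.target}"`);
        }
        break;
    }
  }
}
```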

Goals & Motivation:

  • Reduce Manual Effort: Significantly decrease the time and effort required to create and maintain UI automation scripts.
  • Accelerate Test Creation: Enable faster development of new automated tests.
  • Lower Technical Barrier: Allow team members with less coding experience (e.g., manual QAs, BAs) to contribute effectively to test automation by focusing on writing BDD scenarios.
  • Improve Maintainability: Reduce test fragility by relying on AI for element location, potentially making tests more resilient to minor UI changes.
  • Focus on Behavior: Shift the team's focus from how to automate (implementation details) to what to test (application behavior).

Acceptance Criteria (Initial High-Level):

  • The system accepts BDD feature files or plain text scenarios as input.
  • The AI Test Agent can parse common Gherkin keywords (Given, When, Then, And, But).
  • The AI agent can interpret and execute basic UI actions (e.g., navigate, click, type, select, verify text presence).
  • The AI agent can dynamically locate target UI elements based on textual descriptions (e.g., labels, placeholders, button text) within the BDD steps (see the locator sketch after this list).
  • Test execution results are reported clearly, indicating pass/fail status for scenarios/steps.
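On the dynamic-location criterion: Playwright's built-in semantic locators (`getByRole`, `getByLabel`, `getByPlaceholder`, `getByText`) already map textual descriptions to elements, so the agent could fall back through them. A minimal sketch; the strategy ordering is an assumption, not a settled design:

```typescript
import { Page, Locator } from 'playwright';

// Resolve a textual description from a BDD step to a locator.
// Locator.or() matches elements satisfying either alternative, so
// whichever semantic strategy finds the element contributes matches.
function locateByDescription(page: Page, description: string): Locator {
  return page
    .getByRole('button', { name: description })        // button text
    .or(page.getByLabel(description))                  // form labels
    .or(page.getByPlaceholder(description))            // placeholders
    .or(page.getByText(description, { exact: false })) // visible text
    .first(); // avoid strict-mode failures when several strategies match
}

// Usage, e.g. for the System Monitor step:
//   await locateByDescription(page, 'System Monitor').click();
```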

Potential Challenges & Considerations:

  • Reliability and accuracy of AI-driven element identification.
  • Handling ambiguity in natural language BDD steps.
  • Performance overhead of AI analysis during test execution.
  • Integration with existing testing frameworks and CI/CD pipelines.
  • Managing complex interactions, waits, and synchronization issues (see the note below).
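On the last point, one mitigating factor: Playwright's web-first assertions retry until a timeout, which absorbs most rendering and async delays without hand-rolled sleeps, so the agent's "Then" validations could lean on them. A minimal illustration (the 10-second timeout is an arbitrary example value):

```typescript
import { expect, Page } from '@playwright/test';

// expect(locator).toBeVisible() polls until the element appears or the
// timeout expires -- no explicit waits or sleeps needed in agent code.
async function verifyTextVisible(page: Page, text: string): Promise<void> {
  await expect(page.getByText(text).first()).toBeVisible({ timeout: 10_000 });
}
```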
david-menloai added the QA label on Apr 14, 2025
david-menloai self-assigned this on Apr 14, 2025
github-project-automation bot moved this to Investigating in Menlo on Apr 14, 2025

david-menloai commented Apr 14, 2025

A problem with the current tool-based implementation is the context length limitation.

```
{'error': {'message': "This model's maximum context length is 128000 tokens. However, your messages resulted in 206607 tokens (205528 in the messages, 1079 in the functions). Please reduce the length of the messages or functions.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
```

❌ This happens with a demo website with a huge DOM 😅
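One mitigation worth prototyping: prune the snapshot before it reaches the model, since most of a large DOM (scripts, styles, deeply nested wrappers) carries no interaction signal. A rough sketch; the choice of which elements count as "interactive" is a guess, and an accessibility-tree snapshot could serve the same purpose:

```typescript
import { Page } from 'playwright';

// Instead of sending page.content() (the full HTML) to the model,
// extract only interactive elements plus their visible text/labels.
async function compactSnapshot(page: Page): Promise<string> {
  return page.evaluate(() => {
    const selector =
      'button, a, input, select, textarea, [role], [aria-label]';
    const lines: string[] = [];
    document.querySelectorAll(selector).forEach((el) => {
      const text = (el.textContent ?? '').trim().slice(0, 80);
      const label = el.getAttribute('aria-label') ?? '';
      lines.push(`<${el.tagName.toLowerCase()}> "${text || label}"`);
    });
    return lines.join('\n');
  });
}
```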
