Skip to content

Avoid using Regex for matching URL paths when possible as doing so causes more of the request time to be spent in regex then doing the API call itself. #24873

New issue

Have a question about this project? No Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “No Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? No Sign in to your account

Closed
LukeButters opened this issue Apr 7, 2020 · 12 comments
Labels
status: declined A suggestion or change that we don't feel we should currently apply

Comments

@LukeButters
Copy link

LukeButters commented Apr 7, 2020

Affects: I don't know of a version this does not affect.

Hi, profiling reveals that a bunch of time is spent in in this stack trace:

    at java.base@11.0.3/java.util.regex.Matcher.<init>(Matcher.java:248)
    at java.base@11.0.3/java.util.regex.Pattern.matcher(Pattern.java:1133)
    at org.springframework.util.AntPathMatcher$AntPathStringMatcher.matchStrings(AntPathMatcher.java:686)
    at org.springframework.util.AntPathMatcher.matchStrings(AntPathMatcher.java:421)
    at org.springframework.util.AntPathMatcher.doMatch(AntPathMatcher.java:218)
    at org.springframework.util.AntPathMatcher.match(AntPathMatcher.java:177)
    at org.springframework.web.servlet.mvc.condition.PatternsRequestCondition.getMatchingPattern(PatternsRequestCondition.java:258)
    at org.springframework.web.servlet.mvc.condition.PatternsRequestCondition.getMatchingPatterns(PatternsRequestCondition.java:223)
    at org.springframework.web.servlet.mvc.condition.PatternsRequestCondition.getMatchingCondition(PatternsRequestCondition.java:205)
    at org.springframework.web.servlet.mvc.method.RequestMappingInfo.getMatchingCondition(RequestMappingInfo.java:229)
    at org.springframework.web.servlet.mvc.method.RequestMappingInfoHandlerMapping.getMatchingMapping(RequestMappingInfoHandlerMapping.java:93)
    at org.springframework.web.servlet.mvc.method.RequestMappingInfoHandlerMapping.getMatchingMapping(RequestMappingInfoHandlerMapping.java:57)
    at org.springframework.web.servlet.handler.AbstractHandlerMethodMapping.addMatchingMappings(AbstractHandlerMethodMapping.java:428)
    at org.springframework.web.servlet.handler.AbstractHandlerMethodMapping.lookupHandlerMethod(AbstractHandlerMethodMapping.java:394)
    at org.springframework.web.servlet.handler.AbstractHandlerMethodMapping.getHandlerInternal(AbstractHandlerMethodMapping.java:368)
    at org.springframework.web.servlet.handler.AbstractHandlerMethodMapping.getHandlerInternal(AbstractHandlerMethodMapping.java:65)
    at org.springframework.web.servlet.handler.AbstractHandlerMapping.getHandler(AbstractHandlerMapping.java:401)
    at org.springframework.web.servlet.DispatcherServlet.getHandler(DispatcherServlet.java:1231)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1014)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:942)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1005)
    at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:897)

I think this is trying to work out which API method to call and to do that it is matching the URL path against a regex. Regex is extremely slow, by using regex here it makes the APIs spend most of their time in this rather than doing the actual work. This is not good especially when we want to use spring to slice up an application served over many APIs e.g. micro services. I think a case in which this gets especially bad is when many APIs exist, I think this does a linear search :(.

One step towards not having all time spent in this regex matching would be to create simple matchers for some of the more trivial and common cases like those where the API method is not presenting a complex regex. I think some thought might need to go into this, you may want to collection the value of a bunch of @RequestMappings to find out what common styles of request mappings are in use.

I think this is looking at all @RequestMapping trying to find which ones match, perhaps what might also work is having the matcher defined as two sections a quick match which is fast to exclude (e.g. path length, starts with X etc) and if that matches then go on to the slow regex matching.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Apr 7, 2020
@LukeButters LukeButters changed the title Avoid using Regex for matching URL paths when possible Avoid using Regex for matching URL paths when possible as doing so causes more of the request time to be spent in regex then doing the API call itself. Apr 7, 2020
@encircled
Copy link
Contributor

Hi,

regex matching will be skipped, if the requested location equals the mapping string. Can you provide a sample please?

@LukeButters
Copy link
Author

In general most URLs looks like:

@RequestMapping(value = "/something/things/foos/{foo}/bar/{id}")

In some cases when . is allowed in the values we have to work around some default in which dots are excluded this is done with this trivial regex.

@RequestMapping(value = "/something/things/foos/{foo}/bar/{id:.+}")

The other complex case we have is where the value ends with /**.

From memory when I last looked deep into the Antmatcher stuff I think in both of those cases it makes a regex and does a regex match.

@encircled
Copy link
Contributor

I did a small research and it is possible to increase from 440 ops/ms to 710 ops/ms (on my machine ;)) for cases like "/something/things/foos/{foo}/bar/{id}" by applying:

  • skip regex if the pattern is .* anyway
  • use strings comparison instead of regex matching, when the path token is a constant

However, I can't say whether it make sense to implement solution like this, need judgement from someone from spring team

@LukeButters
Copy link
Author

Do you mind sharing the code you did to test that?

@encircled
Copy link
Contributor

Sure, I will prepare the PR later

@bclozel
Copy link
Member

bclozel commented Apr 9, 2020

As a side note, I'd like to point out that we're aware that AntPathMatcher is not optimized for matching request path patterns (memory allocation, CPU). We've introduced PathPattern and PathPatternParser in Spring WebFlux since.

This implementation is not currently the default in Spring MVC as it's a breaking change - but it's a change we're considering for future versions.

We'll gladly consider improvements in AntPathMatcher but please consider:

  • running the whole Spring Framework test suite (./gradlew check) before sending PRs, as AntPathMatcher is used in many places and it's easy to get unintended consequences.
  • contribute sample patterns and requests in this issue so we could run benchmark against implementations.

Thanks!

@encircled
Copy link
Contributor

Code style issues fixed :)

I left just one sample in void matchPerformanceTest(), but it covers both cases I've described

@bclozel
Copy link
Member

bclozel commented Apr 9, 2020

I've benchmarked the proposed PR and also variants of these changes. I've reused a benchmark that we used for testing the PathPatternParser implementation.

Judging from the results, I don't think this is worth pursuing right now. We'll gain way more by switching to the new implementation.

Thanks all!

@bclozel bclozel closed this as completed Apr 9, 2020
@bclozel bclozel added status: declined A suggestion or change that we don't feel we should currently apply and removed status: waiting-for-triage An issue we've not yet triaged or decided on labels Apr 9, 2020
@LukeButters
Copy link
Author

Is it at least possible to switch to the new implementation?

@bclozel
Copy link
Member

bclozel commented Apr 10, 2020

@LukeButters this is not possible right now as there is a behavior and configuration change. We’ll consider a migration path in a future Spring version.

@encircled
Copy link
Contributor

Can you share your test scenario please? What I see with JMH is 767.614 ns/op vs 344.477 ns/op on AntPathMatcher for cases when the URL consist of constants and path variables

encircled added a commit to encircled/spring-framework that referenced this issue Apr 13, 2020
@rstoyanchev
Copy link
Contributor

As a side note, I'd like to point out that we're aware that AntPathMatcher is not optimized for matching request path patterns (memory allocation, CPU). We've introduced PathPattern and PathPatternParser in Spring WebFlux since.

This implementation is not currently the default in Spring MVC as it's a breaking change - but it's a change we're considering for future versions.

There is an issue for that now #24945.

No Sign up for free to join this conversation on GitHub. Already have an account? No Sign in to comment
Labels
status: declined A suggestion or change that we don't feel we should currently apply
Projects
None yet
Development

No branches or pull requests

5 participants