-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[benchmark] Add ReplaceSubrange benchmark #25310
New issue
Have a question about this project? No Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “No Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? No Sign in to your account
Conversation
@milseman I'm thinking this is in your wheelhouse. |
@swift-ci please smoke test |
@swift-ci please benchmark |
Performance: -O
Code size: -OPerformance: -Osize
Code size: -OsizePerformance: -Onone
Code size: -swiftlibs
How to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
@keitaito This is a great start, but can you address the issues listed at the bottom of the benchmark comment, please? |
@keitaito I suggest naming these benchmarks per naming convention as
and the file as |
You need to halve the loop multiplier. Mind the 80 character line limit. I would also just create a single testing function and pass the let t: [BenchmarkCategory] = [.validation, .api, .String]
public let StringReplaceSubrange = [
BenchmarkInfo(name: "String.replaceSubrange.SmallLiteral",
runFunction: { replaceSubrange($0, "coffee" }, tags: t),
BenchmarkInfo(name: "String.replaceSubrange.LargeLiteral",
runFunction: { replaceSubrange($0, "coffee\u{301}coffeecoffeecoffee" },
tags: t),
BenchmarkInfo(name: "ReplaceSubrangeWithLargeManagedString",
runFunction: { replaceSubrange($0, largeManaged) }, tags: t,
setUpFunction: { blackHole(largeManaged) }),
]
@inline(never)
public func replaceSubrange(_ N: Int, _ string: String) {
var copy = string
let range = string.startIndex..<string.index(after: string.startIndex)
for _ in 0 ..< 5_000*N {
copy.replaceSubrange(range, with: "t")
}
} @milseman Does it make sense to test range replacement with a single character that doesn't change the size? How about something like this? @inline(never)
public func replaceSubrange(_ N: Int, _ string: String) {
var s = string
for _ in 0 ..< 2_500*N {
let ff =
s.index(s.startIndex, offsetBy:2)..<s.index(s.startIndex, offsetBy:4)
s.replaceSubrange(ff, with: "☕️")
let cup =
s.index(s.startIndex, offsetBy:2)..<s.index(s.startIndex, offsetBy:3)
copy.replaceSubrange(cup, with: "ff")
}
CheckResults(s == string)
} Also, since |
@milseman Could you also please verify my reasoning above regarding the Also, would it make sense to do a variant that would cross the boundary between the small and large representations? Or vary the benchmarks based on the position of the range, i.e. replacing from start, in the middle or just at the end, to stress the reallocations? |
@brentdax Thanks Brent for running tests 😄 @palimondo Hi Pavol, thanks for the great feedback! I will start working on easy fixes like re-naming the file and benchmarks, and fixing line length 🔨 |
I don't think @inline(never)
public func replaceSubrange(_ N: Int, _ string: String) {
var copy = getString(string)
let range = string.startIndex..<string.index(after: string.startIndex)
for _ in 0 ..< 5_000*N {
copy.replaceSubrange(range, with: "t")
}
} So there's no measurable overhead in the getString call, but it serves as an optimization barrier, should the optimizer ever do more inter-procedural optimizations that are not contingent on prior inlining.
I think a benchmark that grows a small string to a large one could be interesting. We don't often go the other way, and we don't shrink the string under normal operations (but I don't think it matters what that operation is, and edit: As for varying the location and triggering reallocations, I think that also would be better done as a separate dedicated benchmark. This benchmark stresses the overhead of the 2-way introspection and dispatch pattern that eventually results in splicing in the underlying bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good if you can incorporate @palimondo 's feedback.
After thinking about this more, the largeLiteral
is likely redundant with largeManaged
, because after the very first replaceSubrange()
call, it will be managed (due to mutation) for the subsequent 10_000
operations. The large vs small split is nice, though.
Per Michael's feedback (swiftlang#25310 (review)), largeLiteral is likely redundant with largeManaged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for taking a while to update this PR, but now this PR adds String.replaceSubrange(_:with:)
benchmarks with arguments of types String, Substring, Array, and Repeated. Would you mind taking a look at this PR again when you have time?
Also, while I was working on this PR, I've got a few questions related to Swift.String. Would you mind giving me some explanations on them if you have time? 🙇♂️
|
||
private var largeManagedString: String = { | ||
return getString("coffee\u{301}coffeecoffeecoffeecoffee") | ||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\u{301}
was suggested to be added to the test string when I paired with Michael. @milseman would you mind reminding me the reason for this? I don't remember it now 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll guess it was in order for the string to be in particular normalization form. @milseman Do you want to vary the benchmarks also for different normalization forms? SR-8905 doesn't mention that…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if the result of testing with the string "coffeécoffeecoffeecoffeecoffee" would be different from the one with "coffeecoffeecoffeecoffeecoffee" 🤔 If there is distinct difference, maybe we could add two benchmarks for the one with the acute accent character and the other one without it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grapheme segmentation is relevant to this benchmark, but not normalization (since there's no comparison). The difference is that the precomposed representation is a single scalar per grapheme cluster, while the decomposed (multi-scalar) form is not. The single-scalar one will hit our grapheme breaking fast-paths while the multi-scalar one will call out to ICU. Alternatively, you could use other kinds of multi-scalar graphemes clusters, such as complex emoji. I just mentioned "\u{301}"
because you can just stick it after an "e"
to get the same effect.
BenchmarkInfo( | ||
name: "Str.replaceSubrange.SmallLiteral.RepeatedChar", | ||
runFunction: { replaceSubrange($0, "coffee", with: getRepeatedCharacter(repeatedCharacter)) }, | ||
tags: tags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benchmark name "Str.replaceSubrange.SmallLiteral.RepeatedChar" is longer than 40 characters, but I couldn't think a better name fitting 40. Maybe it can be like "Str.replaceSubrange.LargeManagedRepChar", but I was concerned "RepChar" is a little bit hard to understand that it means Repeated<Character>
. @palimondo What do you think about this naming? Please let me know if you have any suggestions on it 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at what @milseman writes in SR-8905:
replaceSubrange<C: Collection>(_:C)
- Arguments of types
String
,Substring
,Array<Character>
,Repeated<Character>
, etc
I'd say the naming convention calls for base name of String.replaceSubrange
which varies across the argument type (String
, Substring
, ArrChar
, RepChar
) for the general case of large strings. Then we'll denote the special optimization for small strings with a simple .Small
suffix and we'll get these benchmarks:
String.replaceSubrange.String
String.replaceSubrange.Substring
String.replaceSubrange.ArrChar
String.replaceSubrange.RepChar
String.replaceSubrange.String.Small
String.replaceSubrange.Substring.Small
String.replaceSubrange.ArrChar.Small
String.replaceSubrange.RepChar.Small
The longest one is String.replaceSubrange.Substring.Small
at 39 characters, just under the 40 chars limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an awesome naming idea. I will use them. Thanks for your suggestion!
|
||
private func setupLargeManagedString() { | ||
_ = largeManagedString | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should largeManagedString
be called from setupFunction
closure before it is used for benchmarks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that in the replaceSubrange(::with:)
you already do the var copy = getString(string)
, this whole dance here is unnecessary. You should declare this as simple let largeString = "coffee\u{301}coffeecoffeecoffeecoffee"
and drop all the setUpFunction
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. Thanks for explaining!
let range = string.startIndex..<string.index(after: string.startIndex) | ||
for _ in 0 ..< 5_000 * N { | ||
copy.replaceSubrange(range, with: replacingString) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the criteria for choosing this multiplying number like 5000? Does this depend on the benchmark time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. We are just trying to size the workload to run in 20–1000 μs, so that it is in a measurement sweet spot for our system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining!
BenchmarkInfo( | ||
name: "Str.replaceSubrange.SmallLiteral.String", | ||
runFunction: { replaceSubrange($0, "coffee", with: "t") }, | ||
tags: tags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind explaining what is the difference between the small literal string vs the large managed string? Is this something related to this small string optimization? If the string fits 15 ASCII characters length, it won't be allocated in the heap memory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. See: _SmallString
and _StringGuts
for implementation details if you're interested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the links! I will take a look at them 🔍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small form accommodates 15 UTF-8 code units in length (not just ASCII)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for clarifying the length of the small string, Michael 😄
@swift-ci please benchmark |
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think one generic test method should be able to over all of our test cases. See inline comments.
|
||
private func setupLargeManagedString() { | ||
_ = largeManagedString | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that in the replaceSubrange(::with:)
you already do the var copy = getString(string)
, this whole dance here is unnecessary. You should declare this as simple let largeString = "coffee\u{301}coffeecoffeecoffeecoffee"
and drop all the setUpFunction
s.
), | ||
BenchmarkInfo( | ||
name: "Str.replaceSubrange.SmallLiteral.Substr", | ||
runFunction: { replaceSubrange($0, "coffee", with: getSubstring("t")) }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to put an optimization barrier (by calling the getSubstring
from TestUtils
) here.
Shorter way to get a Substring
is to get a full subrange like this: "t"[...]
.
For an implementation symmetry, I'd also extract the "coffee" into smallString
constant, so that our benchmark definitions would vary only like this:
runFunction: { replaceSubrange($0, largeString, with: "t") }
runFunction: { replaceSubrange($0, smallString, with: "t") }
runFunction: { replaceSubrange($0, largeString, with: "t"[...]) }
runFunction: { replaceSubrange($0, smallString, with: "t"[...]) }
benchmark/utils/TestsUtils.swift
Outdated
@@ -322,3 +322,11 @@ public func getString(_ s: String) -> String { return s } | |||
// The same for Substring. | |||
@inline(never) | |||
public func getSubstring(_ s: Substring) -> Substring { return s } | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to define any new optimization barrier functions...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This review comment got somehow lost… Let me try again.
} | ||
|
||
@inline(never) | ||
private func replaceSubrange(_ N: Int, _ string: String, with replacingSubstring: Substring) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand @milseman's intention from SR-8905 correctly, we are designing benchmark for the generic replaceSubstring
method that varies across the String
, Substring
, Array<Character>
and Repeat<Character>
specializations.
Therefore we should be able to define single shared generic test function and vary the parameter in runFunction
closure in BenchmarkInfo
declarations.
For an example of such benchmarks, see append
variants in DataBenchmarks
@milseman Any thoughts on keeping or dropping the @inline(never)
annotation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could refactor this to a generic @inline(__always)
implementation function that takes a scale factor (because the String version should be much faster than the Repeat version) and does the loop and replaceSubrange
calls. You'd likely want to make a @inline(never)
top-level function for each individual benchmark. It doesn't save you a whole lot.
@palimondo Thank you so much for another review, Pavol! I will address your comments and fix warnings and errors from the swift-ci benchmark check report 🔨 |
Several build failure errors are suddenly thrown when running
A lot of error messages are displayed on Terminal. Here is an example of a build failure error.
I'm not sure if it's related, but these errors started being thrown after Xcode 11 beta 3 was installed on my machine, FWIW. |
I'd guess you need to do a clean build after Xcode update. |
- Re-name benchmark names - Remove setupLargeManagedString() - Remove unnecessary optimization barrier functions - Update benchmark argument string by using smallString and largeString - Remove unnecessary optimization barrier function calls - Update replaceSubrange(_:_:with) to be generic function
@palimondo Thanks for the reply on the build issue, Pavol! I ended up re-building everything from scratch 😅 |
The benchmark with
|
@keitaito there are explicit pre-specializations for the others: https://github.com/apple/swift/blob/master/stdlib/public/core/StringRangeReplaceableCollection.swift#L197 Benchmarking a non-pre-specialized argument type helps identify and track changes in the fully-generic implementation. |
@swift-ci please benchmark |
I’m running the benchmarks on CI with your current multiplier (5000), to get a relation to your local results. |
Performance: -O
Code size: -OPerformance: -Osize
Code size: -OsizePerformance: -Onone
Code size: -swiftlibs
How to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
@palimondo if @keitaito fixes the workload size as recommended in the report, are we good to go?
|
@milseman If you don't foresee an order of magnitude improvement in the performance of Otherwise he should go with your recommendation from before to:
If I understand the point correctly, inlined function will allow the compiler to unroll the loop, which would not be the case if the loop multiplier was passed as parameter to non-inlined generic function? (I don't think we'd need to create per-case functions, as your next sentence said, because the inline closures play the same role and there is nowhere they can get inlined into… I think the ritualistic way we sprinkle In other words, I'd prefer to have same multiplier across all variants, but when variants differ several orders of magnitude, it is not possible to do, we should have properly tailored multiplier for each individual case ( |
Same multiplier sounds good to me if we can make all the benchmarks fit in our time windows |
+[Gardening] minor formatting (80 chars)
@swift-ci please benchmark |
@swift-ci please smoke test |
Performance: -O
Code size: -OPerformance: -Osize
Code size: -OsizePerformance: -Onone
Code size: -swiftlibs
How to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
@keitaito Thank you for your patience and your contribution! |
Woot! Thank you Pavol for updating the multiplier for me, and thank you for merging the PR! I really appreciate both Pavol and Michael that you patiently and carefully gave me reviews and feedback on my PR. Sorry for taking a long time to get it merged, but I'm happy and excited it's finally merged 🙂 I learned very interesting things under the hood of Swift standard library and String APIs through working on it. Thank you so much! |
I have a question for @milseman about pre-specializations.
Does this mean that if |
Another question: would you mind explaining what's the criteria for choosing
Given these, could inlining affect benchmark performance? Looking at benchmark source code files, it seems |
The ritualistic placement of For a more targeted use of For an example of technically correct So I don’t see a very good reason to use |
@keitaito If you’re interested in playing some more with this, I think it should be possible and even beneficial — to prevent further confusion — to remove most of the |
🔨 Changes
Add ReplaceSubrange benchmark. It is still work in progress, but it would be great if I could get feedback to make sure I'm on the right direction 🙇♂️
Resolves a part of SR-8905 Gaps in String benchmarking.
Shout-out to Rob and Michael for mentoring at try! Swift San Jose!
To-do list