Rust should pass vectors by vector register #93490
Comments
This doesn't make sense for integers. Yes, you've picked out a very lucky example where this allows vectorization, but generally this just means that data will have to be moved back and forth between vector and GPR registers, for the majority case where no vectorization is possible. Passing these by pointer (as in 1.47) is the right thing to do. |
I actually tried this, but it is complicated because these registers are influenced by |
That makes sense, but the exact same happens with floats. It is beneficial for floats to use vector registers, isn't it?
struct Foo
{
float bar1;
float bar2;
float bar3;
float bar4;
};
Foo sum_cpp(Foo foo1, Foo foo2)
{
Foo foo3;
foo3.bar1 = foo1.bar1 + foo2.bar1;
foo3.bar2 = foo1.bar2 + foo2.bar2;
foo3.bar3 = foo1.bar3 + foo2.bar3;
foo3.bar4 = foo1.bar4 + foo2.bar4;
return foo3;
}
Gets turned into:
sum_cpp(Foo, Foo): # @sum_cpp(Foo, Foo)
addps xmm0, xmm2
addps xmm1, xmm3
ret |
It seems like that issue is discussed in: #79865. It also seems like that issue will be fixed by upgrading to LLVM 14: #79865 (comment) |
This is because clang generates a more optimized layout |
@Miksel12 I've opened #93564 to fix the general issue related to the aggregation of types, and I managed to also fix this issue. With my PR, your example code would now be compiled to:
sum_rust:
addps xmm0, xmm1
ret
Even better than clang! EDIT: It's no longer the case, due to the abi + target_features unsoundness. |
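The Rust source behind `sum_rust` is not reproduced in this thread. As a minimal sketch, assuming it mirrors the C++ `Foo` example above (four `f32` fields added element-wise), it could look roughly like this. The field names are borrowed from the C++ snippet, and the derives are only there for convenience:

```rust
// Hypothetical Rust counterpart of the C++ `Foo` struct shown earlier in
// the thread. The reporter's actual definition is not shown, so the
// field names and derives here are assumptions.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Foo {
    pub bar1: f32,
    pub bar2: f32,
    pub bar3: f32,
    pub bar4: f32,
}

// Element-wise sum of two `Foo` values, mirroring `sum_cpp`.
// Both arguments and the return value are passed by value at the source
// level; how they are lowered (stack, integer registers, or vector
// registers) is decided by the compiler and is what this issue is about.
pub fn sum_rust(foo1: Foo, foo2: Foo) -> Foo {
    Foo {
        bar1: foo1.bar1 + foo2.bar1,
        bar2: foo1.bar2 + foo2.bar2,
        bar3: foo1.bar3 + foo2.bar3,
        bar4: foo1.bar4 + foo2.bar4,
    }
}
```

Compiling a function like this with optimizations and inspecting the assembly is one way to see which calling convention a given compiler version picks.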
Note that if you add So arguably this is more about " |
It doesn't. It passes them by reference both when using
define void @_ZN10playground8sum_rust17hf4e374897bed05a9E(<4 x float>* noalias nocapture sret(<4 x float>) dereferenceable(16) %0, <4 x float>* noalias nocapture readonly align 16 dereferenceable(16) %a, <4 x float>* noalias nocapture readonly align 16 dereferenceable(16) %b) unnamed_addr #0 {
start:
%1 = load <4 x float>, <4 x float>* %a, align 16
%2 = load <4 x float>, <4 x float>* %b, align 16
%3 = fadd <4 x float> %1, %2
store <4 x float> %3, <4 x float>* %0, align 16
ret void
} |
Rust currently doesn't pass vectors of floats by vector register.
This should be able to be passed by vector registers:
But in Rust 1.47 it uses the stack:
Post 1.47, it is packed into integer registers (see this issue: #85265):
This issue should be fixed by #93405, which should restore the pre-1.48 behavior.
But ideally it should be optimized to:
@dotdash mentions in #85265 that this is due to Rust not using the proper types on the LLVM IR level: #85265 (comment)
EDIT:
Clang is able to use this optimization in a similar case:
Gets turned into: