arm64: Refactor mov/movprfx for unmasked operations#123717
arm64: Refactor mov/movprfx for unmasked operations#123717ylpoonlg wants to merge 7 commits intodotnet:mainfrom
Conversation
* Move MOVPRFX logic from codegen to emit.
|
Tagging subscribers to this area: @JulieLeeMSFT, @dotnet/jit-contrib |
|
SPMI asmdiffs: G_M39772_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
ldr q16, [fp, #0x20] // [V00 arg0]
ldr w0, [fp, #0x1C] // [V01 arg1]
- mov v0.16b, v16.16b
+ movprfx z0, z16
insr z0.s, w0
- ;; size=16 bbWeight=1 PerfScore 8.50
+ ;; size=16 bbWeight=1 PerfScore 10.00ASIMD |
In an ideal world, the perfscore would detect instruction fusing (but let's not try to fix that here - perfscore needs fixing regardless). So, agreed, your patch is better. |
|
@dhartglassMSFT, PTAL. |
| return true; | ||
| } | ||
|
|
||
| if (ins == INS_sve_movprfx) |
There was a problem hiding this comment.
is there a reason that movprfx between the same register is not redundant, or was this more of a pre-emptive/safety thing?
Either way, can you add a comment about why this case is needed
Also if you're up for it, the comment on line 16909 mentions "3 cases" but lists four :D
There was a problem hiding this comment.
is there a reason that movprfx between the same register is not redundant, or was this more of a pre-emptive/safety thing?
I don't think currently there is any instance of movprfx being called with canSkip = false, so it was just added preemptively. It might be useful for codegen testing, or maybe @a74nh had something else in mind?
There was a problem hiding this comment.
The code below this are some specific optimisations that are trying to merge the current mov with he previous instruction. It would need some reworking to also work for movprfx.
However - movprfx is already an optimised mov, the microarch will merge it into the instruction following it. So, we probably don't want/need to try to optimise away the movprfx.
So, happy with this as is.
(Although the comment should still be updated)
| instruction ins, emitAttr attr, regNumber dstReg, regNumber srcReg, bool canSkip, insOpts opt /* = INS_OPTS_NONE */) | ||
| { | ||
| emitAttr size = EA_SIZE(attr); | ||
|
|
There was a problem hiding this comment.
assert(IsMovInstruction(ins)); similar to the one in emitIns_Mov would make sense here too
|
Hi @ylpoonlg I added a blurb in the PR description, I didn't find the description clear in 115508. Thanks Alan for filling me in on that. Hopefully will help out anyone looking in this area in the future. Please feel free to re-word it too, if you dont like the wording or if I got a detail wrong. |
| for (helper.EmitBegin(); !helper.Done(); helper.EmitCaseEnd()) | ||
| { | ||
| GetEmitter()->emitIns_R_R(INS_sve_movprfx, EA_SCALABLE, reg1, reg3); | ||
| GetEmitter()->emitInsSve_R_R(INS_sve_movprfx, EA_SCALABLE, reg1, reg3); |
There was a problem hiding this comment.
For my education, going after these would be part of the remaining work for 115508 correct?
There was a problem hiding this comment.
Yes, they will be removed in the next PR. This particular instance is just to avoid hitting
if (IsMovInstruction(ins))
{
assert(!"Please use emitIns_Mov() to correctly handle move elision");
emitIns_Mov(ins, attr, reg1, reg2, /* canSkip */ false, opt);
}
in emitIns_R_R, since we now consider INS_sve_movprfx a mov instruction.
Maybe we could do a similar thing in emitInsSve_R_R, but that would mean moving the whole emit logic for INS_sve_movprfx/INS_sve_mov into emitInsSve_Mov.
|
Change lgtm, thanks for the refactor I can merge once Alan signs off. |
This PR is the first of a few contributing to #115508.
Motivation was that the jit doesn't recognize read-modify-write instructions (
Addwith two operands for example) until later after lsra. Jit needs to prefix such instructions withmov/movprfxwhere it can't encode the dst operand. Previously, this logic was sprinkled around code-generation. This PR moves this logic to create thesemovs in a unified location, during emithwintrinsiccodegenarm64.cpp.AddCarryWideningEven/Oddinto the emit function.cc @dotnet/arm64-contrib @a74nh