TGSI to GLSL conversion miscompiles some write-masks
We currently miscompile instructions whose write-mask has fewer than four components and doesn't simply use the first n components. For example:
MIN OUT[1].z, TEMP[0].xyxw, CONST[0].xyyw
becomes:
gl_FragDepth = float((min( temp0[0].xyxw , uintBitsToFloat(fsconst0[0].xyyw))));
instead of something like:
gl_FragDepth = (min( temp0[0].xyxw , uintBitsToFloat(fsconst0[0].xyyw))).z;
or even:
gl_FragDepth = min( temp0[0].x , uintBitsToFloat(fsconst0[0].y));
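The ".z" variant above amounts to turning the write-mask into a swizzle suffix that is applied to the full four-component result. A minimal sketch of that idea in C, assuming a hypothetical helper writemask_to_swizzle() and WRITEMASK_* constants that use the usual x=1, y=2, z=4, w=8 bit layout; neither name is taken from the actual code:

```c
#include <stddef.h>

/* Assumed bit layout for illustration: one bit per destination channel. */
enum { WRITEMASK_X = 1, WRITEMASK_Y = 2, WRITEMASK_Z = 4, WRITEMASK_W = 8 };

/* Hypothetical helper: writes "." followed by the enabled channels into buf.
 * buf must hold at least 6 bytes ('.', up to four channels, terminator). */
static void writemask_to_swizzle(unsigned writemask, char buf[6])
{
    static const char chan[4] = { 'x', 'y', 'z', 'w' };
    size_t n = 0;

    buf[n++] = '.';
    for (int i = 0; i < 4; i++) {
        if (writemask & (1u << i))
            buf[n++] = chan[i];
    }
    buf[n] = '\0';
}
```

With WRITEMASK_Z this yields ".z", the suffix used in the example above.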
The problem stems from the combination of two things:
1) We always narrow the result by calling the constructor of the type, which simply discards the unused components from the end (see get_destination_info(), where we compute dstconv).
2) We don't cull unused components from the source swizzle before the operation (see get_source_info()).
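To make the difference concrete, here is a rough sketch in C that compares the two ways of narrowing a source swizzle. The function names and the WRITEMASK_* constants are made up for illustration and are not the ones used in the real code; the bit layout x=1, y=2, z=4, w=8 is an assumption of the sketch.

```c
#include <string.h>

/* Assumed bit layout for illustration: one bit per destination channel. */
enum { WRITEMASK_X = 1, WRITEMASK_Y = 2, WRITEMASK_Z = 4, WRITEMASK_W = 8 };

/* Hypothetical: models what constructor-style narrowing effectively reads,
 * i.e. the first n source components, n being the number of enabled channels. */
static void first_n_components(const char *swizzle, unsigned writemask, char out[5])
{
    int n = 0;

    for (int i = 0; i < 4; i++) {
        if (writemask & (1u << i))
            n++;
    }
    memcpy(out, swizzle, (size_t)n);
    out[n] = '\0';
}

/* Hypothetical: keeps only the source components whose destination channel
 * is enabled in the write-mask. swizzle must have four components. */
static void masked_components(const char *swizzle, unsigned writemask, char out[5])
{
    int n = 0;

    for (int i = 0; i < 4; i++) {
        if (writemask & (1u << i))
            out[n++] = swizzle[i];
    }
    out[n] = '\0';
}
```

For the write-mask z and the source swizzle xyyw from the example, first_n_components() gives x while masked_components() gives y, which is exactly the mismatch between the generated and the expected GLSL.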
The problem could be fixed by changing either of these two. However:
- Fixing 1) is a massive undertaking; this would require changing the format-string of every opcode. That smells super bug-prone to me.
- Fixing 2) seems fragile; every format string would have to be vetted to see whether it's OK to change its inputs. For instance, TGSI_OPCODE_LIT depends on the input order, and would have to become much, much more complex than it is now.
So yeah, it seems a bit like we made an early design mistake here, and it has propagated into a lot of code. And it's pretty hard to fix.
I'd love to hear alternative ideas on how to fix this.