TGSI to GLSL conversion miscompiles some write-masks

We currently miscompile instructions whose write-mask has fewer than four components and doesn't simply select the first n components. For example:

MIN OUT[1].z, TEMP[0].xyxw, CONST[0].xyyw

becomes:

gl_FragDepth = float((min( temp0[0].xyxw , uintBitsToFloat(fsconst0[0].xyyw))));

instead of something like:

gl_FragDepth = (min( temp0[0].xyxw , uintBitsToFloat(fsconst0[0].xyyw))).z;

or even:

gl_FragDepth = min( temp0[0].x , uintBitsToFloat(fsconst0[0].y));
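
To make the first alternative concrete, here is a rough sketch of how a result swizzle such as ".z" could be derived from a write-mask and appended to the expression, instead of narrowing through a type constructor. The MASK_* names and writemask_to_swizzle() are hypothetical and exist only for this illustration; they are not taken from the converter. The only assumption is the usual X=1, Y=2, Z=4, W=8 bit layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mask bits for this sketch: X=1, Y=2, Z=4, W=8. */
enum { MASK_X = 1, MASK_Y = 2, MASK_Z = 4, MASK_W = 8 };

/*
 * Build the GLSL swizzle suffix that selects exactly the components
 * enabled in `writemask`. `out` must hold at least 6 bytes
 * ('.', up to four letters, NUL). A full or empty mask produces an
 * empty string, because no swizzle is needed in those cases.
 * Returns the number of enabled components.
 */
static int writemask_to_swizzle(unsigned writemask, char out[6])
{
    static const char names[4] = { 'x', 'y', 'z', 'w' };
    int count = 0;
    int pos = 0;
    int partial;

    assert(out != NULL);

    writemask &= 0xfu;
    partial = (writemask != 0xfu);

    if (partial && writemask != 0)
        out[pos++] = '.';

    for (int i = 0; i < 4; i++) {
        if (writemask & (1u << i)) {
            if (partial)
                out[pos++] = names[i];
            count++;
        }
    }
    out[pos] = '\0';
    return count;
}
```

For the MIN example above, writemask_to_swizzle(MASK_Z, buf) leaves ".z" in buf, which is the suffix the second GLSL line uses. A real fix would still have to make sure the operation is evaluated at full width before the suffix is applied.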

The problem stems from the combination of two things:

1. We always narrow the result by calling the constructor of the destination type, which simply discards the unused components from the end (see get_destination_info(), where we compute dstconv).
2. We don't cull unused components from the source swizzle before the operation (see get_source_info()).
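
As an illustration of the second point, culling a source swizzle against the write-mask could look roughly like the sketch below. The WM_* names and cull_source_swizzle() are hypothetical, made up for this example, and do not reflect how get_source_info() is actually structured. It assumes swizzles are four letters and the mask uses the usual X=1, Y=2, Z=4, W=8 bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mask bits for this sketch: X=1, Y=2, Z=4, W=8. */
enum { WM_X = 1, WM_Y = 2, WM_Z = 4, WM_W = 8 };

/*
 * Keep only the source-swizzle letters whose position is enabled in
 * `writemask`. `swizzle` holds four letters (e.g. "xyxw"); `out` must
 * hold at least 5 bytes. Returns the number of letters kept.
 */
static int cull_source_swizzle(const char swizzle[4], unsigned writemask,
                               char out[5])
{
    int count = 0;

    assert(swizzle != NULL && out != NULL);

    for (int i = 0; i < 4; i++) {
        if (writemask & (1u << i))
            out[count++] = swizzle[i];
    }
    out[count] = '\0';
    return count;
}
```

With the mask WM_Z, "xyxw" is reduced to "x" and "xyyw" to "y", which matches the scalar operands in the last GLSL variant above. This only works for opcodes that operate component-wise; anything that reads specific input components regardless of the write-mask would need separate handling.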

The problem could be fixed by changing either of these two. However:

Fixing 1) is a massive undertaking; this would require changing the format-string of every opcode. That smells super bug-prone to me.
Fixing 2) seems fragile; every format-string would have to be vetted to see whether it's OK to change its inputs. For instance, TGSI_OPCODE_LIT depends on the input order, and would have to become much, much more complex than it is now.

So yeah, it seems a bit like we made an early design mistake here, and it has propagated into a lot of code. And it's pretty hard to fix.

I'd love to hear alternative ideas on how to fix this.
