AOT compilation of tess, part 1
Our tessellator kernels are large, which means compiling them takes a nontrivial amount of time. Yet that compilation depends on dynamic state and happens at draw time. Yikes!
The medium-term goal is to compile them all the way to G13 assembly at Mesa build time. This MR doesn't do that. But it does shave all sorts of yaks I hit while trying to get there: in particular, fixes for a whole bunch of scary GL driver bugs, some driver optimizations, and so on. It also simplifies the tessellator so we can reduce the number of variants that we'll need to compile (from 81 to 6).