Panfrost perf-counter block fixes
The number of counter blocks depends on the number of shader cores, so use the core count to work out how much space to allocate for counter values.
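
As a rough illustration of the sizing, here is a minimal sketch rather than the code in this patch. It assumes 64 32-bit counters per block and three fixed blocks (job manager, tiler, a single L2) ahead of one block per shader core. The names `perfcnt_buffer_size`, `COUNTERS_PER_BLOCK` and `FIXED_BLOCKS` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration, not taken from the patch. */
#define COUNTERS_PER_BLOCK 64 /* counters in each block */
#define FIXED_BLOCKS 3        /* job manager, tiler, single L2 */

/* Bytes needed to hold one full dump for a GPU with core_count shader cores. */
static size_t perfcnt_buffer_size(unsigned core_count)
{
    assert(core_count >= 1);

    /* One block per shader core on top of the fixed blocks. */
    size_t blocks = FIXED_BLOCKS + (size_t)core_count;

    return blocks * COUNTERS_PER_BLOCK * sizeof(uint32_t);
}
```

Under these assumptions a four-core GPU needs 7 blocks, or 1792 bytes.
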
The performance counter blocks appear in the dump in a different order from the list in kbase, so the block index needs to be corrected in get_counter.
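
The kind of fixup involved might look like the sketch below. Both orderings here are assumptions chosen for illustration (listed as job manager, tiler, shader, L2; dumped as job manager, tiler, L2, then one block per shader core), and `listed_block`, `dump_block_index` and `read_counter` are hypothetical names, not the functions touched by this patch.

```c
#include <assert.h>
#include <stdint.h>

#define COUNTERS_PER_BLOCK 64 /* assumed counters per block */

/* Block order as it appears in the counter list (assumed). */
enum listed_block {
    LISTED_JOB_MANAGER,
    LISTED_TILER,
    LISTED_SHADER,
    LISTED_L2,
};

/* Translate a listed block (and shader core number) to its position
 * in the dump buffer, using the assumed dump order. */
static unsigned dump_block_index(enum listed_block block, unsigned core)
{
    switch (block) {
    case LISTED_JOB_MANAGER: return 0;
    case LISTED_TILER:       return 1;
    case LISTED_L2:          return 2;
    case LISTED_SHADER:      return 3 + core; /* shader cores come last */
    }
    assert(!"unknown block");
    return 0;
}

/* Read one counter value out of a dump buffer. */
static uint32_t read_counter(const uint32_t *dump, enum listed_block block,
                             unsigned core, unsigned counter)
{
    assert(counter < COUNTERS_PER_BLOCK);
    return dump[dump_block_index(block, core) * COUNTERS_PER_BLOCK + counter];
}
```

Keeping the translation in one helper means the counter list can stay in its existing order while only the lookup changes.
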
This patch doesn't handle multiple L2 caches or "V4" GPUs (t600-t720), which use a different layout, as I don't have any such hardware to test on.