1 job for multiple VM binds
Looking for some quick feedback before working on this more. The series is based on top of [1]; the last 20 patches are new. I suggest only briefly looking at the individual patches, as the end result is what is important.
Very high level pseudo code for the new VM bind flow follows. The key is that everything is now based on xe_vma_ops, which are created at the IOCTL level and passed down into the VM, PT, and MIGRATE layers to create 1 job (per tile) no matter how many VMA operations there are. If an error occurs at any time in the flow, everything is unwound all the way back to the IOCTL (the unwind is WIP, but the code is structured to do this). Rebinds (from exec, the preempt rebind worker, and page faults) all use a dummy xe_vma_ops to hook into the VM bind code.
vm_bind_ioctl()
	for each VM IOCTL operation
		create or parse into VMA operations
	while drm exec loop
		for each VMA operation
			lock and validate each VMA operation
		fence = xe_vma_ops_execute()
		install fence into VM dma-resv slots
		for each VMA operation
			install fence into external BO slots
	signal out fences
	return
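As a rough illustration of the IOCTL-level step above, the sketch below collects several operations into a single container before anything is executed, so the number of jobs depends only on the number of tiles. Every name here (sketch_vma_ops, sketch_vma_ops_add(), SKETCH_NUM_TILES, and so on) is hypothetical and made up for this example; it is a userspace toy under those assumptions, not the driver's xe_vma_ops code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_OPS   8	/* hypothetical cap, for this sketch only */
#define SKETCH_NUM_TILES 2	/* hypothetical tile count */

enum sketch_op_type { SKETCH_OP_MAP, SKETCH_OP_UNMAP, SKETCH_OP_REBIND };

/* Hypothetical stand-in for one VMA operation. */
struct sketch_vma_op {
	enum sketch_op_type type;
	unsigned long addr;
	unsigned long range;
};

/* Hypothetical stand-in for the container built at the IOCTL level. */
struct sketch_vma_ops {
	struct sketch_vma_op op[SKETCH_MAX_OPS];
	size_t count;
};

/* Append one operation; returns 0 on success, -1 if the list is full. */
static int sketch_vma_ops_add(struct sketch_vma_ops *vops,
			      enum sketch_op_type type,
			      unsigned long addr, unsigned long range)
{
	assert(vops != NULL);
	if (vops->count >= SKETCH_MAX_OPS)
		return -1;
	vops->op[vops->count++] = (struct sketch_vma_op){ type, addr, range };
	return 0;
}

/* Jobs needed for the whole list: one per tile if there is any work. */
static size_t sketch_vma_ops_job_count(const struct sketch_vma_ops *vops)
{
	return vops->count ? SKETCH_NUM_TILES : 0;
}
```

In this toy model a rebind path would build a list holding a single SKETCH_OP_REBIND entry and go through the same entry points, which is the role the dummy xe_vma_ops plays in the description above.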
xe_vma_ops_execute()
	for each tile
		setup PT arguments
	for each tile
		prepare PT operations
	for each tile
		fence = run PT operations in 1 job
	for each tile
		commit PT operations
	create composite fence
	return composite fence
error_cleanup:
	for each tile
		abort PT operations
	return err
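The staged prepare / run / commit flow with a single unwind label can be sketched in plain C as below. This is a minimal userspace model with hypothetical names (sketch_ops, prepare_pt_ops(), abort_pt_ops(), MAX_TILES), not the driver implementation: the "setup PT arguments" step is omitted, the composite fence is replaced by a simple job count, and the failure is injected through a test field.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TILES 2	/* hypothetical tile count for this sketch */

/* Hypothetical per-tile state; not the driver's real structures. */
struct tile_pt_state {
	bool prepared;
	bool committed;
	int jobs_run;
};

struct sketch_ops {
	struct tile_pt_state tile[MAX_TILES];
	int fail_prepare_on_tile;	/* -1: never fail, else tile index */
};

static int prepare_pt_ops(struct sketch_ops *ops, int id)
{
	if (ops->fail_prepare_on_tile == id)
		return -1;	/* stand-in for a real error code */
	ops->tile[id].prepared = true;
	return 0;
}

static void abort_pt_ops(struct sketch_ops *ops, int id)
{
	ops->tile[id].prepared = false;	/* safe to call on any tile */
}

/* Returns the number of jobs run (fence stand-in) or a negative error. */
static int sketch_ops_execute(struct sketch_ops *ops)
{
	int id, err, jobs = 0;

	for (id = 0; id < MAX_TILES; id++) {
		err = prepare_pt_ops(ops, id);
		if (err)
			goto error_cleanup;
	}
	for (id = 0; id < MAX_TILES; id++) {
		ops->tile[id].jobs_run++;	/* exactly 1 job per tile */
		jobs++;
	}
	for (id = 0; id < MAX_TILES; id++) {
		assert(ops->tile[id].prepared);
		ops->tile[id].committed = true;
	}
	return jobs;

error_cleanup:
	for (id = 0; id < MAX_TILES; id++)
		abort_pt_ops(ops, id);
	return err;
}
```

Because no job is run and nothing is committed until every tile has prepared successfully, a failure in the prepare stage leaves no tile with partially applied state in this model. Failures after that stage are not modelled.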
State of the code:
- Equivalent functionality to what is in place (the VM is just killed on an error after the 'prepare PT operations' step) but with 1 job per IOCTL
- Code structured so proper error handling can be implemented
Next steps:
- Code proper error handling after 'prepare PT operations' step
- Rework trace points
- Add prefetch suppression of unnecessary rebinds
- Add CPU bind path in run_job()
- Rebind worker gets queued more often than needed
- Multi-GT needs some work (fence install, corking of jobs, etc.)