Executorctl: poor performance when packing the job folder
Packing the job folder can take a long time in the infrastructure, and we see 100% CPU utilization while it is running.
It seems that we currently not only generate a tarball of the job folder but also compress it. Additionally, Python's default compression level is 9, while the tar command (via gzip) defaults to 6:
For modes 'w:gz', 'r:gz', 'w:bz2', 'r:bz2', 'x:gz', 'x:bz2', tarfile.open() accepts the keyword argument
compresslevel (default 9) to specify the compression level of the file.
Source: https://docs.python.org/3/library/tarfile.html
I would suggest experimenting with compression levels between 1 and 6. I have a feeling 4 is what we should go for, as most jobs are local and never bandwidth-limited the way they would be if users submitted them over the network.
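To make that experiment concrete, here is a minimal sketch of how the levels could be compared. It is not the executorctl code: `pack_job_folder` and `benchmark_levels` are hypothetical helper names, and the archive is written to memory so that disk speed does not skew the timings.

```python
import io
import tarfile
import time


def pack_job_folder(folder, compresslevel):
    """Pack `folder` into an in-memory gzip tarball.

    Returns (elapsed seconds, archive size in bytes).
    Hypothetical helper, not the actual executorctl implementation.
    """
    buffer = io.BytesIO()
    start = time.perf_counter()
    with tarfile.open(fileobj=buffer, mode="w:gz", compresslevel=compresslevel) as tar:
        tar.add(folder, arcname=".")
    elapsed = time.perf_counter() - start
    return elapsed, len(buffer.getvalue())


def benchmark_levels(folder, levels=(1, 2, 3, 4, 5, 6, 9)):
    """Return a list of (level, seconds, bytes) for each compression level."""
    results = []
    for level in levels:
        elapsed, size = pack_job_folder(folder, level)
        results.append((level, elapsed, size))
    return results


if __name__ == "__main__":
    import sys

    # Usage: python bench.py /path/to/job/folder
    for level, elapsed, size in benchmark_levels(sys.argv[1]):
        print(f"level={level} time={elapsed:.2f}s size={size} bytes")
```

Running this against a representative job folder should show how much time each level costs relative to the size it saves; level 9 is included only as the current baseline.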
Alternatively, we may want to detect whether we are running locally and, if so, skip compression altogether. Otherwise, we could default to compression level 6.
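A rough sketch of that alternative follows, assuming the caller already knows whether the job runs locally. The `is_local` flag and both function names are hypothetical; how executorctl would actually detect a local run is left open here.

```python
import tarfile


def tar_mode_and_options(is_local, remote_compresslevel=6):
    """Pick the tarfile mode and keyword arguments for packing.

    `is_local` is a hypothetical flag supplied by the caller.
    """
    if is_local:
        # Plain tar, no compression: nothing to gain when no network is involved.
        return "w", {}
    # gzip-compressed tar with an explicit level instead of Python's default of 9.
    return "w:gz", {"compresslevel": remote_compresslevel}


def pack(folder, dest, is_local):
    """Write `folder` to the archive at `dest` and return `dest`."""
    mode, options = tar_mode_and_options(is_local)
    with tarfile.open(dest, mode=mode, **options) as tar:
        tar.add(folder, arcname=".")
    return dest
```

The `compresslevel` argument is passed only in the gzip case, since the plain `"w"` mode does not accept it. On the receiving side, opening the archive with mode `"r:*"` lets `tarfile` detect transparently whether it is compressed.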