It would be useful to document how the nclone=X MPI option interacts with the Loops you set up in the input file. That is, does the total number of samples increase as you increase X, or does it stay the same and simply get spread across more parallel tasks, each of which does less work?