Cluster computation • BATSS

To speed up the computation, the batss.glm function can use parallelisation on a single machine when computation = "parallel" (which is the default).

When using a cluster, parallelisation is best achieved by letting the cluster workload manager (typically Slurm on clusters running Linux) split the set of seeds, each of which corresponds to one simulated trial, between cluster nodes and CPUs.

Suppose a BATSS user wants to run a Monte Carlo simulation of 10’000 trials and has 500 CPUs available. The strategy we suggest consists of

running batss.glm on each CPU with a subset of the 10’000 seeds of interest specified in argument R, so that each CPU evaluates a different set of seeds,
saving each of the 500 batss.glm outputs as an RData file with the function save, under a distinct name (for example, one based on the first or the last seed evaluated by that CPU),
finally, using the function batss.combine to combine these outputs.
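
The seed-splitting step above can be sketched in base R as follows. This is only an illustration under stated assumptions: it supposes the job is submitted as a Slurm job array with 500 tasks (Slurm then sets the environment variable SLURM_ARRAY_TASK_ID for each task), that seeds are simply the integers 1 to 10000, and that contiguous blocks of 20 seeds per task are acceptable. The names n_seeds, n_tasks, my_seeds and out_file are hypothetical, and the batss.glm call is left as a comment because its design arguments depend on the trial being simulated.

```r
# Hypothetical setup: 10000 seeds spread evenly over 500 Slurm array tasks.
n_seeds <- 10000
n_tasks <- 500

# Slurm sets SLURM_ARRAY_TASK_ID in array jobs; fall back to 1 when run locally.
task_id <- as.integer(Sys.getenv("SLURM_ARRAY_TASK_ID", unset = "1"))

# Assign each seed to one task, in contiguous blocks of equal size.
seeds <- seq_len(n_seeds)
chunk <- rep(seq_len(n_tasks), each = n_seeds / n_tasks)
seeds_by_task <- split(seeds, chunk)

# Seeds handled by the current task only.
my_seeds <- seeds_by_task[[task_id]]

# Distinct output file name, built from the first seed of this task.
out_file <- sprintf("batss_seed_%05d.RData", my_seeds[1])

# On the cluster, each task would then do something along these lines
# (trial design arguments elided, to be filled in by the user):
# sim <- batss.glm(<trial design arguments>, R = my_seeds)
# save(sim, file = out_file)
```

Naming each file after the first seed of its block keeps the 500 outputs from overwriting one another and makes it easy to see which block a missing or failed file corresponds to.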

The next section shows examples of how to use the function batss.combine.