Efficient way of debugging Python scripts with MPI library

Posted: Tue Jan 19, 2021 11:55 pm
by itaru
Hi,

If someone knows an efficient way of debugging a Python script (with MPI library enabled) on a parallel system, let me know.

Thanks,
Itaru.

Re: Efficient way of debugging Python scripts with MPI library

Posted: Wed Jan 20, 2021 9:24 am
by ted
General suggestions that you probably already know:
1. Do as much development/debugging as possible on a multicore desktop machine. Defer executing code on specialized parallel hardware until absolutely necessary.
2. Write code that will produce the same results regardless of the number of hosts on which it is executed. If the code is a model specification, implement it so that it will produce the same results regardless of which cells are being simulated by any given host. If you are dealing with a network model, include a procedure that reports the model's connectivity matrix (including connection weights and delays) after model setup is complete.
3. If the script involves stochasticity, make sure that you have explicit control of the pseudorandom number generator (RNG). In particular, the RNG's "seed" (or "sequence parameters") should be explicitly specified by you, not taken from the date, time of day, number of runs that have been executed, etc. Verify that each host is using the correct seed/sequence parameters.
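
Suggestions 2 and 3 can be illustrated with a minimal sketch in plain Python. It does not use NEURON or MPI: the ranks are simulated in a single process, and the names BASE_SEED, NCELL, gids_on_rank, connections_for and connectivity_report are hypothetical, as is the round-robin distribution of cells. The idea is that each cell's random stream is keyed by the cell's gid alone, so the connectivity report should come out identical for any number of hosts.

```python
import random

BASE_SEED = 12345  # hypothetical: chosen explicitly, never taken from the clock
NCELL = 8          # hypothetical network size


def gids_on_rank(rank, nhost, ncell=NCELL):
    """Round-robin distribution (an assumption): gid g lives on rank g % nhost."""
    return list(range(rank, ncell, nhost))


def connections_for(gid, ncell=NCELL, nconn=2):
    """Connections onto `gid`, drawn from an RNG whose seed depends only on the gid."""
    rng = random.Random(BASE_SEED * 1000003 + gid)
    sources = [g for g in range(ncell) if g != gid]
    conns = []
    for src in sorted(rng.sample(sources, nconn)):
        weight = round(rng.uniform(0.001, 0.01), 6)
        delay = round(rng.uniform(1.0, 5.0), 3)
        conns.append((src, gid, weight, delay))
    return conns


def connectivity_report(nhost):
    """Simulate `nhost` ranks in one process, then merge rows sorted by (target, source)."""
    rows = []
    for rank in range(nhost):
        for gid in gids_on_rank(rank, nhost):
            rows.extend(connections_for(gid))
    return sorted(rows, key=lambda row: (row[1], row[0]))


if __name__ == "__main__":
    reference = connectivity_report(1)
    for nhost in (2, 3, 4):
        # The report must not depend on how the cells are spread over hosts.
        assert connectivity_report(nhost) == reference
    for src, tgt, weight, delay in reference[:4]:
        print(f"{src} -> {tgt}  weight={weight}  delay={delay}")
```

In a real parallel run each rank would build only its own rows and the merged report would be gathered on one rank; comparing that report between a 1-host run and an N-host run is a quick way to catch setup code that depends on the number of hosts.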

Re: Efficient way of debugging Python scripts with MPI library

Posted: Wed Jan 20, 2021 10:50 pm
by itaru
Hi Ted,
I will take all of your suggestions. I kept adding exit() calls to find where the MPI library starts failing, but it took a while to understand that I was accessing a cell that is not associated with the rank it was made; I was wishing for some handy tool like a debugger.

Itaru.
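
As one possible alternative to scattering exit() calls, the kind of mistake described above (touching a cell that is not on the current rank) can be made to fail immediately with a message that names the rank. The following is a minimal sketch in plain Python, not NEURON's API: RankLocalCells and checkpoint are hypothetical helpers, and rank and nhost are passed in by hand here, whereas in a real run they would come from the MPI layer. NEURON's ParallelContext offers gid_exists(gid) for the same kind of ownership test.

```python
import sys


def checkpoint(rank, message):
    """Print a rank-tagged progress line and flush, so output survives a later crash."""
    print(f"[rank {rank}] {message}", file=sys.stderr, flush=True)


class RankLocalCells:
    """Hypothetical stand-in for the table of cells that exist on one rank."""

    def __init__(self, rank, nhost):
        self.rank = rank
        self.nhost = nhost
        self._cells = {}

    def register(self, gid, cell):
        self._cells[gid] = cell

    def owns(self, gid):
        return gid in self._cells

    def get(self, gid):
        """Return the local cell, or fail loudly if the gid is not on this rank."""
        if not self.owns(gid):
            raise LookupError(
                f"rank {self.rank}/{self.nhost}: gid {gid} is not on this rank "
                f"(local gids: {sorted(self._cells)})"
            )
        return self._cells[gid]


if __name__ == "__main__":
    cells = RankLocalCells(rank=1, nhost=2)  # pretend to be rank 1 of 2
    for gid in (1, 3):
        cells.register(gid, f"cell{gid}")
    checkpoint(cells.rank, "setup complete")
    try:
        cells.get(2)  # gid 2 was never registered on this rank
    except LookupError as err:
        checkpoint(cells.rank, f"caught: {err}")
```

Because the error names both the rank and the gids that actually are local, the offending access can usually be found from a single failed run instead of by bisecting with exit().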

Re: Efficient way of debugging Python scripts with MPI library

Posted: Thu Jan 21, 2021 10:05 am
by ted
My suggestions pertain to development and debugging of NEURON models. It looks like your question is centered on MPI itself or NEURON's internals.

Re: Efficient way of debugging Python scripts with MPI library

Posted: Thu Jan 21, 2021 10:28 am
by hines
"I was accessing a cell that is not associated with the rank it was made"
This wording puzzles me, as I don't know of any way for a cell not to be associated with the rank it was made on (although if you are using multisplit, there may be parts of a cell that are on ranks different from the rank associated with the gid of the cell's spike output location).