Distributed Computation

Moderator: wwlytton

steve

Distributed Computation

Post by steve »

I was wondering what the best way would be to simulate a mesh network of 5000 layer 5 pyramidal neuron models. I assume the best way to do this is to distribute the computation over different machines. I have seen some older entries in the forum about this, but I was wondering what the best way is at the moment -- one post mentioned NEOSIM; is this the suggested way?

As a software engineering student, it would be good experience for me to analyse some of the possible solutions.

Thanks
Raj
Posts: 220
Joined: Thu Jun 09, 2005 1:09 pm
Location: Groningen, The Netherlands

Post by Raj »

I'll take the risk of referring you back to the old postings you mentioned. Parallel computing is available in the latest release, but at present there are only a few postings about it:

Topic including an invitation from Ted Carnevale to act as a guinea pig for testing parallel NEURON:
https://www.neuron.yale.edu/phpBB2/view ... l+parallel

Topic including links provided by Michael Hines to background information relevant to using parallel NEURON:
https://www.neuron.yale.edu/phpBB2/view ... l+parallel

From what I have picked up from the discussions and documentation on the new parallel implementation, my impression is that the MPI- and PVM-based parallelizations in NEURON are suitable for a larger class of problems than NEOSIM. I suggest you have a look at the links provided by Michael Hines in the second topic and form your own opinion.
ted
Site Admin
Posts: 6300
Joined: Wed May 18, 2005 4:50 pm
Location: Yale University School of Medicine

NEURON on parallel hardware

Post by ted »

First, a general comment regarding PVM and MPI: for some time now, MPI has
superseded PVM as the preferred framework in which to take advantage of parallel
hardware.

Now some specific remarks. It is fair to say that tremendous progress has been made
with regard to the use of NEURON on parallel hardware. It is now possible to parallelize
models of networks that involve any combination of spike-triggered synaptic
transmission, gap junctions, and even continuous synaptic transmitter release. Aside
from MPI, no additional software is required. Here are some excerpts from a recent
communication by Michael Hines on this topic.
The present standard distribution of NEURON, version 5.8.88, supports parallel
simulation of network models in which cells on different processors are coupled by
discrete logical spike events. See
http://www.neuron.yale.edu/neuron/stati ... lelNetwork
This works for the fixed step and variable step methods, including the local variable
time step method (Lytton, W. and Hines, M. Independent variable timestep integration
of individual neurons for network simulations. Neural Computation 17:903-921, 2005).

Tests using three previously published models obtained from ModelDB showed
superlinear speedup on an IBM Linux cluster using 128 CPUs (Hines, personal
communication; manuscript by Migliore et al. has been submitted for publication).
The bottom line is that you are almost certain to see superlinear speedup with your
simulations, as long as each machine's high-speed cache is much faster than its main
memory bandwidth and your problem is large enough that each machine is integrating
more than 100 or so equations.

Load balance will be extremely good with no effort on your part if the number of cells
of each type is a multiple of the number of CPUs used. If load balance (related to the
number of equations integrated on each CPU) becomes an issue, please be aware that
the biophysical cell and network specification is completely independent of the cell
distribution strategy chosen, and, even when random connectivity and spike stimulators
are used, idioms have been devised so that simulation results are quantitatively
identical (to double precision) regardless of the number of CPUs or the cell
distribution.
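
To make the spike-coupled case concrete, here is a minimal sketch using NEURON's
ParallelContext from Python. The Cell class, the ring connectivity, and all parameter
values are hypothetical placeholders rather than part of any particular model; the
ParallelContext methods used (set_gid2node, cell, gid_connect, set_maxstep, psolve)
belong to NEURON's parallel network interface.

Code:

from neuron import h
h.load_file("stdrun.hoc")

pc = h.ParallelContext()
rank, nhost = int(pc.id()), int(pc.nhost())


class Cell:
    """Toy single-compartment cell, for illustration only."""
    def __init__(self):
        self.soma = h.Section(name="soma")
        self.soma.insert("hh")
        self.syn = h.ExpSyn(self.soma(0.5))   # excitatory synapse

NCELL = 5000
cells = {}

# Round-robin distribution: each rank creates only the cells it owns.
for gid in range(rank, NCELL, nhost):
    cell = Cell()
    pc.set_gid2node(gid, rank)                # this gid lives on this rank
    nc = h.NetCon(cell.soma(0.5)._ref_v, None, sec=cell.soma)
    pc.cell(gid, nc)                          # register the cell's spike source
    cells[gid] = cell

# Connect each cell to a source gid that may live on any rank (toy ring here).
for gid, cell in cells.items():
    src_gid = (gid + 1) % NCELL
    nc = pc.gid_connect(src_gid, cell.syn)
    nc.weight[0] = 0.001                      # assumed synaptic weight (uS)
    nc.delay = 1.0                            # assumed delay (ms)

pc.set_maxstep(10)                            # upper bound on interprocessor NetCon delay (ms)
h.stdinit()
pc.psolve(100)                                # integrate 100 ms in parallel
pc.barrier()
pc.done()

Such a script would be launched under MPI, for example with something like
mpiexec -n 8 nrniv -mpi -python model.py, the exact command depending on your MPI
installation.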

I should also mention that our experience has shown it to be straightforward to specify
simulation setup in such a way that the setup time scales properly with the number of
CPUs. Generally, cell creation and cell connection algorithms only need to have their
outer loop modified so that the iteration is only over the cells that exist on "this" CPU.
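
For instance, a serial creation loop and its parallelized counterpart differ only in
the range iterated over (a sketch; create_cell is a hypothetical routine, and pc and
NCELL are as in the sketch above):

Code:

# Serial: every process would build all NCELL cells.
#   for gid in range(NCELL):
#       create_cell(gid)

# Parallel: each rank builds only the gids assigned to it (round-robin here),
# so setup time drops roughly in proportion to the number of CPUs.
for gid in range(int(pc.id()), NCELL, int(pc.nhost())):
    create_cell(gid)   # hypothetical cell creation routine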

. . .

I do recommend that, for optimum performance, parallel simulations be carried out in
batch mode and that only the spike activity be saved. This in no way prevents a focus
on state trajectories since, with the entire network's spike data, any subset (even one)
of the neurons can be re-simulated with the aid of the GUI to examine any variable as
a function of time. This process makes use of the PatternStim class, which provides as
input just those events that would have been generated by the rest of the network.
The results for the subnet are quantitatively identical to those of the full network simulation.
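
A rough sketch of that workflow, assuming the imports and ParallelContext from the
sketch above (file handling and the rebuilt subnet are omitted; spike_record and
PatternStim are the NEURON classes named in the text):

Code:

tstop = 100.0                       # assumed simulation length (ms)

# --- During the full parallel run: record (time, gid) of every spike on this rank.
tvec, idvec = h.Vector(), h.Vector()
pc.spike_record(-1, tvec, idvec)    # -1 => all gids owned by this rank
h.stdinit()
pc.psolve(tstop)
# ... write tvec/idvec out, e.g. one file per rank (not shown) ...

# --- Later, in a serial session: rebuild only the cells of interest with the same
# gid-based connectivity, then replay the recorded network spikes so the subnet
# receives exactly the events it received in the full simulation.
ps = h.PatternStim()
ps.play(tvec, idvec)                # event for gid idvec[i] delivered at time tvec[i]
h.stdinit()
h.run()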

Lastly, I should mention that the current alpha version of NEURON
http://www.neuron.yale.edu/ftp/neuron/versions/alpha/
(sources after 5.8.105) has extended the parallel network capabilities to simulate
interprocessor gap junctions and synapses where the post-synaptic state depends
continuously on the pre-synaptic voltage. Communication overhead is greatly increased
for such models, since voltages must be exchanged every time step. Gap junctions in
combination with discrete events can presently use only the fixed step method, but
this will soon be extended to the global variable step method.

Fixed step integration will be most appropriate for simulations in which the input
spike density is likely to be more than one spike per equation group per reasonable
time step. Nevertheless, if gap junctions are present, it will probably make sense to
minimize the communication overhead by proper partitioning and with a more optimal
MPI_Send/MPI_Receive organization than is currently provided by the general
message passing scheme.
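
For interprocessor gap junctions, the mechanism is an exchange ("transfer") of a source
variable, typically membrane potential, into a target variable of a point process that
may live on another rank, at every time step. Here is a minimal sketch of one half of
such a junction; HalfGap is a hypothetical NMODL point process whose current depends on
the POINTER variable vgap, and the source-variable ids and conductance are made-up
values:

Code:

from neuron import h
pc = h.ParallelContext()

soma = h.Section(name="soma")
soma.insert("hh")

gap = h.HalfGap(soma(0.5))          # hypothetical gap-junction point process
gap.g = 0.001                       # assumed coupling conductance (uS)

sid_here, sid_there = 1, 2          # user-chosen source-variable ids

# Advertise this rank's membrane potential under sid_here ...
pc.source_var(soma(0.5)._ref_v, sid_here, sec=soma)
# ... and request the variable advertised (possibly on another rank) under sid_there.
pc.target_var(gap._ref_vgap, sid_there)

pc.setup_transfer()                 # set up the exchange; call before initialization
h.finitialize(-65)
pc.psolve(50)                       # fixed step only when gap junctions are combined with events

The other half of the junction, on whichever rank owns the coupled cell, would make the
mirror-image calls with the two ids swapped.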
Documentation is still at an early stage, and tutorials have yet to be created. We
expect that early adopters of these new features of NEURON may require some
consultation when organizing and debugging their models, and, if gap junctions
are present, in achieving optimal MPI_Send/MPI_Receive organization.