Preprint available as multisplit.pdf
Complex models of individual neurons implemented in NEURON can be distributed over multiple processors to achieve almost linear speedup with the number of processors, up to a practical limit of about 16 processors. The same strategy can be used to balance load in network models when some cells are so large that their individual computation time is much longer than the average processor computation time, or when there are many more processors than cells.
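The load-balance argument can be illustrated with a small, self-contained sketch. It is not NEURON code and does not reproduce the method of the paper: the function names (`assign_greedy`, `split_large`, `imbalance`), the per-cell cost numbers, and the assumption that a cell can be cut into equal-cost pieces are all hypothetical. Real splitting is constrained by the tree topology of the cell and incurs communication overhead, both of which are ignored here. The `max_pieces=16` default simply mirrors the practical limit mentioned above.

```python
import heapq
import math


def assign_greedy(costs, nproc):
    """Place each piece, largest first, on the currently least-loaded processor."""
    heap = [(0.0, rank) for rank in range(nproc)]
    heapq.heapify(heap)
    loads = [0.0] * nproc
    for cost in sorted(costs, reverse=True):
        load, rank = heapq.heappop(heap)
        loads[rank] = load + cost
        heapq.heappush(heap, (loads[rank], rank))
    return loads


def split_large(costs, nproc, max_pieces=16):
    """Cut any cell costing more than the ideal per-processor share.

    Idealized assumption: a cell splits into equal-cost pieces.
    """
    target = sum(costs) / nproc
    pieces = []
    for cost in costs:
        n = min(max_pieces, max(1, math.ceil(cost / target)))
        pieces.extend([cost / n] * n)
    return pieces


def imbalance(loads):
    """Ratio of the busiest processor's load to the mean load (1.0 is ideal)."""
    return max(loads) / (sum(loads) / len(loads))


# Hypothetical network: one very large cell and 24 small ones on 8 processors.
costs = [40.0] + [1.0] * 24
whole = assign_greedy(costs, nproc=8)
split = assign_greedy(split_large(costs, nproc=8), nproc=8)
print("imbalance with whole cells:", imbalance(whole))
print("imbalance with split cells:", imbalance(split))
```

With whole cells, the processor holding the large cell determines the run time no matter how the small cells are arranged. Once that cell is divided, the greedy assignment can even out the load in this toy case.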
Fully implicit parallel simulation of single neurons
Title  Fully implicit parallel simulation of single neurons 
Publication Type  Journal Article 
Year of Publication  2008 
Authors  Hines, M. L., Markram, H., and Schürmann, F. 
Journal  Journal of Computational Neuroscience 
Volume  25 
Pagination  439–448 
Keywords  Computer modeling, Computer simulation, Load balance, Neuronal networks, Parallel simulation 
Abstract  When a multicompartment neuron is divided into subtrees such that no subtree has more than two connection points to other subtrees, the subtrees can be on different processors and the entire system remains amenable to direct Gaussian elimination with only a modest increase in complexity. Accuracy is the same as with standard Gaussian elimination on a single processor. It is often feasible to divide a 3D reconstructed neuron model onto a dozen or so processors and experience almost linear speedup. We have also used the method for purposes of load balance in network simulations when some cells are so large that their individual computation time is much longer than the average processor computation time or when there are many more processors than cells. The method is available in the standard distribution of the NEURON simulation program. 