Parallel NEURON/H'ware - batch of individual runs
Posted: Wed Mar 03, 2010 3:44 pm
Hi...
First I'll describe the problem, or set of simulations, that I'm looking to run, and then I'll outline the approach I am considering... but I'm a little unsure whether it is the preferred or desirable approach. I'm largely new to parallel NEURON (and parallel computing), but have been using serial NEURON for about 15 months, and have a reasonable grounding in OO programming.
The field I'm looking at is the effects of noise and heterogeneity on the response of MVN neurons. Currently I'm just looking at individual neurons (we collect simulation data from X individual runs, using the same input and experimental conditions but different initial voltages and different random noise added to the input, and treat that as a population response). I have HOC code for running these simulations serially. My RUN.HOC loads some globals, an MVN neuron, and an analysis.HOC (for recording spike times), and adds a point process providing input (a sine wave current of a given frequency and amplitude). It then runs X simulations, one after the other. Each simulation uses analysis.HOC to record that run's spike times to a vector, which (after each run) is appended to a list of vectors. After all the simulations have run, the spike time list is written to a file for later analysis, so I end up with a .txt file containing X vectors of spike times. However, the number of individual runs I'm looking to perform is in the tens of thousands (across different experimental models, noise levels, input frequencies/amplitudes, etc.), so I'm looking to utilise a ~60-node UNIX cluster to reduce the time it takes to collect all the data.
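For concreteness, the serial batch loop described above can be sketched roughly as follows (NRUNS, spikelist, and the NetCon-based spike detection are illustrative stand-ins for my actual analysis.HOC, not the real code):

```hoc
// sketch of the serial batch: X runs, each run's spike times kept in a List
NRUNS = 500
objref spikelist, spikevec, nc, nil
spikelist = new List()
for i = 0, NRUNS-1 {
    spikevec = new Vector()
    soma nc = new NetCon(&v(0.5), nil)  // watch somatic voltage crossings
    nc.record(spikevec)                 // spike times accumulate during run()
    run()
    spikelist.append(spikevec.c())      // append a copy of this run's times
}
// afterwards, loop over spikelist and write each vector to the output file
```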
The approach I am considering is to use the cluster to run the above serial sequence, but on X hosts. That is, I would have a main.HOC which, given a number of processors (say 50), would make those processors/hosts run through a sequence similar to the RUN.HOC described above (each would create its own analysis.HOC and MVN neuron, attach the point process, and run Y simulations, storing each simulation's spike times in a list of vectors). After that, the main.HOC would collect each host's spike time list (each with Y vectors, one per run) and append them all together to give a list of X*Y vectors, which would be written to a file. My population size is currently 500, so I would hope to use 50 hosts, each running 10 simulations (one after the other).
Is this a worthwhile approach to take? Is it even possible? From what I've read on parallel NEURON I assume it is: by templating my MVN.HOC, analysis.HOC, and RUN.HOC (anything of which multiple instances will be created), and having the main.HOC create a new RUN.HOC instance on each host (which then creates its own MVN.HOC and analysis.HOC), each instance will be unique to that host. For example, the 'spikelist' list (holding the vectors of spike times for a given host's runs) will be unique to that host (other hosts won't write spike times to it), but can be collected/addressed by the main.HOC (using each host's ID or similar).
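On feasibility: an embarrassingly parallel batch like this is what ParallelContext's bulletin-board style (submit/working) is designed for, so rather than addressing hosts by ID myself, I gather the master could farm out runs and collect posted results as sketched below (onerun(), the runid key scheme, and the spikevec/spikelist names are my assumptions, not tested code):

```hoc
objref pc
pc = new ParallelContext()

func onerun() { local runid
    runid = $1
    // set seeds from runid, build/reset the cell, run(), fill spikevec...
    pc.post(runid, spikevec)   // post this run's spike times under key runid
    return runid
}

pc.runworker()                 // workers stop here and wait for tasks
for i = 0, NRUNS-1 {
    pc.submit("onerun", i)     // master farms out one task per run
}
while (pc.working()) {         // returns as each submitted task completes
    runid = pc.retval()
    pc.take(runid)             // fetch the message posted under this key
    spikelist.append(pc.upkvec())  // unpack the Vector of spike times
}
pc.done()
```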
Would there be any difficulties with this approach (if it is even possible)? I use 3 or 4 random streams, and am aware that each will have to be unique for a given host (with each host given a unique integer for seeding, or for producing a seed for, its streams). If any further information is required, I'd be happy to supply it.
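On those random streams: one common pattern is to derive every stream's seed deterministically from the unique run (or host) integer, e.g. with Random.MCellRan4, where distinct highindex values give independent streams. A minimal sketch, assuming a runid variable and the number of streams per run:

```hoc
objref rnoise, rvinit
runid = 7          // e.g. the unique integer handed to this host/run
NSTREAMS = 4       // however many independent streams each run uses

rnoise = new Random()
rnoise.MCellRan4(runid*NSTREAMS + 1)  // stream 1 for this run
rnoise.normal(0, 1)                   // e.g. the injected noise

rvinit = new Random()
rvinit.MCellRan4(runid*NSTREAMS + 2)  // stream 2 for this run
rvinit.uniform(-70, -60)              // e.g. the random initial voltage
```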
Thanks in advance for any help or advice. My apologies if this is an obvious question.
James