Accessing ParallelContext() while in Python

General issues of interest both for network and
individual cell parallelization.

Moderator: hines


Accessing ParallelContext() while in Python

Post by agmccrei »

Hello,

I am fairly new to parallel processing and I am running into a somewhat odd problem while trying to access and re-use a previously used ParallelContext(). When I first use a ParallelContext in my hoc code to run a parameter search (i.e. simulate many iterations of my models) using bulletin-board style parallelization (the master/worker paradigm) on the Neuroscience Gateway (NSG), all of the output comes out fine and the parameter search takes much less time. However, these parameter searches generate a large number of files, so it would be best to analyze all of the model traces on NSG as well (using the eFEL module in Python); that way I can delete the model files once the analysis is done and download only summary vectors of select measurements instead of all of the traces generated in the parameter search. Also, because analysis of the dataset takes quite a long time in serial, I have decided to run it in parallel as well, with the same bulletin-board style of parallelization.

The problem is that, while the simulations run in parallel without issue, the analysis of the traces gets stuck and the jobs I submit to NSG fail to finish. Specifically, the behaviour depends on the number of processors that are available. In serial (1 processor available), the analysis completes without issue. When 2 processors are available, the 1st processor is assigned the 1st trace and gets stuck, while the 2nd processor runs the analysis of the rest of the traces with no issue (essentially in serial, since only one processor remains once the 1st is stuck). When 3 processors are available, the first two processors are assigned the first two traces and both get stuck, while the 3rd processor runs the analysis of the rest of the traces with no issue (again, essentially in serial). As I mentioned, these jobs fail to finish, since I never receive any output for traces submitted to any of the N-1 stuck processors. The tricky part is that I do not receive any errors when this happens (aside from having exceeded the runtime limit). I've recreated this scenario with a much simplified code:

init.py:

Code:

from neuron import h

h.load_file("SynParamSearch.hoc")                     # runs the hoc simulations (first bulletin-board session)
execfile("CutSpikes_HighConductanceMeasurements.py")  # runs the Python analysis
h.quit()
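As an aside, execfile is Python 2 only; if this were run under Python 3, the equivalent line would be:

Code:

exec(open("CutSpikes_HighConductanceMeasurements.py").read())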
SynParamSearch.hoc:

Code:

// This script is used to search the synaptic parameter space of the IS3 model by varying the number of excitatory and inhibitory synapses as well as their presynaptic spike rates

load_file("nrngui.hoc")

proc f() {
	count = $1
	print count
}


// Set up parallel bulletin-board context
objectvar pc
pc = new ParallelContext()
{pc.runworker()}
count = 25
// Set up parallel context
if (pc.nhost == 1){
	for l = 0, count-1 f(l)
}else{
	for l = 0, count-1 pc.submit("f",l)
	while (pc.working) { // gather results
	}
}

{pc.done()}
CutSpikes_HighConductanceMeasurements.py (apologies for all of the print statements in this piece of code; I have been using them to try to pinpoint the problem):

Code:

def getMeasures(TrIn):
	trace_index = int(TrIn)
	print('Trace Index = ' + str(trace_index))
	outputresults = [trace_index,trace_index*2,trace_index*3,trace_index*4,trace_index*5,trace_index*6]
	return outputresults

from neuron import h
import numpy
pc = h.pc          # re-use the ParallelContext created in SynParamSearch.hoc
pc.runworker()     # start a second bulletin-board session on that same context
Vec_count = 25
StdVolt = numpy.zeros((Vec_count,), dtype=numpy.float64)
MeanVolt = numpy.zeros((Vec_count,), dtype=numpy.float64)
MeanAPamp = numpy.zeros((Vec_count,), dtype=numpy.float64)
ISICV = numpy.zeros((Vec_count,), dtype=numpy.float64)
NumSpikes = numpy.zeros((Vec_count,), dtype=numpy.int)
# Set up parallel context
print 'Number of Hosts = ' + str(pc.nhost())
if pc.nhost() == 1:
	for l in range(0,Vec_count): 
		results = getMeasures(l)
		StdVolt[results[0]] = results[1]
		MeanVolt[results[0]] = results[2]
		NumSpikes[results[0]] = results[3]
		MeanAPamp[results[0]] = results[4]
		ISICV[results[0]] = results[5]
else:
	print 'Step 1: submit jobs'
	for l in range(0,Vec_count): 
		pc.submit(getMeasures,l)
		print 'Trace Index Submit = ' + str(l)
	print 'Step 2: working'
	while pc.working():  # working() blocks until a submitted job completes (0 when none remain)
		print 'Step 2A'
		print 'User ID = ' + str(pc.userid())
		results = pc.pyret()  # the Python return value of the completed getMeasures call
		print 'Results = ' + str(results)
		print 'Step 2B: Store Results'
		StdVolt[results[0]] = results[1]
		MeanVolt[results[0]] = results[2]
		NumSpikes[results[0]] = results[3]
		MeanAPamp[results[0]] = results[4]
		ISICV[results[0]] = results[5]
		print 'Step 2C: Results Storage Complete'
print 'Step 3: Done'
pc.done()
Here is the output from NSG when using 25 processors (and 25 pc.submit calls). Note that I get exactly the same outcome when I create a new ParallelContext instead of re-using the one that was used for the simulations:

Code:

22 
0 
4 
6 
10 
14 
18 
23 
2 
7 
11 
15 
19 
24 
1 
5 
8 
12 
16 
20 
3 
9 
13 
17 
21
Number of Hosts = 25.0
Step 1: submit jobs
Trace Index Submit = 0
Trace Index Submit = 1
Trace Index Submit = 2
Trace Index Submit = 3
Trace Index Submit = 4
Trace Index Submit = 5
Trace Index Submit = 6
Trace Index Submit = 7
Trace Index Submit = 8
Trace Index Submit = 9
Trace Index Submit = 10
Trace Index Submit = 11
Trace Index Submit = 12
Trace Index Submit = 13
Trace Index Submit = 14
Trace Index Submit = 15
Trace Index Submit = 16
Trace Index Submit = 17
Trace Index Submit = 18
Trace Index Submit = 19
Trace Index Submit = 20
Trace Index Submit = 21
Trace Index Submit = 22
Trace Index Submit = 23
Trace Index Submit = 24
Step 2: working
Trace Index = 24
Step 2A
User ID = 50.0
Results = [24, 48, 72, 96, 120, 144]
Step 2B: Store Results
Step 2C: Results Storage Complete
From the output I find that with 25 processors, only the 25th submission produces output, while the other 24 are stuck until the job hits its runtime maximum. I am not sure why this happens, but any help or advice would be most welcome. Alternatively, I am considering using the mpi4py module instead, but I am not yet sure whether it is appropriate for the task I am trying to perform.

Thanks for your time,

Alex GM

Re: Accessing ParallelContext() while in Python

Post by agmccrei »

One quick thing to add: in the simplified code, when I do not create the ParallelContext in the hoc file (i.e. I comment out execution of the "SynParamSearch.hoc" file in "init.py") and instead create it in the Python code with pc = h.ParallelContext(), the parallelization works fine. Similarly, creating the ParallelContext in the hoc file but commenting out all of its other ParallelContext calls (i.e. runworker(), submit() and done()) also leads to the parallelization working fine. So the problem likely has to do with starting the ParallelContext, stopping it, and then re-using it.
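For reference, here is what the working Python-only variant looks like (a minimal sketch of that case, with getMeasures stripped down to its essentials):

Code:

from neuron import h

def getMeasures(TrIn):
	trace_index = int(TrIn)
	return [trace_index, trace_index*2, trace_index*3,
	        trace_index*4, trace_index*5, trace_index*6]

pc = h.ParallelContext()   # created fresh in Python, not re-used from hoc
pc.runworker()             # workers enter the job loop; only the master continues
for l in range(25):
	pc.submit(getMeasures, l)
while pc.working():        # blocks until a submitted job completes
	results = pc.pyret()   # the Python return value of that job
	print results
pc.done()
h.quit()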

Alex GM

Re: Accessing ParallelContext() while in Python

Post by agmccrei »

I actually found what looks to be an (embarrassingly) simple solution to my initial query. Instead of running the ParallelContext twice for two separate problems, I can just call my hoc "simulation" function (i.e. h.f()) at the beginning of my Python "analysis" function (i.e. getMeasures()), so that both functions run within the same ParallelContext session.

i.e.:

Code:

def getMeasures(TrIn):
	trace_index = int(TrIn)
	h.f(trace_index)  # run the hoc "simulation" function within the same job
	print('Trace Index = ' + str(trace_index))
	outputresults = [trace_index,trace_index*2,trace_index*3,trace_index*4,trace_index*5,trace_index*6]
	return outputresults
So far, this seems to resolve the issue I was having.
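With this fix, the driving script needs only a single bulletin-board session. A minimal sketch of how the pieces would then fit together (assuming SynParamSearch.hoc is edited so that it only defines f() and pc, with its own runworker/submit/done calls removed):

Code:

from neuron import h
h.load_file("SynParamSearch.hoc")  # assumed to define f() and pc only

def getMeasures(TrIn):
	trace_index = int(TrIn)
	h.f(trace_index)  # run the hoc "simulation" step for this trace first
	return [trace_index, trace_index*2, trace_index*3,
	        trace_index*4, trace_index*5, trace_index*6]

pc = h.pc        # the single ParallelContext, used for one session only
pc.runworker()
for l in range(25):
	pc.submit(getMeasures, l)
while pc.working():
	results = pc.pyret()
	# ...store the results as before...
pc.done()
h.quit()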