Hi all
Has anybody managed to compile Neuron with MPI for Myrinet and/or LAM-MPI?
The following code snippet from src/nrnmpi/nrnmpi.c seems to restrict usage to MPICH with the ch_p4 (Ethernet) device:
#if !ALWAYS_CALL_MPI_INIT
/* this is not good. depends on mpirun adding at least one
arg that starts with -p4 but that probably is dependent
on mpich and the use of the ch_p4 device. We are trying to
work around the problem that MPI_Init may change the working
directory and so when not invoked under mpirun we would like to
NOT call MPI_Init.
*/
{
    int i, b;
    b = 0;
    for (i=0; i < *pargc; ++i) {
        if (strncmp("-p4", (*pargv)[i], 3) == 0) {
            b = 1;
            break;
        }
    }
    if (!b) {
        nrnmpi_use = 0;
        return;
    }
}
#endif
Does somebody know a workaround?
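One possible direction, sketched here only as an illustration: keep the -p4 argument check but also look for environment variables that MPI launchers export to each rank. The function names launched_under_mpi and any_env_set are hypothetical and not part of the NEURON sources, and the variable names in the list are assumptions that should be checked against the MPI installation in use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Environment variables that some MPI launchers are assumed to export.
   The whole list is an assumption for illustration, not taken from NEURON. */
static const char *const launcher_env_vars[] = {
    "LAMRANK",               /* LAM/MPI (assumed) */
    "PMI_RANK",              /* MPICH2-style process managers (assumed) */
    "OMPI_COMM_WORLD_RANK",  /* Open MPI (assumed) */
    "GMPI_ID",               /* MPICH-GM on Myrinet (assumed) */
    NULL
};

/* Return 1 if any variable in the NULL-terminated list is set. */
static int any_env_set(const char *const *names)
{
    int i;
    for (i = 0; names[i] != NULL; ++i) {
        if (getenv(names[i]) != NULL) {
            return 1;
        }
    }
    return 0;
}

/* Return 1 if the process looks like it was started by an MPI launcher. */
int launched_under_mpi(int argc, char **argv)
{
    int i;
    assert(argc == 0 || argv != NULL);

    /* Original heuristic: MPICH ch_p4 adds an argument starting with -p4. */
    for (i = 0; i < argc; ++i) {
        if (argv[i] != NULL && strncmp("-p4", argv[i], 3) == 0) {
            return 1;
        }
    }
    /* Extra heuristic: a launcher-specific environment variable is set. */
    return any_env_set(launcher_env_vars);
}
```

In nrnmpi.c the result could then decide whether MPI_Init is called at all, e.g. by testing launched_under_mpi(*pargc, *pargv) in place of the -p4 loop. Whether the environment is already populated before MPI_Init depends on the launcher, so this would need testing on each cluster.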
Thanks
Thomas
PS: Just disabling the above code is not a workaround; it results in the following errors on my laptop (used for testing) running Kubuntu and LAM-MPI, with or without iv [doNotify() somehow links to hoc_notify_iv() in src/oc/hoc_init.c, so I thought it could be iv]:
mpirun -np 2 nrniv test0.hoc
NEURON -- Version 5.8 2005-10-7 13:46:29 Main (85)
by John W. Moore, Michael Hines, and Ted Carnevale
Duke and Yale University -- Copyright 1984-2005
hello from id 1 on beatrix
nrnmpi_init(): numprocs=2 myid=0
NEURON -- Version 5.8 2005-10-7 13:46:29 Main (85)
by John W. Moore, Michael Hines, and Ted Carnevale
Duke and Yale University -- Copyright 1984-2005
hello from id 0 on beatrix
0
oc: Resource temporarily unavailable
0 nrniv: errno set during call of doNotify
0 in test0.hoc near line 9
0 for i=1, 1000000 doNotify()
^
oc: Resource temporarily unavailable
0 nrniv: errno set during call of doNotify
0 in test0.hoc near line 9
0 ^
oc: Resource temporarily unavailable
0 nrniv: errno set during call of doNotify
0 in test0.hoc near line 9
0 ^
oc: Resource temporarily unavailable
0 nrniv: errno set during call of doNotify
0 in test0.hoc near line 9
0 ^
oc: Resource temporarily unavailable
No more errno warnings during this execution
0 nrniv: errno set during call of doNotify
0 in test0.hoc near line 9
0 ^
errno set 6666 times on last execution
bbs_msg_cnt_=1 bbs_poll_cnt_=6667 bbs_poll_=93
0
The same problem appears on a Beowulf cluster with MPICH and Myrinet.
The Neuron version is 5.8, but the newest version on the Neuron website (5.9.??) contains the same code fragment. I haven't compiled and tested it yet, but I am fairly sure the same problem would occur.
Again, thanks for hints!
Best
Thomas
Neuron and Myrinet
-
- Posts: 5
- Joined: Sun May 28, 2006 6:28 am
- Location: Plymouth / UK
- Contact:
The problem is not the #if !ALWAYS_CALL_MPI_INIT code section. In fact, everything is working in your example except for the benign but annoying error messages. They may just go away if you upgrade to 5.9. If not, let me know. Usually the problem traces to a failure to reset errno=0 after a call to an MPI function.
5.9 compilation fails
Hi
Thanks for the info.
Compilation of version 5.9 fails with
-----------------------------
.....
mpicxx -g -O2 -o .libs/ivoc nrnmain.o ivocmain.o classreg.o datapath.o ocjump.o symdir.o ../oc/nocable.o ../oc/modlreg.o ../oc/.libs/libocxt.so ../oc/.libs/liboc.so -L/usr/X11R6/lib64 -lX11 ./.libs/libivoc.so ../nrnmpi/.libs/libnrnmpi.so ../memacs/.libs/libmemacs.so ../mesch/.libs/libmeschach.so ../gnu/.libs/libneuron_gnu.so /usr/local/Cluster-Apps/nrn/nrn-5.9/mx-iv-mpi-gcc/iv/x86_64/lib/libIVhines.so /usr/lib64/libstdc++.so -lreadline -lncurses -ldl -lm -Wl,--rpath -Wl,/usr/local/Cluster-Apps/nrn/nrn-5.9/mx-iv-mpi-gcc//x86_64/lib -Wl,--rpath -Wl,/usr/local/Cluster-Apps/nrn/nrn-5.9/mx-iv-mpi-gcc/iv/x86_64/lib
./.libs/libivoc.so: undefined reference to `ListImpl_best_new_count(long, unsigned int, unsigned int)'
collect2: ld returned 1 exit status
make[3]: *** [ivoc] Error 1
make[3]: Leaving directory `/home/thomas/soft/nrn-5.9/src/ivoc'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/home/thomas/soft/nrn-5.9/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/thomas/soft/nrn-5.9'
make: *** [all] Error 2
----------------------------------
Configuration complains that varargs is deprecated, but that is unlikely to be the reason.
The above error is the first one during compilation (besides several size mismatches in variables).
full logs of configuration and make are here: www.pion.ac.uk/~thomas/open
Best wishes
Thomas
--without-iv worked
Hi
Configuration without InterViews worked. I still had to disable the above code snippet in nrnmpi.c by hand. Everything appears to have compiled properly (apart from a few warnings). Simple example programs ran without errors. The spurious error messages I had with version 5.8 vanished.
Thanks!
Thomas
Neuron & Intel & Myrinet
Hi again
Just for info:
Successfully compiled Neuron version 5.9 with Intel compilers for Myrinet, but without InterViews.
Lots of warnings, mostly related to missing prototypes.
One bug: in oc/ocbbs.cpp I had to replace "nrnmpi_nhost" with "nrnmpi_numprocs". I am not entirely sure whether that killed the bug or just made it more subtle. Nonetheless, compilation ran through afterwards and simple tests succeeded (src/parallel/test0.hoc and the example from the ParallelNetworkManager manpage):
static double nhost(void* v) {
#if defined(HAVE_STL)
    OcBBS* bbs = (OcBBS*)v;
    return double(bbs->nhost());
#else
    // return nrnmpi_nhost;
    return nrnmpi_numprocs;
#endif
}
config and make logs are here: www.pion.ac.uk/~thomas/open
Regards,
Thomas
We are going to have to straighten that out. That piece of code is never supposed to be compiled, because HAVE_STL is required nowadays.
Let's deal with this by email. Please send your src/parallel/bbsconf.h file to michael.hines@yale.edu.