I'm using jNeuroML-generated NetPyNE code to run a large-scale hippocampal model on NSG (tool: OSBPYNEURON74 @ Comet)
(see the model repo here: https://github.com/mbezaire/ca1/tree/development), and after increasing the size to ~650 cells I get an error. It looks like a memory error to me, but according to the NSG developers the memory per node is fine!
Here is the (main) error message:
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
[comet-20-53:27930] *** Process received signal ***
[comet-20-53:27930] Signal: Aborted (6)
[comet-20-53:27930] Signal code: (-6)
[comet-20-53:27930] [ 0] /lib64/libpthread.so.0[0x3aa140f7e0]
[comet-20-53:27930] [ 1] /lib64/libc.so.6(gsignal+0x35)[0x3aa0832495]
[comet-20-53:27930] [ 2] /lib64/libc.so.6(abort+0x175)[0x3aa0833c75]
[comet-20-53:27930] [ 3] /opt/gnu/gcc/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x15d)[0x2b1e0b64c07d]
[comet-20-53:27930] [ 4] /opt/gnu/gcc/lib64/libstdc++.so.6(+0x5e0e6)[0x2b1e0b64a0e6]
[comet-20-53:27930] [ 5] /opt/gnu/gcc/lib64/libstdc++.so.6(+0x5e131)[0x2b1e0b64a131]
[comet-20-53:27930] [ 6] /opt/gnu/gcc/lib64/libstdc++.so.6(+0x5e348)[0x2b1e0b64a348]
[comet-20-53:27930] [ 7] /opt/gnu/gcc/lib64/libstdc++.so.6(+0x5e859)[0x2b1e0b64a859]
[comet-20-53:27930] [ 8] /opt/gnu/gcc/lib64/libstdc++.so.6(_Znam+0x9)[0x2b1e0b64a8b9]
[comet-20-53:27930] [ 9] /projects/ps-nsg/home/nsguser/applications/osbneuron74_py/nrn-7.4/installdir/x86_64/lib/libnrnpython.so.0(+0x130d6)[0x2b1e0a84e0d6]
[comet-20-53:27930] [10] /projects/ps-nsg/home/nsguser/applications/osbneuron74_py/nrn-7.4/installdir/x86_64/lib/libnrniv.so.0(+0x802c7)[0x2b1e0903e2c7]
[comet-20-53:27930] [11] /projects/ps-nsg/home/nsguser/applications/osbneuron74_py/nrn-7.4/installdir/x86_64/lib/libnrnoc.so.0(hoc_call_ob_proc+0x2ab)[0x2b1e08d977cb]
[comet-20-53:27930] [12] /projects/ps-nsg/home/nsguser/applications/osbneuron74_py/nrn-7.4/installdir/x86_64/lib/libnrnoc.so.0(hoc_object_component+0x76e)[0x2b1e08d9868e]
[comet-20-53:27930] [13] /projects/ps-nsg/home/nsguser/applications/osbneuron74_py/nrn-7.4/installdir/x86_64/lib/libnrnpython.so.0(+0xb0fe)[0x2b1e0a8460fe]
[comet-20-53:27930] [14] /projects/ps-nsg/home/nsguser/applications/osbneuron74_py/nrn-7.4/installdir/x86_64/lib/libnrniv.so.0(_ZN10OcJumpImpl7fpycallEPFPvS0_S0_ES0_S0_+0x61)[0x2b1e090174e1]
[comet-20-53:27930] [15] /projects/ps-nsg/home/nsguser/applications/osbneuron74_py/nrn-7.4/installdir/x86_64/lib/libnrnpython.so.0(+0xb392)[0x2b1e0a846392]
[comet-20-53:27930] [16] /opt/python/lib/libpython2.7.so.1.0(PyObject_Call+0x43)[0x2b1e0aaa7b73]
[comet-20-53:27930] [17] /opt/python/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x3b2e)[0x2b1e0ab5c00e]
[comet-20-53:27930] [18] /opt/python/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5a5d)[0x2b1e0ab5df3d]
[comet-20-53:27930] [19] /opt/python/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5a5d)[0x2b1e0ab5df3d]
[comet-20-53:27930] [20] /opt/python/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5a5d)[0x2b1e0ab5df3d]
[comet-20-53:27930] [21] /opt/python/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x830)[0x2b1e0ab5f320]
[comet-20-53:27930] [22] /opt/python/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x19)[0x2b1e0ab5f449]
[comet-20-53:27930] [23] /opt/python/lib/libpython2.7.so.1.0(PyImport_ExecCodeModuleEx+0x99)[0x2b1e0ab72c79]
[comet-20-53:27930] [24] /opt/python/lib/libpython2.7.so.1.0(+0x11dfce)[0x2b1e0ab72fce]
[comet-20-53:27930] [25] /opt/python/lib/libpython2.7.so.1.0(+0x11edb9)[0x2b1e0ab73db9]
[comet-20-53:27930] [26] /opt/python/lib/libpython2.7.so.1.0(PyImport_ImportModuleLevel+0x1dd)[0x2b1e0ab74a2d]
[comet-20-53:27930] [27] /opt/python/lib/libpython2.7.so.1.0(+0x1013e8)[0x2b1e0ab563e8]
[comet-20-53:27930] [28] /opt/python/lib/libpython2.7.so.1.0(PyObject_Call+0x43)[0x2b1e0aaa7b73]
[comet-20-53:27930] [29] /opt/python/lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x47)[0x2b1e0ab57ee7]
[comet-20-53:27930] *** End of error message ***
It appears after running the simulation and gathering the data for saving (around 20 random traces and all the spikes; the simulation is 500 ms long).
Thanks,
András
PS: The error seems to be independent of the total number of cores and the number of cores per node, and the same code runs perfectly fine for ~450 cells (instead of ~650).
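
To test the memory hypothesis, peak memory could be logged on each rank just before and after the data-gathering step. The following is only a minimal sketch using the Python standard library; `peak_rss_mb` and `report_memory` are hypothetical helper names, not part of NetPyNE or NEURON, and the unit handling assumes Linux (kilobytes) or macOS (bytes).

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor


def report_memory(rank, label):
    """Print and return one line with the peak memory of the given rank."""
    line = "rank %d [%s]: peak RSS %.1f MB" % (rank, label, peak_rss_mb())
    print(line)
    return line


if __name__ == "__main__":
    # In a parallel run the rank would come from the MPI layer;
    # 0 is used here only so the sketch runs on its own.
    report_memory(0, "before gather")
```

In a parallel NEURON run the rank could be taken from `h.ParallelContext().id()`. If the reported value on one rank jumps sharply between "before gather" and "after gather", that would suggest the allocation failure comes from collecting the data onto a single process rather than from the simulation itself.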