Optimizing multiple functions without duplicate runs

Using the Multiple Run Fitter, praxis, etc..
Post Reply
bsuter
Posts: 42
Joined: Fri May 28, 2010 9:04 am

Optimizing multiple functions without duplicate runs

Post by bsuter »

Hello,
I've been using the MRF with multiple "Run Fitness" generators. However, now I would like to simultaneously optimize multiple quantities derived from a single simulation run. I'd like to avoid running the simulation more than once for a given set of parameter values. For context: I'd like to fit firing rate, spike timing and action potential shape simultaneously, for a range of amplitudes of an injected current step. I'm not sure yet how to achieve these individually (but am working on it). More vexingly, I struggle to see how I can do all three without repeating the simulation run redundantly (i.e. 3x with a given set of parameter values). Is there a way?

A. For firing rate (i.e. f-I relationship), I see that I could paste the experimental "I" and "f" values into a Function Fitness generator, and define a function firing_rate() that (a) executes a run with a given stim amplitude "I", (b) counts APs and returns this #. I took inspiration from http://www.neuron.yale.edu/phpbb/viewto ... =23&t=1263

B. For spike timing / firing rate adaptation, I'm inclined to fit a function which plots instantaneous firing rate (1/ISI) as a function of time-since-step-onset.

C. For AP shape, which varies within a single train and across firing rates, I've been thinking of fitting first and last APs in a given train, for trains at various firing rates. There'd be about 2 (first+last AP) x 5 (# superthreshold trains) = 10 APs to fit individually. I though that doing the fit in the phase plane would avoid issues of time-alignment; however, I've seen (http://www.neuron.yale.edu/phpbb/viewto ... =23&t=1190) that MRF requires a monotonic increase in x. So if the MRF x-axis is time, then I guess I'll need to perform some alignment of the APs (to peak or to threshold).

I'm trying to approach this by breaking it into pieces I can develop/debug separately. I think I have three main questions:

1. At least for (B) and (C), I think I'll need to execute a run, then perform some calculations (e.g. extract spike times) and then compute the error between these calculated results and experiment. Should I do this by using a "Fitness Primitive" generator for this?
http://www.neuron.yale.edu/phpbb/viewto ... f=23&t=501
I suppose in choosing how to calculate the returned error value, I am defining the relative weights of e.g. B vs C.

2. How can I do this without executing the same run repeatedly (and unnecessarily)? I figure I could write a Fitness Primitive function that internally executes the run, then passes the recorded data to sub-functions for A, B and C, and returns a single combined error function. However, I'd prefer to have A, B and C each appear as individual generators, which I can then easily toggle on/off. Conceptually, I'm looking to have multiple MRF generators share (the data from) a single run. Can I do this? Is there another way entirely?

3. In particular for (C), the AP shape, I'd like to have the MRF graph show the actual Vm-vs-t graph, but after isolating and time-aligning a few ms of data around each AP. Is this possible?

For me this is a complex "situation", and while I've tried to reach some clarity before composing this post, I won't be surprised if my questions aren't yet clear ... thanks for any help (big or small picture)!

Ben.
ted
Site Admin
Posts: 6286
Joined: Wed May 18, 2005 4:50 pm
Location: Yale University School of Medicine
Contact:

Re: Optimizing multiple functions without duplicate runs

Post by ted »

bsuter wrote:I would like to simultaneously optimize multiple quantities derived from a single simulation run.
The first step would be to define an objective function that computes an "error" value from your measures of interest, and minimize that function. The function would be something of the form

Code: Select all

func ofunc() {
  run()
  . . . statements that compute an error measure from the simulation results . . .
  return the error measure
}
The next step is to get the MRF to minimize this objective function. I haven't done this myself, but would bet that it involves
Generators / Add Fitness Generator / Add Fitness Primitive
then display this new generator and in it
Fitness / Expression
where you'd enter
ofunc()
after which you'd go back to the MRF and tell it to use this new generator.
Also make sure to tell the MRF to not use any other generators.
I'd prefer to have A, B and C each appear as individual generators, which I can then easily toggle on/off.
That's not going to happen. Each generator manages its own "computational experiment" and the MRF will launch a separate run for every generator that is "used". If your fitness primitive combines several different measures, and you want to use these measures in different combinations, you'll have to control that with hoc statements. A quick and dirty way to do this is with global parameters, e.g. at the top level declare
A = 1
B = 1
C = 1
and in ofunc

Code: Select all

func ofunc() { local Aval, Bval, Cval
  Aval = 0
  Bval = 0
  Cval = 0
  if (A==1) { calculate Aval }
  if (B==1) { calculate Bval }
  if (C==1) { calculate Cval }
  return Aval+Bval+Cval
}
Then you can use an xpanel with buttons that toggle A, B, and C between 0 and 1.
for (C), the AP shape, I'd like to have the MRF graph show the actual Vm-vs-t graph, but after isolating and time-aligning a few ms of data around each AP
The MRF itself doesn't display anything other than lists of parameters and generators and the error value. It's the individual generators that may or may not have their own graphs. A fitness primitive panel does not have an associated graph, so you'll have to create one of your own via hoc code or the gui.
bsuter
Posts: 42
Joined: Fri May 28, 2010 9:04 am

Re: Optimizing multiple functions without duplicate runs

Post by bsuter »

Thank you Ted for your quick response. It confirms to me that I can (almost) do what I want, and that's good news :), but also that I'd need to implement my own "error calculation" (objective) functions for each of the measures of interest, and then combine these. As a result, your post has helped me identify more clearly my remaining question:

I'd like to calculate an error value between two sets of x-y data, like the MRF Run Fitness generator does. However, rather than the x-y data being the direct result of a run (e.g. x=t, y=Vm), my data would be derived from the run. Is this possible? How?

Another way of putting this: I'd prefer to use the MRF's existing functionality (displaying the two sets of data on the same graph, allowing regions of the data to be weighted separately, calculating the least-squares error), rather than (attempting to) implement it all myself. However, I'd like to plot (and minimize the difference between) derived quantities rather than Vm directly. How can I do this?

For context: I have two examples of "derived quantities".
i. One is an impedance profile (ZAP), so y=Z and x=frequency; the run itself would be the response to a chirp stimulus.
ii. The other is AP shape, so y=Vm and x=t, but after some post-processing at the end of the run.

Thank you again!
ted
Site Admin
Posts: 6286
Joined: Wed May 18, 2005 4:50 pm
Location: Yale University School of Medicine
Contact:

Re: Optimizing multiple functions without duplicate runs

Post by ted »

bsuter wrote:I'd like to calculate an error value between two sets of x-y data, like the MRF Run Fitness generator does. However, rather than the x-y data being the direct result of a run (e.g. x=t, y=Vm), my data would be derived from the run. Is this possible? How?
That's where the
. . . statements that compute an error measure from the simulation results . . .
comes in. Replace that with code that does the postprocessing necessary to derive whatever you need.
I'd prefer to use the MRF's existing functionality (displaying the two sets of data on the same graph, allowing regions of the data to be weighted separately, calculating the least-squares error), rather than (attempting to) implement it all myself.
You could look in lib/hoc for the source code for the MRF's generators and feel free to borrow, modify, and reuse whatever looks like it will do the job. My guess is that the time and effort necessary to understand and extract the necessary bits, then bash them into the shape you need, will be more than what it would take to write your own stuff de novo. All you're trying to do is

Code: Select all

func ofunc() {
  run a simulation
  analyze and plot simulation results; also plot the "ideal" that your trying to match
  calculate error measures and add them up
  return the total error
}
i. One is an impedance profile (ZAP), so y=Z and x=frequency; the run itself would be the response to a chirp stimulus.
IMO that's a big waste of your time and the computer's time. The computer's time because the Impedance class enables much faster and more accurate calculation of impedance over a range of frequencies than can be done by running a simulation in the time domain--especially if you are interested in a frequency range of more than about a decade and a half*. Your time because it will take far less code than would be required to extract Z(f) from a time domain simulation of a model driven by a zap function.

*--Why? dt will be 0.025 ms or smaller, but if you are interested in frequencies ~ 1 Hz, limits on the rate at which frequency can increase (imposed by the tacit quasi-steady state assumption) force very long run times.
bsuter
Posts: 42
Joined: Fri May 28, 2010 9:04 am

Re: Optimizing multiple functions without duplicate runs

Post by bsuter »

Gotcha, thanks.

Code: Select all

  calculate error measures and add them up
I get this conceptually, it's the implementation that poses questions for me. For example, when the experimental sampling rate doesn't match the simulation time steps (or adaptive time is used): how does one best calculate the error between these time-series (or in the freq domain)? Run Fitness does this nicely (and includes weights), so I'll follow your suggestion to look at the code and maybe I can pick/choose what I need from there.

Feature request: would be useful if NEURON had general-use functions to calculate the error between two x-y data sets, with weights for x ranges.
ted wrote:
i. One is an impedance profile (ZAP), so y=Z and x=frequency; the run itself would be the response to a chirp stimulus
IMO that's a big waste of your time and the computer's time. The computer's time because the Impedance class enables much faster and more accurate calculation of impedance over a range of frequencies than can be done by running a simulation in the time domain--especially if you are interested in a frequency range of more than about a decade and a half*. Your time because it will take far less code than would be required to extract Z(f) from a time domain simulation of a model driven by a zap function.
I had initially tried this without success, but your response encouraged me to try again. And ... turns out it just didn't work multi-threaded. I switched to single thread and it works. But, if I check "include dstate/dt contribution" the command line starts scrolling numbers furiously and doesn't stop (for many minutes), so I'm proceeding without that option for now.

Feature request: Ability to toggle between log10 and non-log x-axis for e.g. Frequency window.
Feature request: An "export" command in the Shape plot menu, to e.g. SWC format.

Thanks!
ted
Site Admin
Posts: 6286
Joined: Wed May 18, 2005 4:50 pm
Location: Yale University School of Medicine
Contact:

Re: Optimizing multiple functions without duplicate runs

Post by ted »

bsuter wrote:when the experimental sampling rate doesn't match the simulation time steps (or adaptive time is used): how does one best calculate the error between these time-series (or in the freq domain)?
Resample the original data at appropriate intervals. The Vector class's interpolate uses linear interpolation for this. If the original data are sampled at too wide an interval, resample the simulation results at that interval. If the original data are sampled at irregular intervals, simulate with adaptive integration and use the CVode class's record method to capture simulation results at the necessary times.
hines
Site Admin
Posts: 1682
Joined: Wed May 18, 2005 3:32 pm

Re: Optimizing multiple functions without duplicate runs

Post by hines »

I think you can do what you want as an extension to the existing
MulRunFitter RunFitness generator which manages a single run
and typically applies a list of "variable to fit" to associated "Method"s.
Try the following:
copy nrn/lib/hoc/mulfit/e_actpot.hoc to your project folder containing your
hoc files and call it mymethod.hoc
Then edit mymethod.hoc and replace every occurrence of APFitness with
MyMethod.
Then if you

Code: Select all

load_file("nrngui.hoc")
load_file("mulfit.hoc")
load_file("mymethod.hoc")
Your new MyMethod will be in the "Fitness/Change method to" the generator instance menu and you
can associate that with a section.v(x) variable.
Then you can replace the contents of
func efun() {
with your own fitness method.

In fact I can see that this might be useful as a general extension.
i.e

Code: Select all

func efun() {
  x = $o2
  y = $o1
  execute(stmt, this)
  return e
}
and the interface would be to request the name of a function so that
sprint(stmt, "e = %s(x, y)", fname)

Note that 90% of the code in e_actpot.hoc is devoted to managing the gui.
If you don't just want to ignore it you can do a good deal of surgery
in proc build()
bsuter
Posts: 42
Joined: Fri May 28, 2010 9:04 am

Re: Optimizing multiple functions without duplicate runs

Post by bsuter »

ted wrote:
bsuter wrote:when the experimental sampling rate doesn't match the simulation time steps (or adaptive time is used): how does one best calculate the error between these time-series (or in the freq domain)?
Resample the original data at appropriate intervals. The Vector class's interpolate uses linear interpolation for this. If the original data are sampled at too wide an interval, resample the simulation results at that interval. If the original data are sampled at irregular intervals, simulate with adaptive integration and use the CVode class's record method to capture simulation results at the necessary times.
The Vector class method interpolate

Code: Select all

obj = ydest.interpolate(xdest, xsrc, ysrc)
is just what I need for this. In my case, the experimental data are sampled at a fixed rate (10 or 40 kHz) while the simulation uses adaptive time steps. For the AP shape the adaptive time steps are shorter than my experimental time steps, so will resample/interpolate the sim data using the experimental (xdsrc) time vector. For comparing slower signals, I'd do it the other way around.

Code: Select all

objref tt_exp, vm_exp, tt_sim, vm_sim, tt_sim_i, vm_sim_i

// tt_exp, vm_exp are imported from experimental data, sampled at constant rate
tt_exp = new Vector()
vm_exp = new Vector()
// use File scanvar() to read in tt_exp, vm_exp from .dat file

// tt_sim, vm_sim are recorded during adaptive time step simulation
tt_sim = new Vector()
vm_sim = new Vector()
tt_sim.record(&t)
vm_sim.record(&soma.v(0.5))

// resample simulated data on the experimental time-base
tt_sim_i = new Vector()
vm_sim_i = new Vector()
tt_sim_i = tt_exp.c
vm_sim_i.interpolate(t_sim_i, tt_sim, vm_sim)

// estimate error between simulation and experiment 
// (optionally, can supply a 2nd "weight" arg)
mse = vm_sim_i.meansqerr(vm_sim)
http://www.neuron.yale.edu/neuron/stati ... nterpolate
http://www.neuron.yale.edu/neuron/stati ... #meansqerr

Many thanks Ted!
ted
Site Admin
Posts: 6286
Joined: Wed May 18, 2005 4:50 pm
Location: Yale University School of Medicine
Contact:

Re: Optimizing multiple functions without duplicate runs

Post by ted »

bsuter wrote:In my case, the experimental data are sampled at a fixed rate (10 or 40 kHz) while the simulation uses adaptive time steps. For the AP shape the adaptive time steps are shorter than my experimental time steps
That's why I suggested the CVode class's record() method.
http://www.neuron.yale.edu/neuron/stati ... tml#record
cvode.record(&rangevar, yvec, tvec, 1)
captures rangevar not at the "integration" times, but at the times specified by tvec. It does this by interpolating with a formula that is appropriate for the integration order used for the time step that contains the time of interest. Consequently cvode.record's results are more accurate than those obtained by using the Vector class's interpolate() method, which uses plain old linear interpolation between adjacent pairs of solution points.
bsuter
Posts: 42
Joined: Fri May 28, 2010 9:04 am

Re: Optimizing multiple functions without duplicate runs

Post by bsuter »

ted wrote:That's why I suggested the CVode class's record() method.
http://www.neuron.yale.edu/neuron/stati ... tml#record
cvode.record(&rangevar, yvec, tvec, 1)
captures rangevar not at the "integration" times, but at the times specified by tvec. It does this by interpolating with a formula that is appropriate for the integration order used for the time step that contains the time of interest. Consequently cvode.record's results are more accurate than those obtained by using the Vector class's interpolate() method, which uses plain old linear interpolation between adjacent pairs of solution points.
Thanks - I read your initial suggestion as being only for the case where the experimental data is "sampled at irregular intervals", so did not pursue CVode.record further at that time. Reading the documention now, I see that it says: "This option may slow integration since it requires calculation of states at those particular times."

Maybe I should do some tests to see what the trade-off in accuracy-vs-time is for various types of fitting/error conditions (e.g. AP waveforms vs. slower, longer signals).
Post Reply