-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
probe id -> probeset id, and clarification in docs #1898
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't go over the code changes, just the RST. I skipped the 2 cpp docs because they made my head spin. Unless this level of terseness makes cpp devs happy, Id' opt for a full rewrite of cpp/probe_sample.rst
, there's so much technicalities and abstract justifications of the design mixed into the text that I have a hard time finding information in it. I'd start by introducing the concepts, with simple examples, and then layering in the more detailed things towards the end of the doc.
I believe probeset address
should be replaced by probeset
. Is it still required to include the term address
for anything?
@@ -56,7 +56,7 @@ Mechanism state queries however will throw a ``cable_cell_error`` exception | |||
at simulation initialization if the requested state variable does not exist | |||
on the mechanism. | |||
|
|||
Cable cell probe addresses that are described by a ``locset`` may generate more | |||
Cable cell probeset addresses that are described by a ``locset`` may generate more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So do we want to call them probesets
or probeset addresses
. I thought we'd go for just probesets
.
The explanation here seems to contort itself a bit around the unique case of a probeset
being described by a 1-loc locset. Shouldn't the explanation just always talk about probesets as multiple probes, the user can assume that 1-loc locsets and 0-loc locsets would produce 1 and 0 probes in the probeset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically it might be worse: Locset
s are what's usually called a multiset, ie a set of unique keys
(called the support of the multiset) and an associated multiplicity for each key. Probes should be
defined on the support, thus each location is probed only once. So, the number of points in the locset
might be larger than the number of unique points.
doc/python/probe_sample.rst
Outdated
@@ -5,14 +5,14 @@ Cable cell probing and sampling | |||
|
|||
.. module:: arbor | |||
|
|||
Cable cell probe addresses are defined analogously to their counterparts in | |||
Cable cell probeset addresses are defined analogously to their counterparts in | |||
the C++ API (see :ref:`cablecell-probes` for details). Sample data recorded | |||
by the Arbor simulation object is returned in the form of a NumPy array, | |||
with the first column holding sample times, and subsequent columns holding | |||
the corresponding scalar- or vector-valued sample. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sim.samples(handle)
will return a list of length K
. In each element, you'll find a tuple containing metadata
and data
, where metadata
will be a string describing the location, and data
a numpy.ndarray
. The structure of the numpy array can vary (and may not be a numpy array), that's at the bottom of the documentation for samples()
.
I reshuffled the text there a bit, hopefully it made this clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know me ... a graphic? A crude approximation
O------------- Cell morphology
^ ^ ^ Probe locset
| | |
L0 L1 L2 Locations, eg (cable 0 0.23)
U0 V0 W0 Values, one column per Location
U1 one entry per sampling time point
U2
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the graphic, but I think we could be even clearer by splitting it in two
- Showing the relation of handle, probeset, probes, and their locations
- Showing the relation between the sampled data and the probeset
Sorry for being picky here, but we inflict multiple layers of callbacks and handles on our
users (for good reasons), so giving them a hand is needed.
That said, I made one of the images, so maybe that plus the more schematic one you added
is enough. Tell me what you think.
Also, since I cannot comment on full files, feedback for the graphic here
- Typo: experession
- Did you check array format? I thought I remember columns of time, value. As shown it indicates rows. Maybe this would be clearer if shown as a table.
- The
->
relation is unclear
- handle -> probeset: obtain a probeset from simulation
- probeset -> probe, probe -> data|meta: ownership
- meta -> locset, data -> ndarray: is a
On 3. I would suggest showing 'has a' and 'is a' relations like this; example for probe
+---------------+
| Probe |
| meta: locset |
| data: ndarray |
+---------------+
The handle -> probeset -> probe could remain.
doc/python/simulation.rst
Outdated
Retrieve a list of sample data associated with the given ``handle``. | ||
There will be one entry in the list per probe associated with the :term:`probe id` used when the sampling was set up. | ||
Retrieve a list of sample data associated with the given :term:`handle`. | ||
There will be one entry in the list per probe associated with the :term:`probeset id` used when the sampling was set up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're doing :term:`probeset_id`
then there's a lot of missed occurences that are just probeset_id
doc/python/simulation.rst
Outdated
@@ -274,29 +274,37 @@ Definitions | |||
current, mechanism state, and a number of other quantities, measured either over the whole cell, | |||
or at specific sites (see :ref:`pycablecell-probesample`). | |||
|
|||
Probes are described by probe addresses, and the collection of probe addresses for a given cell is | |||
probeset address | |||
Probes are described by probeset addresses, and the collection of probeset addresses for a given cell is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't say that probes are described by probeset addresses. Maybe "probesets are described by probeset addresses" but I really still don't see the distinction between the two, what's a probeset address? The cell_member
? That is usually already called the probeset id.
Thanks for the review! I've taken most into account. The The definitions were moved to the concept section (logical, right?) so the I left your comments on |
doc/python/probe_sample.rst
Outdated
@@ -5,14 +5,14 @@ Cable cell probing and sampling | |||
|
|||
.. module:: arbor | |||
|
|||
Cable cell probe addresses are defined analogously to their counterparts in | |||
Cable cell probeset addresses are defined analogously to their counterparts in | |||
the C++ API (see :ref:`cablecell-probes` for details). Sample data recorded | |||
by the Arbor simulation object is returned in the form of a NumPy array, | |||
with the first column holding sample times, and subsequent columns holding | |||
the corresponding scalar- or vector-valued sample. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sim.samples(handle)
will return a list of length K
. In each element, you'll find a tuple containing metadata
and data
, where metadata
will be a string describing the location, and data
a numpy.ndarray
. The structure of the numpy array can vary (and may not be a numpy array), that's at the bottom of the documentation for samples()
.
I reshuffled the text there a bit, hopefully it made this clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found some more sections where probe and probeset were used interchangeably. Other than that it's becoming clearer I think :) The new terms seem to work at conveying the message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice endeavour :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments.
A general observation on this part of the docs: A graph of the process could help. |
Which process is that exactly? Of creating your own probe? |
doc/python/probe_sample.rst
Outdated
@@ -5,14 +5,14 @@ Cable cell probing and sampling | |||
|
|||
.. module:: arbor | |||
|
|||
Cable cell probe addresses are defined analogously to their counterparts in | |||
Cable cell probeset addresses are defined analogously to their counterparts in | |||
the C++ API (see :ref:`cablecell-probes` for details). Sample data recorded | |||
by the Arbor simulation object is returned in the form of a NumPy array, | |||
with the first column holding sample times, and subsequent columns holding | |||
the corresponding scalar- or vector-valued sample. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know me ... a graphic? A crude approximation
O------------- Cell morphology
^ ^ ^ Probe locset
| | |
L0 L1 L2 Locations, eg (cable 0 0.23)
U0 V0 W0 Values, one column per Location
U1 one entry per sampling time point
U2
...
Shortcut to the requested schematic: https://github.com/arbor-sim/arbor/pull/1898/files#diff-119b7780b97222f316ae7b9bf31a04ac8d477cc2c07f1dcbcd33722b14b23222 |
Merge remote-tracking branch 'upstream/master' into doc/probeset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great image. The only thing that's omitted is the scalar vs vector valued probing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being picky on the graphics, sorry. I make amends as soon as I figure out how to upload an SVG here.
doc/python/probe_sample.rst
Outdated
@@ -5,14 +5,14 @@ Cable cell probing and sampling | |||
|
|||
.. module:: arbor | |||
|
|||
Cable cell probe addresses are defined analogously to their counterparts in | |||
Cable cell probeset addresses are defined analogously to their counterparts in | |||
the C++ API (see :ref:`cablecell-probes` for details). Sample data recorded | |||
by the Arbor simulation object is returned in the form of a NumPy array, | |||
with the first column holding sample times, and subsequent columns holding | |||
the corresponding scalar- or vector-valued sample. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the graphic, but I think we could be even clearer by splitting it in two
- Showing the relation of handle, probeset, probes, and their locations
- Showing the relation between the sampled data and the probeset
Sorry for being picky here, but we inflict multiple layers of callbacks and handles on our
users (for good reasons), so giving them a hand is needed.
That said, I made one of the images, so maybe that plus the more schematic one you added
is enough. Tell me what you think.
Also, since I cannot comment on full files, feedback for the graphic here
- Typo: experession
- Did you check array format? I thought I remember columns of time, value. As shown it indicates rows. Maybe this would be clearer if shown as a table.
- The
->
relation is unclear
- handle -> probeset: obtain a probeset from simulation
- probeset -> probe, probe -> data|meta: ownership
- meta -> locset, data -> ndarray: is a
On 3. I would suggest showing 'has a' and 'is a' relations like this; example for probe
+---------------+
| Probe |
| meta: locset |
| data: ndarray |
+---------------+
The handle -> probeset -> probe could remain.
What probe returns vector data? I don't think I actually ever saw one... |
I have tweaked the image, added an example of I prefer this image, because it shows what you probably care about: how to get to your sampling data, and why a handle is involved. This is now consistent, without going into a tangent about the precise kind of relationship. The probeset box is now clearly separate, and hopefully illustrates that a handle is associated to it, but gives access to its individual probes. |
Me neither, but the docs mentioned it, so I thought they exist. |
Strictly speaking, it's up to the (C++) probe to define what it returns, and also whether that is one or more values per timestamp (or a timestamp at all). Although it can be any structure, as far as I know the ones we have now return an |
In Python we do not have user-defined callbacks. Thus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for improving this. Over experimenting with the API and reading the code in
more depth for this review it became clear that there is still some potential
confusion.
For reference, I'll set up an example
class recipe(A.recipe):
# ...
def probes(self, _):
if gid == 0:
return [A.cable_probe_ion_diff_concentration_cell("na"),
A.cable_probe_membrane_voltage("(location 0 0)"), ]
else:
return [A.cable_probe_ion_diff_concentration_cell("ca"),
A.cable_probe_membrane_voltage("(terminal)"), ]
# ...
R = recipe()
S = A.simulation(R())
ts = A.regular_schedule(0.1)
hdls = [S.sample((0, 0), ts), # Sodium in gid 0
S.sample((0, 1), ts), # U_m in gid 0 soma
S.sample((1, 0), ts), # Calcium in gid 1
S.sample((1, 1), ts), # U_m in gid 1 terminals
]
sim.run(tfinal=0.5)
for hdl in hdls:
for data, meta in S.samples(hdl):
pass
Here we extract three vector probes and one scalar. For the sake of the
example assume two cells; cell 0 has N
segments and cell 1 M
segments of which T
are
terminals. The schedule evaluates to five points [0, 0.1, 0.2, 0.3, 0.4]
. We
were to tabulate the result we would see something like this
Handle | Meta | Data | Quantity | Vector |
---|---|---|---|---|
(0, 0) | [(cable ..), ...] length N |
NDArray[N + 1 , 5] |
Nad |
Y |
(0, 1) | (cable ..) |
NDArray[2, 5] | Um |
N |
(1, 0) | [(cable ..), ...] length M |
NDArray[M + 1 , 5] |
Cad |
Y |
(1, 1) | (cable ..) |
NDArray[2, 5] | Um |
N |
Regarding the improved figure, thanks for keeping up with it. In the light of the above
we should rework s.t.
meta
is a list of locations iff a vector probe is sampled else a single locationdata
is a table with one column per location above plus one for the time- why is a handle associated with a set? In my experiments never seemed to be that? Is there a fundamental difference Py/C++? If so we need a huge red warning banner
- what does the dashed backedge do?
- why does every probe have a connection from a single handle?
Feel free to use and adapt the above example.
Further it should be clarified
_cell
probes are vector probes, everything is a scalar- how the locations for
_cell
probes are made handle
associates has(cell, num of probe on cell)
- asking for a nonexistent probe
(23, 42)
will get something empty not an error - even if you -- in Python at least -- probe on locsets with cardinality > 1 you'll get one location only (BUG?)
- thus it's more accurate
cable_probe_XYZ(location)
instead oflocset
- no user callbacks in Python
I had to refresh my memory a bit.
Can the further clarifications wait for a second PR (in the interest of not letting this one slide any further than it has)? |
I'd love to make it correct on the first possible moment, since we risk users coming up with misconceptions otherwise. probe(locset('(terminal)'), membrane_voltage) does not in fact return a list of voltages at all terminal points, but rather at one of those points. Could you confirm this behaviour? Just tweaking the single_cell_recipe example should be sufficient.
The point here is that rather than returning a |
Let's start at
The other direction: Now, the term Questions:
|
There two levels to this:
We need to document clearly both these points and think whether we want to change 2. |
Another comment regarding #1929: Spikes are not recorded using the mechanism of probing. |
Thanks for clarifying that So, after checking, the nonobvious difference between a
We can call it a bug. I've update #1929 to correct this. |
Updated, see https://arbor--1898.org.readthedocs.build/en/1898/python/probe_sample.html for the new diagram. I also added a definition for I think it's now clear what the difference is, and #1929 will remove that difference, as soon as I get back. Please do merge in the meantime :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last duplicated concept: probe (arbor.probe
) and probe
(element of probeset
)
Here's a script to demonstrate the difference. import arbor
tree = arbor.segment_tree()
p = tree.append(arbor.mnpos, arbor.mpoint(-3, 0, 0, 3), arbor.mpoint(3, 0, 0, 3), tag=1)
tree.append(p, arbor.mpoint(3, 0, 0, 3), arbor.mpoint(-3, 0, 0, 3), tag=2)
tree.append(p, arbor.mpoint(3, 0, 0, 3), arbor.mpoint(-3, 0, 0, 3), tag=2)
decor = (
arbor.decor()
.set_property(Vm=-40)
.paint('"soma"', arbor.density("hh"))
.place('"midpoint"', arbor.iclamp(10, 2, 0.8), "iclamp"))
cell = arbor.cable_cell(tree, labels, decor)
class single_recipe(arbor.recipe):
def __init__(self):
arbor.recipe.__init__(self)
def num_cells(self):
return 1
def cell_kind(self, gid):
return arbor.cell_kind.cable
def cell_description(self, gid):
return cell
def probes(self, gid):
return [arbor.cable_probe_membrane_voltage('(location 0 0.5)'),
arbor.cable_probe_membrane_voltage_cell(),
arbor.cable_probe_membrane_voltage('(join (location 0 0) (location 0 1))'),
]
# (4.6) Override the global_properties method
def global_properties(self, kind):
return arbor.neuron_cable_properties()
recipe = single_recipe()
sim = arbor.simulation(recipe)
handles = [sim.sample((0, n), arbor.regular_schedule(0.1))
for n in range(3) ]
sim.run(tfinal=1)
for hd in handles:
print("Handle", hd)
for d, m in sim.samples(hd):
print(" * Meta:", m)
print(" * Payload:", d.shape)
|
Thanks @thorstenhater, that's almost the script I have here :) I'll add it as an example, but I'll remove it at soon as the shape of record outputs is made consistent between probeset and vector probes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
probe id
andprobe id
toprobeset id
andprobeset_id
to clarify that they are associated to a (loc)set.Fixes #1877, #1484