Re: [buildcheapeeg] Managing a network of nodes and data streams in one thread

From: Dave (dfisher_at_pophost.com)
Date: 2002-03-28 21:17:51


On Wed, 27 Mar 2002 23:35:35 +0000, Jim Peters wrote:

>> In OutPort, you have one single 'val' and a 'cnt'. What does 'cnt'
>> refer to? Since 'val' is not an array or pointer, I am not sure how
>> 'cnt' is being used.
>
>Okay, I'm working on the idea that there is no buffering between the
>parts of the system. So we only ever need to store a single value --
>the current value. Since different data streams might go at different
>rates, we also need some way to indicate whether the 'val' value is a
>new one, or the same one we saw last time around. This is why I'm
>having a 'cnt' counter value, which changes every time a new value
>goes into 'val'.

And then just have this value wrap back to zero when the integer maxes out
(INT_MAX, or UINT_MAX if unsigned)? I am a bit concerned about doing this
unbuffered, and would want to throw together some tests to see what happens
when we scale it upwards by adding more and more process() calls. As long as
the combined process() calls are faster than the rate at which data is
streaming, all would be fine. I suppose that would hold true for any amount of
processing we did, so buffered or unbuffered, we are going to have to handle
the data faster than it comes in. I dunno; it will take a few tests to find
out if this is an issue or not.
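For what it's worth, the wrap-around part may be a non-issue if 'cnt' is unsigned: readers only ever compare for equality, so the wrap is harmless unless exactly 2^32 writes happen between two reads, which can't occur in a one-pass-per-sample loop. A minimal sketch (names are my own, not a settled API):

```cpp
#include <cassert>

// Sketch only: an OutPort with an unsigned change counter.  Readers
// keep their own copy of 'cnt' and detect a new sample by inequality,
// so wrapping past UINT_MAX back to zero does not break detection.
struct OutPort {
    double val;      // current sample value
    unsigned cnt;    // bumped every time a new value goes into 'val'
};

struct Reader {
    unsigned last_cnt;               // what we saw last time around
    bool fresh(const OutPort &p) {
        if (p.cnt == last_cnt)
            return false;            // same sample as before
        last_cnt = p.cnt;            // remember the new counter value
        return true;
    }
};
```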

>Anyone watching an OutPort keeps a copy of 'cnt' from the last time
>they looked. At some later point, if 'cnt' has changed since the last
>time they looked, then there must be a new value there to process.
>
>> What is 'nn' -- the next spot on the stack to push a sample value?
>
>No, there is no stack. Each 'nn' is a different output stream. The
>whole idea was that each 'Node' would have a set of input streams, and
>a set of output streams.

What I think you have, then, is a variation on the MVC paradigm. In this case,
the "model" is your Node-supervisory code which knows about the data and all
the "views" attached to the data. The attached observer views and their
update() calls would be akin to your linked Nodes with their process() calls.
However, MVC uses a notification system to tell the views when new
data is available. And, as I understand it, it is then up to each view to
call the model to actually get the updated data. This is different from your
idea, where the model (i.e., node-supervisor) would be responsible for calling
the views' update() methods directly.

But, since no one else is jumping in to help flesh out the use of MVC in the
context of this project and my previous MVC questions have gone unanswered, I'm
going to have to let it drop.

>> Also, am I right in assuming that the Sensor class would inherit
>> Node since these are the classes interested in sending data?
>
>I really don't know about the best way to set up an inheritance graph.
>I would guess that you can say "a Sensor is a Node", so that probably
>means it should inherit ... !?

I don't know.... because if we say that, then the definition of a Sensor
changes in that a sensor object now also takes on the meaning and function that
is in the Device class. Perhaps it is more accurate to say that any object
which produces or receives real time sample data inherits the Node class.
Thus, the BioDevice class would inherit Node. It would be up to BioDevice to
add the correct number of out[nn] data streams.

>> If that is the case, is it necessary for every Node to have an
>> InPort link, or can that be null to indicate that there is no input
>> link?
>
>Well, maybe this needs to be expressed differently in C++. The idea
>was that you could have an array of zero InPorts if you don't actually
>need any. I don't know about how to do arrays 'nicely' in C++, but in
>C, it could all be done like this:

Arrays can be done in the same way as in C, either directly allocated using
brackets [], or via a pointer to allocated memory. Or, you can let an STL
template manage the array using lists, vectors, etc., depending on how you need
access to the array. I have a better understanding now on what you were
writing in Java (and why you mentioned that classnames were pointers). It
could look like this in C++:

#include <vector>

class OutPort {
public:
    double val;     // current sample value
    int cnt;        // changes every time a new value goes into 'val'
};

class InPort {
public:
    OutPort *link;  // the output stream this input watches
    int cnt;        // last 'cnt' seen on the linked OutPort
};

class Node {
public:
    std::vector<InPort> in;    // by value, so in[0].link works as in
    std::vector<OutPort> out;  // your wiring example
    virtual ~Node() {}
    virtual void process() = 0;
};

Gotta admit that Java's expression of the array was a whole lot nicer.

>> Can you give some concrete examples of what that input/output
>> linkage would look like using sensor data and/or dsp processing?
>> Could you give one or more real life case examples to help me ground
>> my understanding?
>
>Okay, let's say we have a Sensor that is generating two channels of
>data, a FilterBank that converts a single input stream into a set of
>output streams (16 in this example), and a 'Christmas tree' display
>module which accepts two sets of 16 inputs to display a double
>bar-graph up the screen.

Ok; thanks--your description is very helpful.

> Sensor *dev= sensor_new("/dev/ttyS1", "prograph/9600", ... etc ...);

I think that the above should not be a "Sensor," but instead "Device" -- or
more pointedly, BioDevice. This might simply be a difference in the use in
terminology, but I think it is important to clear up. The physical correlate
of a BioSensor is an electrode/wire. Thus, a BioDevice will have one or more
BioSensors. When you say above that "a Sensor is generating two channels of
data," I think that should really read as "a BioDevice is generating two
channels of data." This makes a difference in the way we use our BioDevice and
BioSensor objects internally. It also helps keep behavior and function
separate based on what we want to interact with -- a BioDevice or a BioSensor.

> FBank *fb1= fbank_new(16, 1.0, 16.0); // 16 filters between 1Hz and 16Hz
> FBank *fb2= fbank_new(16, 1.0, 16.0); // 16 filters between 1Hz and 16Hz
> XmasDisp *xmas= xmas_new(16, ... display details, whatever ...);

What I like about your plan is the patch-cord kind of arrangement of the
objects, where you can daisy-chain inputs and outputs together in a variety of
configurations. This could keep data flows very flexible based on what may be
needed later on down the line in terms of biofeedback protocols. What I would
like to see is it more situated within the BioDevice/BioSensor framework. This
would alleviate the need for node-supervisory code being in the node code (as
globals or whatever), and keep the flow more natural in terms of the system.

For example, we've already touched on the issue of what kicks the whole network
into motion. The natural starting place for that kick-off would be when data
is received from a sensor. This is controlled by the BioDevice object, which
reads the raw data stream in from the port/device and is able to separate it
out into individual sensor streams for our use (or, plug one byte at a time
into a receiving node as per your plan -- I am using the word "stream" here
loosely). Otherwise, where (or who) would start the traversal of the node/port
tree? If I am understanding correctly, that traversal will only occur when
there is data ready from the external device for a particular port in the node
list. Are there other situations I'm not thinking of?

> // Connect filterbank inputs to the first two 'dev' output channels
> fb1->in[0].link= &dev->out[0];
> fb2->in[0].link= &dev->out[1];
>
> // Connect display inputs to filterbank outputs -- pretend that
> // ordering of display inputs is top-bottom, L/R/L/R, i.e. interleaved
> // and upside-down compared to the filterbank outputs
> for (int a= 0; a<32; a++) {
> FBank *fb= (a&1) ? fb2 : fb1;
> int n= 15 - a/2;
> xmas->in[a].link= &fb->out[n];
> }
>
>That's it. When this network comes to be executed, the Node handling
>code would do this sequence of calls for every input sample:
>
> dev->process();
> fb1->process();
> fb2->process();
> xmas->process();

Ok... except dev, fb1, fb2, and xmas would really be pointers kept in a list
which would then be controlled by the node-supervisory code, right? The more I
think about this, the more it seems like there should be a Network class to
handle traversal of the nodes as well as adding or removing nodes from the
list, or that traversal should only be done from the BioDevice class when new
data arrives for a particular sensor.
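A minimal sketch of what I mean by a Network class (purely hypothetical names, and ignoring the who-kicks-it-off question for the moment):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch: a Network owns an ordered list of Node pointers
// and runs one full pass per incoming sample.  Ordering is the caller's
// responsibility (producers first), matching the dev -> fb1 -> fb2 ->
// xmas sequence in Jim's example.
class Node {
public:
    virtual ~Node() {}
    virtual void process() = 0;
};

class Network {
    std::vector<Node*> nodes;   // traversal order: producers before consumers
public:
    void add(Node *n) { nodes.push_back(n); }
    void remove(Node *n) {
        nodes.erase(std::remove(nodes.begin(), nodes.end(), n), nodes.end());
    }
    void run_once() {           // one full traversal for one input sample
        for (std::size_t i = 0; i < nodes.size(); ++i)
            nodes[i]->process();
    }
};
```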

>For display Nodes, I suggest that they keep a data structure somewhere
>containing the current data to display, and on the ->process() call,
>they update that data and then set a flag for some other thread to
>actually do the redraw. That way it doesn't matter if the front-end
>can only update for 1 in every 20 samples. With a little bit of care,
>this makes the whole thing rock-solid, so it can work 100% reliably
>even when there are large redraws or whatever to slow down the
>front-end.

Hmmm... that sounds feasible. That type of arrangement (setting an event flag
for another thread context) could be used for any processor that was going to
slow things down.
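A rough sketch of that flag arrangement (using std::atomic as modern shorthand; a real version would use whatever threading primitives we settle on, and all names here are hypothetical):

```cpp
#include <atomic>

// process() publishes the latest display data and raises a flag; the
// GUI thread clears the flag when it redraws.  Missed samples are fine:
// the GUI only ever draws the most recent snapshot, so a slow redraw
// never backs up the processing network.
struct DisplayShared {
    double bars[16];                  // current data to display
    std::atomic<bool> dirty{false};   // set by process(), cleared by GUI
};

void on_process(DisplayShared &d, const double *vals) {
    for (int i = 0; i < 16; ++i)
        d.bars[i] = vals[i];
    d.dirty.store(true, std::memory_order_release);  // publish the snapshot
}

bool gui_should_redraw(DisplayShared &d) {
    // exchange() both reads and clears the flag in one step
    return d.dirty.exchange(false, std::memory_order_acquire);
}
```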

>> >Node-supervisor code somewhere can keep a list of all Nodes and scan
>> >them to see which depend on which others through their InPort 'link'
>>
>> Would the Node-supervisor be a part of BioDevice, in that it sits
>> waiting for data from the device, and then scans the node list to
>> see who gets what?
>
>I'm seeing the Node supervisor code as part of the Node class,
>represented by a few class-global functions and variables (I can't
>remember whether C++ has this -- it must do, surely. Are they
>'static' ?).

Yes, static variables still exist, and, if declared static in a class, will be
shared amongst all objects instantiated from that class. I see how you are
thinking about this, and that would be a great way to keep a list of the entire
network tree. As you mentioned in another message, this would only work,
though, for one thread since you would not want to have some object adding a
link to the list while another one was busy traversing the tree. If we do need
to add some synchronization mechanisms, things like semaphores offer low
overhead for this.
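As a sketch of the static-member idea (hypothetical names, and single-threaded only, as noted -- there is no locking around the list):

```cpp
#include <cstddef>
#include <vector>

// Jim's "supervisor as part of the Node class" idea: every Node
// registers itself on construction, so the static list always holds
// the whole network.  Static members are shared by all instances.
class Node {
    static std::vector<Node*> all;   // the entire network, creation order
public:
    Node() { all.push_back(this); }
    virtual ~Node() {                // unregister on destruction
        for (std::size_t i = 0; i < all.size(); ++i)
            if (all[i] == this) { all.erase(all.begin() + i); break; }
    }
    virtual void process() = 0;
    static void run_all() {          // one traversal of the whole network
        for (std::size_t i = 0; i < all.size(); ++i)
            all[i]->process();
    }
    static std::size_t count() { return all.size(); }
};

// static members need exactly one out-of-class definition
std::vector<Node*> Node::all;
```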

>I know what you're saying -- how does the network get kicked off in
>the first place ? This is another question, separate in some ways
>from the question of how to connect up the network itself.

>Here is one approach: we could say that anything that has outputs but
>no inputs must obviously create data from some external source, in
>which case it always gets called to trigger off a new cycle. Nodes
>that have no inputs could be written in such a way that they block
>until they have some data for us. Once the ->process() call returns,
>then the rest of the network runs through with the new data, until
>execution again returns to input device ->process() call to wait for
>some more.

Would we ever have more than one node that is a "producer only" node (meaning a
node which only has a set of outputs, and no inputs because it receives data
from an external source)? Based on prior discussions, it seemed like others
were not in favor of building in support for multiple devices, thus I am not
sure under what conditions there would be a reason to have more than one
producer node in the network. I would not mind having the flexibility of
adding such capability, but am thinking that this will affect the answer to the
question of who kicks off the traversal of the network when new data arrives.

>Certainly a process() method could be used to send output to a pipe or
>socket, so long as there isn't much chance of it blocking (or else it
>would hold other things up). If we were going to allow several Nodes
>to read from different input devices, e.g. several sockets and/or
>serial streams, then we would need some additional code that does a
>big select(), before triggering whichever Node has waiting data.

Right now the way I see it is that the whole process will block until data is
received from the external device (in BioDevice). I can see that there would
be ways to extend this design later on if I or someone else really wants
multiple device support.
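If multiple-device support ever did get added, the "big select()" Jim mentions might look roughly like this (POSIX-only sketch, names hypothetical):

```cpp
#include <cstddef>
#include <sys/select.h>
#include <unistd.h>
#include <vector>

// Wait on a set of file descriptors and return the first one with data
// ready, so the supervisor can trigger the matching producer Node.
// Blocks indefinitely (no timeout); single-device builds never need
// this path, since the one BioDevice read can block on its own.
int wait_for_input(const std::vector<int> &fds) {
    fd_set set;
    FD_ZERO(&set);
    int maxfd = -1;
    for (std::size_t i = 0; i < fds.size(); ++i) {
        FD_SET(fds[i], &set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    if (select(maxfd + 1, &set, 0, 0, 0) <= 0)
        return -1;                   // error or interrupted
    for (std::size_t i = 0; i < fds.size(); ++i)
        if (FD_ISSET(fds[i], &set))
            return fds[i];           // this device has waiting data
    return -1;
}
```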

>Well, that's just my personal frustration with 'object-oriented'
>coding (or perhaps 'wrapper-oriented' coding), but never mind.
>
>Anyway, I'm not going to argue about this (apart from grumbling a
>bit), and I will accept the use of wrapper functions if it will make
>things more conventional and predictable for C++ programmers.

You grumble so nicely, though. :) I hear everything you say, and yes, OO can
certainly add some "wordy" overhead through abstraction layers. But it also
makes it a bit more readable and descriptive. For example:

BioDeviceProComp procomp;
FilterBank filterbank1;

filterbank1.AddInputPort( procomp.GetOutputPort(chan) )

says a bit more than

fb1->in[0].link= &dev->out[0]

and gives us the ability to make changes to the add/get methods if need be.
And, true, they may always remain just one line of code which could compile
down to an inline statement. It's a trade-off, to be sure.
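To make the comparison concrete, the wrapper versions might look something like this (all names hypothetical, not a settled API):

```cpp
#include <cstddef>
#include <vector>

// The methods do exactly what the raw 'in[0].link=' assignment does,
// but give us one place to later add bounds checks or logging.
struct OutPort { double val; int cnt; };
struct InPort  { OutPort *link; int cnt; };

class Node {
protected:
    std::vector<InPort>  in;
    std::vector<OutPort> out;
public:
    int AddOutputPort() {                // returns the new channel number
        OutPort p = { 0.0, 0 };
        out.push_back(p);
        return (int)out.size() - 1;
    }
    // note: adding ports can reallocate the vectors, so wire links up
    // only after all ports exist
    OutPort *GetOutputPort(int chan) { return &out[chan]; }
    InPort  *GetInputPort(int n)     { return &in[n]; }
    void AddInputPort(OutPort *src) {
        InPort p;
        p.link = src;
        p.cnt  = src ? src->cnt : 0;     // start in sync with the source
        in.push_back(p);
    }
};
```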

What's nice is having the best of both worlds -- the abstraction for separation
of the implementation details and ease-of-use and the source so that we can
still lift the hood and check to see just how the engine is designed.
Fortunately, we have both here.

Dave.



This archive was generated by hypermail 2.1.4 : 2002-07-27 12:28:43 BST