Re: [buildcheapeeg] EDF and DDF file formats

From: Jim Peters (jim_at_uazu.net)
Date: 2002-03-13 00:13:13


Sar Saloth wrote:
> As an important note, I was not thinking of implementing a full XML parser
> with a complex nested format. It is possible to maintain XML syntax from
> an essentially "flat" structure. I don't know the real software term for
> flat, but I mean something that wouldn't have nested records (nodes?).

I know what you mean. But it is better not to call it XML, then,
because XML has loads of requirements that you aren't going to be
keeping to. For example, if the character encoding were UTF-8, you
couldn't just dump binary data into the file, because arbitrary bytes
are not legal UTF-8. Officially, you'd need to encode it in base64 or
something similar. Really you're talking about a kind of improvised
tagged ASCII format.
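
To illustrate the encoding point, here is a minimal Python sketch. The
sample values and the little-endian 16-bit packing are assumptions made
up for the example, not part of any proposed format. It shows that raw
sample bytes are not necessarily valid UTF-8, while the base64 form is
plain ASCII at the cost of extra space.

```python
import base64
import struct

# Hypothetical 16-bit samples, packed little-endian (an assumption).
samples = [-1, 300, -32768, 32767]
raw = struct.pack("<4h", *samples)

def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

encoded = base64.b64encode(raw)

print(is_valid_utf8(raw))       # raw bytes contain 0xFF, never legal in UTF-8
print(is_valid_utf8(encoded))   # base64 output is plain ASCII
print(len(raw), len(encoded))   # base64 costs roughly a third more space
```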

> I vowed in the future to leave the low-level bit-twiddling to
> low-level people and let the programmers have simple human readable
> text commands.

I agree, I think that is a good idea.

> > >- Someone will have to write a class or library to read/write/seek
> > > around this kind of file.
>
> ?Standard? methods for working with XML do that. It is essentially
> traversing a tree.

Only if you are willing to drag it all into memory, though, surely?
I think there are ways to stream XML data without pulling it all into
memory at once, but we would still need a lot more code to handle a
true XML-encoded data stream than for a simpler binary format.

If it was just the descriptive information stored in XML, rather than
the raw samples, then it wouldn't be such a problem to load it all
into memory. (e.g. if we have two files: a descriptive XML file + a
binary data file)
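
A rough sketch of the two-file idea in Python follows. The element and
attribute names in the header, the 16-bit little-endian sample layout,
and the helper read_frame() are all invented for illustration; an
in-memory buffer stands in for the separate binary file. The small
header is parsed whole, while the data side only needs a seek and a
fixed-size read.

```python
import io
import struct
import xml.etree.ElementTree as ET

# Hypothetical descriptive header; names and attributes are invented.
HEADER_XML = """<recording rate="256" format="int16le">
  <channel name="Fp1"/>
  <channel name="Fp2"/>
</recording>"""

def read_frame(data, index, n_channels):
    """Seek straight to frame `index` and unpack one sample per channel."""
    frame_size = 2 * n_channels          # 2 bytes per 16-bit sample
    data.seek(index * frame_size)
    return struct.unpack("<%dh" % n_channels, data.read(frame_size))

# The small descriptive part is cheap to load into memory in one go.
root = ET.fromstring(HEADER_XML)
names = [ch.get("name") for ch in root.findall("channel")]
rate = int(root.get("rate"))

# Stand-in for the binary data file: three frames of two channels each.
data = io.BytesIO(struct.pack("<6h", 10, 20, 11, 21, 12, 22))

print(names, rate, read_frame(data, 2, len(names)))
```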

> One of the simplifications for this problem is that usually only one
> piece of hardware collects the high data rate information and the
> other pieces can collect the low rate information. In this case,
> resampling won't cause much error in the slow signals as long as the
> resynchronizing adds only the jitter equivalent to the fast sample
> rate and NOT the slow one.

Neither EDF nor the binary format that I suggested can handle this --
they both assume that the sampling rates are locked in a fixed ratio.
Even if the jitter was okay, the drift would cause problems, because at
some point you'd have either one sample too many or one sample too
few to fill in the data chunk, if I understand you correctly.
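
The drift problem can be put into numbers with a small Python sketch.
The figures are assumptions chosen for easy arithmetic: a slow device
nominally at 10 Hz whose clock runs 1000 ppm fast, with one-second
chunks cut by the other device's clock. samples_in_chunk() is a
hypothetical helper that counts how many sample periods complete
within each chunk.

```python
from fractions import Fraction
from math import floor

def samples_in_chunk(actual_hz, chunk_index, chunk_seconds=1):
    """Count sample periods completed during one chunk of the recording."""
    start = chunk_index * chunk_seconds
    end = start + chunk_seconds
    return floor(end * actual_hz) - floor(start * actual_hz)

nominal_hz = 10
# Assumed clock error of 1000 ppm, giving an actual rate of 10.01 Hz.
actual_hz = Fraction(nominal_hz) * (1 + Fraction(1000, 1_000_000))

# Find the first chunk that does not hold exactly `nominal_hz` samples.
odd = next(i for i in range(10_000)
           if samples_in_chunk(actual_hz, i) != nominal_hz)

print(odd, samples_in_chunk(actual_hz, odd))
```

With these assumed numbers the first 99 chunks fill correctly and chunk
99 ends up with 11 samples instead of 10, so a fixed-ratio chunk layout
breaks after about 100 seconds.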

I think there has to be synchronisation between the sampling rates, or
else we have to think of a completely different approach, and much
more complex analysis and storage. (Actually, you say exactly this
later on, so I think we are agreeing)

> 2. If a sample is bad, either the suggested method or how about this
> suggestion? Choose a number outside of the maximum or minimum
> binary level to signify a bad sample. Of course that means that with
> a 16 bit converter you would have to not use 1 or two of the 65536
> codes. I was thinking of leaving two unused codes, one for a bad
> sample and one for lead-off. Or would Lead-off be better handled by
> the annotation stream?

This causes problems if we're converting from existing stored data
that uses the full 16-bit range. Did you see my suggestion that we
store a separate error channel of 1 bit per sample to keep error
flags? This wouldn't be per-channel, though.
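
A 1-bit-per-sample error channel could be packed as in the Python
sketch below. The bit order (most significant bit first) and the
function names are assumptions for illustration only. One flag covers
a whole sample time, not an individual channel.

```python
def pack_error_flags(flags):
    """Pack booleans into bytes, 8 flags per byte, most significant bit first."""
    out = bytearray((len(flags) + 7) // 8)
    for i, bad in enumerate(flags):
        if bad:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)

def is_bad(packed, index):
    """Return True if the sample at `index` was flagged as bad."""
    return bool(packed[index // 8] & (0x80 >> (index % 8)))

# Ten sample times, with errors at positions 3 and 9.
flags = [False] * 10
flags[3] = True
flags[9] = True
packed = pack_error_flags(flags)

print(packed.hex(), is_bad(packed, 3), is_bad(packed, 4))
```

Ten flags fit into two bytes here, so the overhead next to 16-bit
samples is small, and the sample data keeps its full range.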

For floating point data, it is possible to store NaN (not-a-number) as
a value, which would do what you suggest.
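
For the floating-point case, a short Python sketch, assuming 32-bit
little-endian floats as the storage format (an assumption for the
example). NaN survives the round trip through the binary form, and it
has to be detected with math.isnan() because NaN never compares equal
to anything, including itself.

```python
import math
import struct

# The second sample is marked bad by storing NaN in its place.
samples = [0.5, float("nan"), -1.25]
raw = struct.pack("<3f", *samples)

decoded = struct.unpack("<3f", raw)
bad = [i for i, value in enumerate(decoded) if math.isnan(value)]

print(bad)
```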

> Is anything else reasonable? Does loss of synch mean that the
> serial port couldn't keep up? I have been put under the impression
> that modern PCs should be able to handle 115 kbaud. If the loss is a
> very rare event, then the data could reasonably be considered
> corrupt. If the loss of the data is frequent, don't we have a
> reliability problem?

I suppose, as you say, we could take any sync loss as a complete break
in the signal, and do something like DDF, and start a new segment of
the file, or even start a new file altogether. That would save us
storing error information. It is an option.

I think the important thing, though, is that we do *something* that
will alert the user -- whether that is breaking the file at that
point, or flagging an error in an error channel. The worst thing
would be to just insert zeros and hide the error.
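
The segment-breaking option might look like the Python sketch below.
It assumes each packet carries a wrapping sequence counter, which is
an assumption for the example and not something settled in this
thread. Any jump in the counter starts a new segment instead of
padding the gap with zeros.

```python
def split_on_sync_loss(packets, modulus=256):
    """Group (counter, sample) packets into segments, breaking at counter gaps."""
    segments = []
    expected = None
    for counter, sample in packets:
        if expected is None or counter != expected:
            segments.append([])          # first packet or gap: new segment
        segments[-1].append(sample)
        expected = (counter + 1) % modulus
    return segments

# Counter wraps from 255 to 0 normally; packets 1 and 2 were lost.
packets = [(254, 10), (255, 11), (0, 12), (3, 13), (4, 14)]

print(split_on_sync_loss(packets))
```

The gap stays visible as a segment boundary, so nothing downstream
can mistake the missing samples for real data.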

Jim

-- 
Jim Peters (_)/=\~/_(_) jim_at_uazu.net
(_) /=\ ~/_ (_)
Uazú (_) /=\ ~/_ (_) http://
B'ham, UK (_) ____ /=\ ____ ~/_ ____ (_) uazu.net

