From: Doug Sutherland (wearable_at_earthlink.net)
Date: 2002-03-04 01:25:36
Regarding files and file formats, a few comments ...
Databases are useful when you have a large amount of data
that you want to retrieve randomly by a key (i.e. from the
middle of the data set somewhere), where you want to do
queries based on certain search criteria, or for dynamic
aggregations or statistics. Almost all of our data will
be the sensor data, and it doesn't make sense to store
that in a database. All of the other data will be just
configuration info, and a small amount of data. I don't
see any justification for using a database. The only
benefit it will yield is ease of programming, and that
is not a good reason to use a database. I think that we
can do everything we need in flat files.
Another reason to avoid databases is to keep the system
small and manageable. We already have two additional
packages (FFTW and SDL) and I think we should keep the
add-ons to a minimum, and keep the executable code small
so it performs well even on older systems.
I don't think it matters whether there is a separate file
per channel or all one file. If different devices have
different sampling rates, then there will be a different
number of records for each, which is no problem for a
system to deal with as long as we have a unique identifier
for each channel. It probably makes sense to have a config
file (updated via GUI) that specifies which channels are
for which type of data (EEG/ECG/GSR/etc) and what the
sampling rates will be.
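
A minimal sketch of what such a config file and its reader
could look like. The one-line-per-channel layout, the field
order, the sample rates and the name parse_channel_config
are all assumptions made for illustration, not an agreed
format.

```python
# Hypothetical flat-file layout (an assumption, not a defined format):
#   <channel id> <signal type> <sampling rate in Hz>
# Anything after '#' is a comment. The rates are made-up examples.
SAMPLE_CONFIG = """\
# id  type  rate_hz
1     EEG   256
2     EEG   256
3     ECG   128
4     GSR   32
"""


def parse_channel_config(text):
    """Return {channel_id: (signal_type, rate_hz)} parsed from config text."""
    channels = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        channel_id = int(fields[0])
        signal_type = fields[1]
        rate_hz = int(fields[2])
        if channel_id in channels:
            # The channel id is the unique key, so duplicates are an error.
            raise ValueError(f"line {lineno}: duplicate channel id {channel_id}")
        channels[channel_id] = (signal_type, rate_hz)
    return channels


channels = parse_channel_config(SAMPLE_CONFIG)
```

Keying everything on the channel id means nothing is tied to
a position: channel 1 is EEG here only because the file says
so.
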
As was already mentioned, there is no need to go low
level on file access: the OS already takes care of it,
and it's buffered too. Even at the max capacity of a
115.2k stream of 6 or 8 channels, we are not talking
about huge amounts of I/O here. I'm pretty sure that standard
file access methods will be fine for this.
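
A rough back-of-the-envelope check of that claim. It assumes
that "115.2k" means a 115200 bit/s serial link with 8N1
framing (10 bits on the wire per data byte); both are
assumptions on my part, and the real payload rate would be
lower once packet overhead is counted.

```python
BAUD = 115_200               # assumed: bits per second on the serial link
BITS_PER_BYTE_ON_WIRE = 10   # assumed 8N1 framing: start + 8 data + stop

# Upper bound on the data rate, ignoring any packet headers.
bytes_per_second = BAUD // BITS_PER_BYTE_ON_WIRE
bytes_per_hour = bytes_per_second * 3600
mib_per_hour = bytes_per_hour / (1024 * 1024)

print(f"{bytes_per_second} bytes/s, {bytes_per_hour} bytes/hour")
print(f"about {mib_per_hour:.1f} MiB per hour of recording")
```

That works out to at most 11520 bytes per second, or about
41 MB per hour, which buffered standard file I/O handles
easily.
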
I think that we need to allow for multiple user setups
(configurations) and accommodate saving of configuration
and session data by user. One easy way to do this is
to have the system create a subdirectory for each user
and place the config and data files there. I think all
config can be done in simple flat files, since they will be
small files. We will probably end up with one global
system configuration file, and one or more config
files per user (stored in subdirectories).
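
One way the per-user layout could be sketched. The file names
system.cfg and user.cfg and the helper names are hypothetical,
chosen only to show the idea of one global file plus one
subdirectory per user.

```python
import tempfile
from pathlib import Path

GLOBAL_CONFIG_NAME = "system.cfg"  # hypothetical file name
USER_CONFIG_NAME = "user.cfg"      # hypothetical file name


def user_dir(root, username):
    """Create (if needed) and return the subdirectory for one user."""
    # Reject names that would escape the root directory.
    if (not username or username in (".", "..")
            or any(sep in username for sep in "/\\")):
        raise ValueError(f"unsafe user name: {username!r}")
    path = Path(root) / username
    path.mkdir(parents=True, exist_ok=True)
    return path


def config_paths(root, username):
    """Return (global config path, per-user config path)."""
    return (Path(root) / GLOBAL_CONFIG_NAME,
            user_dir(root, username) / USER_CONFIG_NAME)


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as root:
    global_cfg, user_cfg = config_paths(root, "doug")
    user_cfg.write_text("# per-user settings go here\n")
    print(global_cfg.name, user_cfg.parent.name, user_cfg.name)
```

Session data files would go in the same per-user directory,
next to that user's config.
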
Regarding data formats, we certainly DO need to define
these; we can't even store things in RAM if we don't!
We need to be careful in definition and make things
flexible (i.e. don't force channel 1 to be EEG), but we
will need to decide on formats. Almost all languages
force you to decide on integer or float or double or
string formats. Even just to read and write files we
will need type definitions. In theory we could make
arbitrary types (some systems do this), but it's not
worth it: we would then need a whole bunch of
conversion routines to do processing.
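
To illustrate what deciding on types buys us, here is a
sketch of a fixed-size sample record written and read with
Python's struct module. The record layout (channel id,
sample index, 16-bit value, little-endian) is purely an
assumption for the example, not a proposed format.

```python
import struct

# Hypothetical record layout, little-endian, no padding:
#   B  channel id    (unsigned 8-bit)
#   I  sample index  (unsigned 32-bit)
#   h  sample value  (signed 16-bit)
RECORD = struct.Struct("<BIh")  # 7 bytes per record


def pack_sample(channel_id, index, value):
    """Encode one sample as a fixed-size binary record."""
    return RECORD.pack(channel_id, index, value)


def unpack_samples(data):
    """Decode a byte string holding whole records into tuples."""
    return list(RECORD.iter_unpack(data))


# Channels with different rates can share one file, because every
# record carries its own channel id.
blob = pack_sample(1, 0, -512) + pack_sample(3, 0, 1000)
samples = unpack_samples(blob)
```

With a fixed layout like this, reading a file back is one
unpack call and no conversion routines are needed.
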
I don't think that the format of the configuration files
is all that important, as long as we store the right
things and allow the right associations between different
records. The most important thing to decide on is unique
identifiers (i.e. keys).
For the raw data streams, it might make sense to adopt an
existing standard, thus allowing more interoperability
with other software, both EEG software and statistical
analysis tools. I think we should think pretty hard about
the raw data format before going very far.
Namaste,
Doug