Efficient Table Representation
ngeiswei opened this issue
The various representations suggested in issues #3, #12 and #14 are great for
reasoning but not so great for efficient calculations, thus the
following suggestion: Represent column values (i.e. values associated
to each feature) as a list of values living in the atom feature
itself. For instance, assume we have the table
+--+--+--+
|o |f1|f2|
+--+--+--+
|1 |0 |1 |
+--+--+--+
|1 |1 |0 |
+--+--+--+
|0 |0 |0 |
+--+--+--+
The values of feature f1 would be represented as the list [0,1,0]
attached to f1 via the Atom::setValue method. The key could be
Node "*-AS-MOSES:SchemaValuesKey-*" and the ProtoAtom value could be
a FloatValue if f1 is numerical, or a LinkValue if f1 is Boolean, in
which case TrueLink or FalseLink could be used to represent true and
false. An alternative would be to implement a BoolValue that directly
holds C++ boolean values, which would be more efficient.
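To make the column-wise layout concrete, here is a minimal sketch in plain Python. It does not use the AtomSpace API: the Feature class, its set_value/get_value methods and the columns_from_table helper are hypothetical stand-ins that only mimic the idea of attaching a value to an atom under a key, and the key is a plain string rather than a real Node.

```python
from typing import Dict, List, Union

# Hypothetical stand-in for the key node named in the issue; a plain string,
# not a real AtomSpace Node.
SCHEMA_VALUES_KEY = "*-AS-MOSES:SchemaValuesKey-*"

Column = Union[List[float], List[bool]]


class Feature:
    """Toy stand-in for a feature atom that carries key -> value pairs."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._values: Dict[str, Column] = {}

    def set_value(self, key: str, value: Column) -> None:
        # Mimics the idea of Atom::setValue(key, value); not the real API.
        self._values[key] = value

    def get_value(self, key: str) -> Column:
        return self._values[key]


def columns_from_table(header, rows, boolean=False, key=SCHEMA_VALUES_KEY):
    """Transpose a row-major table into one Feature per column."""
    features = {}
    for j, name in enumerate(header):
        raw = [row[j] for row in rows]
        # A float list plays the role of FloatValue, a bool list that of the
        # proposed BoolValue.
        column = [bool(v) for v in raw] if boolean else [float(v) for v in raw]
        feature = Feature(name)
        feature.set_value(key, column)
        features[name] = feature
    return features


# The example table from the issue: columns o, f1, f2.
header = ["o", "f1", "f2"]
rows = [[1, 0, 1], [1, 1, 0], [0, 0, 0]]
features = columns_from_table(header, rows)
print(features["f1"].get_value(SCHEMA_VALUES_KEY))  # [0.0, 1.0, 0.0]
```

Passing boolean=True stores the same columns as lists of Python booleans, which is the closest analogue here to a BoolValue holding native boolean values.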
That representation could be obtained directly from a Table or from
the various existing representations. Since reasoning isn't needed yet
it could be fine to obtain it directly from the Table.
Another thing we'll want to support is to represent duplicated rows
in the same manner that CTable does, but that's for another time and
another issue.
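As a rough illustration of what collapsing duplicated rows could look like, the sketch below groups rows by their input values and counts the outputs observed for each group. This is a simplified reading of the CTable idea, not its actual implementation; the compress_rows helper and the extra example rows are made up for illustration.

```python
from collections import Counter


def compress_rows(rows, output_index=0):
    """Group rows by their input values and count the outputs seen for each."""
    compressed = {}
    for row in rows:
        # Everything except the output column identifies the input row.
        inputs = tuple(v for i, v in enumerate(row) if i != output_index)
        compressed.setdefault(inputs, Counter())[row[output_index]] += 1
    return compressed


# Columns are o, f1, f2 as in the table above; the last two rows are added
# here to create duplicated inputs.
rows = [[1, 0, 1], [1, 1, 0], [0, 0, 0], [1, 0, 1], [0, 0, 1]]
compressed = compress_rows(rows)
for inputs, outputs in compressed.items():
    print(inputs, dict(outputs))
```

Here the input (0, 1) appears three times, twice with output 1 and once with output 0, so it is stored once together with those counts.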
Implemented in singnet#3
If you get the urge to implement BoolValue, that's OK, I guess. Initially, I wanted to stay minimalist.
Also, it is OK to use the special-purpose key Node "*-AS-MOSES:SchemaValuesKey-*" for now, but in the long run, you will want to make this user-specifiable, maybe even per-feature. That way, you can wire in data sources from wherever.
Note that some values are designed to be time-changing...