Efficient Table Representation

Question

ngeiswei opened this issue 6 years ago · comments

The various representations suggested in issues #3, #12 and #14 are great for
reasoning but not so great for efficient calculations, hence the
following suggestion: represent column values (i.e. the values associated
with each feature) as a list of values living in the feature atom
itself. For instance, assume we have the table

+--+--+--+
|o |f1|f2|
+--+--+--+
|1 |0 |1 |
+--+--+--+
|1 |1 |0 |
+--+--+--+
|0 |0 |0 |
+--+--+--+

The values of feature f1 would be represented as the list [0,1,0]
attached to f1 via the Atom::setValue method. The key could be

Node "*-AS-MOSES:SchemaValuesKey-*"

and the ProtoAtom value could be

FloatValue if f1 is numerical
LinkValue if f1 is Boolean, in which case TrueLink and
FalseLink could be used to represent true and false. An
alternative would be to implement a BoolValue that directly holds
C++ boolean values, which would be more efficient.
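
To make the column-wise layout concrete, here is a minimal standalone C++17 sketch. It does not use the AtomSpace API: FloatColumn, BoolColumn, Column, ColumnStore and to_columns are hypothetical names, with the map standing in for "values attached to a feature atom", FloatColumn for a FloatValue, and BoolColumn for the proposed BoolValue.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins, not AtomSpace types:
// FloatColumn mimics a FloatValue, BoolColumn the proposed BoolValue.
using FloatColumn = std::vector<double>;
using BoolColumn = std::vector<bool>;
using Column = std::variant<FloatColumn, BoolColumn>;

// Feature name -> column of values, one entry per table row.
using ColumnStore = std::map<std::string, Column>;

// Convert a row-major Boolean table into one BoolColumn per feature.
ColumnStore to_columns(const std::vector<std::string>& names,
                       const std::vector<std::vector<bool>>& rows)
{
    ColumnStore store;
    for (const std::string& name : names)
        store[name] = BoolColumn{};
    for (const std::vector<bool>& row : rows) {
        assert(row.size() == names.size());  // every row must be full width
        for (std::size_t i = 0; i < names.size(); ++i)
            std::get<BoolColumn>(store[names[i]]).push_back(row[i]);
    }
    return store;
}
```

Feeding the three rows of the example table into to_columns with the names o, f1 and f2 yields the column [0,1,0] for f1, matching the list given above.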

That representation could be obtained directly from a Table or from
the various existing representations. Since reasoning isn't needed yet,
it could be fine to obtain it directly from the Table.

Another thing we'll want to support is representing duplicated rows
in the same manner that CTable does, but that's for another time and
another issue.
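
As a rough illustration of what handling duplicated rows could mean, the standalone C++ sketch below collapses identical rows into a single entry with an occurrence count. This is only an assumption about the intent; it is not a description of how CTable is actually implemented, and Row and count_rows are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical row type: one integer per table column.
using Row = std::vector<int>;

// Collapse identical rows into a single entry with an occurrence count.
std::map<Row, std::size_t> count_rows(const std::vector<Row>& rows)
{
    std::map<Row, std::size_t> counts;
    for (const Row& row : rows)
        ++counts[row];  // a missing entry starts at 0, then is incremented
    return counts;
}
```

With such a map, a calculation can visit each distinct row once and weight it by its count instead of visiting every duplicate.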

Nil Geisweiller · Answer 1 · Mon Sep 17 2018 19:00:09 GMT+0800 (China Standard Time)

Implemented in singnet#3

Linas Vepštas · Answer 2 · Mon Sep 17 2018 19:29:24 GMT+0800 (China Standard Time)

If you get the urge to implement BoolValue, that's OK, I guess. Initially, I wanted to stay minimalist.

Linas Vepštas · Answer 3 · Mon Sep 17 2018 19:33:21 GMT+0800 (China Standard Time)

Also, it is OK to use the special-purpose key Node "*-AS-MOSES:SchemaValuesKey-*" for now, but in the long run, you will want to make this user-specifiable, maybe even per-feature. That way, you can wire-in data sources from wherever.
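
A minimal standalone C++ sketch of that idea, assuming a per-feature override table that falls back to the special-purpose key. Nothing here is AtomSpace API: Feature, KeyOverrides, kDefaultKey and column_of are hypothetical names, and the Feature struct merely stands in for an atom with values stored under keys.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an atom: a name plus values stored under keys.
struct Feature {
    std::string name;
    std::map<std::string, std::vector<double>> values;  // key -> column
};

// Default key, taken from the issue text.
const std::string kDefaultKey = "*-AS-MOSES:SchemaValuesKey-*";

// Maps a feature name to the key its column lives under.
using KeyOverrides = std::map<std::string, std::string>;

// Return the column of `f`, looked up under its per-feature key if one
// is configured, and under the default key otherwise.
const std::vector<double>& column_of(const Feature& f,
                                     const KeyOverrides& overrides)
{
    auto it = overrides.find(f.name);
    const std::string& key =
        (it != overrides.end()) ? it->second : kDefaultKey;
    auto col = f.values.find(key);
    assert(col != f.values.end());  // the column must exist under that key
    return col->second;
}
```

Callers that pass an empty KeyOverrides get today's behaviour, while an entry for a given feature redirects the lookup to whatever key an external data source writes to.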

Note that some values are designed to be time-changing...