opencog / asmoses

MOSES Machine Learning: Meta-Optimizing Semantic Evolutionary Search for the AtomSpace (https://github.com/opencog/atomspace)

Home Page: https://wiki.opencog.org/w/Meta-Optimizing_Semantic_Evolutionary_Search


Efficient Table Representation

ngeiswei opened this issue

The various representations suggested in issues #3, #12 and #14 are great for
reasoning but not so great for efficient calculations. Hence the
following suggestion: represent column values (i.e. the values associated
with each feature) as a list of values living in the feature atom
itself. For instance, assume we have the table

+--+--+--+
|o |f1|f2|
+--+--+--+
|1 |0 |1 |
+--+--+--+
|1 |1 |0 |
+--+--+--+
|0 |0 |0 |
+--+--+--+

The values of feature f1 would be represented as the list [0,1,0],
attached to f1 via the Atom::setValue method. The key could be

Node "*-AS-MOSES:SchemaValuesKey-*"

and the ProtoAtom value could be

  1. FloatValue if f1 is numerical
  2. LinkValue if f1 is Boolean, in which case TrueLink or
    FalseLink could be used to represent true and false. An
    alternative would be to implement a BoolValue that directly
    holds C++ boolean values, which would be more efficient.

That representation could be obtained directly from a Table or from
the various existing representations. Since reasoning isn't needed yet,
it could be fine to obtain it directly from the Table.
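
To make the proposal concrete, here is a minimal Python sketch of the column-wise layout, built directly from the example table above. It does not use the AtomSpace API: the `Feature` class, its `set_value`/`get_value` methods and the `columns_from_table` helper are hypothetical stand-ins for a feature atom and Atom::setValue, and a plain string stands in for the key Node. A packed `bytearray` plays the role of the proposed BoolValue, and an `array` of doubles the role of FloatValue.

```python
from array import array

# Key name taken from the issue text; a plain string stands in for the key Node.
VALUES_KEY = "*-AS-MOSES:SchemaValuesKey-*"


class Feature:
    """Hypothetical stand-in for a feature atom: a name plus a key -> value store."""

    def __init__(self, name):
        self.name = name
        self._values = {}

    def set_value(self, key, value):
        # Loosely mirrors the idea of Atom::setValue(key, value).
        self._values[key] = value

    def get_value(self, key):
        return self._values[key]


def columns_from_table(header, rows, boolean=()):
    """Transpose a row-major table into one value column per feature.

    Features named in `boolean` get a packed bytearray (one byte per row),
    all others get an array of doubles.
    """
    features = {}
    for j, name in enumerate(header):
        column = [row[j] for row in rows]
        if name in boolean:
            value = bytearray(1 if v else 0 for v in column)
        else:
            value = array("d", column)
        feature = Feature(name)
        feature.set_value(VALUES_KEY, value)
        features[name] = feature
    return features


# The example table from the issue: output o, then features f1 and f2.
header = ["o", "f1", "f2"]
rows = [[1, 0, 1], [1, 1, 0], [0, 0, 0]]

features = columns_from_table(header, rows, boolean={"o", "f1", "f2"})
f1_column = list(features["f1"].get_value(VALUES_KEY))
print(f1_column)  # [0, 1, 0]
```

Storing each column contiguously means that evaluating a candidate program over all rows becomes a single pass over a few flat arrays, rather than a walk over one atom per cell.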

Another thing we'll want to support is representing duplicated rows
in the same manner that CTable does, but that's for another time and
another issue.
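
As a rough illustration of what handling duplicated rows could look like, the sketch below groups identical input rows and counts the output values seen for each group. This is only an assumption about the general idea behind CTable-style compression, not a description of the actual CTable implementation. The `compress_rows` helper and the two extra rows added to the example table are hypothetical.

```python
from collections import Counter, defaultdict


def compress_rows(rows):
    """Group identical input rows and count the outputs observed for each.

    Each row is assumed to be [output, input_1, input_2, ...].
    """
    compressed = defaultdict(Counter)
    for output, *inputs in rows:
        compressed[tuple(inputs)][output] += 1
    return dict(compressed)


# The example table plus two hypothetical extra rows sharing inputs (0, 1).
rows = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 1],
]

ctable = compress_rows(rows)
for inputs, outputs in ctable.items():
    print(inputs, dict(outputs))
```

With this layout a candidate only needs to be evaluated once per distinct input row, and the counts weight its score.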

Implemented in singnet#3

If you get the urge to implement BoolValue, that's OK, I guess. Initially, I wanted to stay minimalist.

Also, it is OK to use the special-purpose key Node "*-AS-MOSES:SchemaValuesKey-*" for now, but in the long run, you will want to make this user-specifiable, maybe even per-feature. That way, you can wire-in data sources from wherever.
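
One way to picture a user-specifiable, per-feature key is a lookup that falls back to the special-purpose key unless a feature has its own. The sketch below is self-contained Python and does not use the AtomSpace API. The `ColumnStore` class, its methods and the `"sensor-feed"` key are all hypothetical names chosen for illustration.

```python
DEFAULT_KEY = "*-AS-MOSES:SchemaValuesKey-*"


class ColumnStore:
    """Hypothetical store mapping (feature, key) pairs to value columns."""

    def __init__(self, default_key=DEFAULT_KEY):
        self.default_key = default_key
        self._key_for = {}  # optional per-feature key overrides
        self._values = {}   # (feature, key) -> list of values

    def use_key(self, feature, key):
        """Register a feature-specific key, e.g. one fed by another data source."""
        self._key_for[feature] = key

    def key_of(self, feature):
        """Return the feature's own key, or the default if none was set."""
        return self._key_for.get(feature, self.default_key)

    def attach(self, feature, column):
        self._values[(feature, self.key_of(feature))] = list(column)

    def column(self, feature):
        return self._values[(feature, self.key_of(feature))]


store = ColumnStore()
store.attach("f1", [0, 1, 0])        # stored under the default key
store.use_key("f2", "sensor-feed")   # hypothetical per-feature key
store.attach("f2", [1, 0, 0])

print(store.key_of("f1"))
print(store.key_of("f2"))
```

Because the learner only ever asks for "the column of this feature", switching a feature to a different data source would amount to changing its key.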

Note that some values are designed to be time-changing...