numenta / htmresearch

Experimental algorithms. Unsupported.


Create SP Mixin

subutai opened this issue · comments

Create a spatial pooler mixin class that monitors a few metrics:

The distribution of column activity (are all columns being used equally?). This is very similar to the active duty cycle.

What is the average overlap for each column before inhibition?

Are we getting good SDRs? For two patterns that have N bits of input overlap, how many bits of overlap do the SDRs have? This could be represented in an N x M overlap count matrix: OverlapCount[i,j] would be the number of times two patterns that had i bits of input overlap produced SDRs with j bits of overlap. We should see a strong diagonal with a gradual drop-off as you move away from it (see the sketch below).

A very good model for this mixin class is TemporalPoolerMonitorMixin, especially for that last metric.
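
For the overlap count matrix, something along these lines could work. This is only a rough sketch, assuming the input patterns and their corresponding SDRs are available as parallel lists of binary numpy arrays; the function name and arguments are illustrative, not an existing API:

import itertools

import numpy


def overlapCountMatrix(inputPatterns, sdrs, maxInputOverlap, maxSDROverlap):
  """
  Return a matrix whose entry [i, j] counts the pattern pairs whose inputs
  share i bits and whose SDRs share j bits (the OverlapCount described above).
  """
  overlapCount = numpy.zeros((maxInputOverlap + 1, maxSDROverlap + 1))

  # Compare every pair of patterns once
  for a, b in itertools.combinations(range(len(inputPatterns)), 2):
    inputOverlap = numpy.count_nonzero(inputPatterns[a] * inputPatterns[b])
    sdrOverlap = numpy.count_nonzero(sdrs[a] * sdrs[b])
    overlapCount[inputOverlap, sdrOverlap] += 1

  return overlapCount

We would then look for the strong diagonal described above, with counts falling off as SDR overlap departs from input overlap.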

A couple of comments here:

Although the SP class maintains a column duty cycle, I think this mixin should compute it explicitly. The SP's duty cycle is only updated during learning and is a rolling average. We want something that works for both learning and inference and is not time-averaged.

Here is pseudocode for some stats I compute right now and find useful:

import numpy

from nupic.bindings.math import GetNTAReal
from nupic.research import fdrutilities

# sp, patterns, and numColumns are assumed to be set up elsewhere
dutyCycles = numpy.zeros(numColumns, dtype=GetNTAReal())
activeColumns = numpy.zeros(numColumns, dtype=GetNTAReal())

# Count how often each column wins across the pattern set
for pattern in patterns:
    sp.compute(pattern, True, activeColumns)
    dutyCycles += activeColumns

# Compute duty cycle statistics
tenPct = len(patterns) / 10.0
print "My duty cycles:", fdrutilities.numpyStr(dutyCycles, format="%g")
print "Number of nonzero duty cycles:", len(dutyCycles.nonzero()[0])
print "Mean/Max duty cycles:", dutyCycles.mean(), dutyCycles.max()
print "Number of columns that won for > 10% of patterns:", \
    (dutyCycles > tenPct).sum()
print "Number of columns that won for > 20% of patterns:", \
    (dutyCycles > 2 * tenPct).sum()

The above gives me a very good sense of how good the SDRs are across a range of patterns.