joschu / cgt

Computation Graph Toolkit

Add support for theano's tensor.inc_subtensor command

avostryakov opened this issue

Theano has a very useful function:
http://deeplearning.net/software/theano/library/tensor/basic.html#theano.tensor.inc_subtensor

I can use it in the following way:
embedding_grads = theano.grad(cost, embedding_output)
updates[embedding.W] = T.inc_subtensor(embedding.W[T.reshape(input_var, (N_BATCH * MAX_LENGTH, ))],
-LEARNING_RATE * T.reshape(embedding_grads, (N_BATCH*MAX_LENGTH, 300)))

This makes it possible to train only the word embedding vectors that occur in the current mini-batch.
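
For readers less familiar with this pattern, here is a rough NumPy sketch of the kind of update described above: only the rows whose indices appear in the batch are changed. The function name `sparse_embedding_step` and the toy shapes are hypothetical, and the sketch assumes that contributions from repeated indices should be summed; it is not Theano or CGT code.

```python
import numpy as np

def sparse_embedding_step(W, token_ids, grads, learning_rate):
    """Return a copy of W in which only the rows listed in token_ids are updated.

    W:         (vocab_size, dim) embedding matrix
    token_ids: (k,) integer row indices, duplicates allowed
    grads:     (k, dim) gradient for each looked-up row
    """
    out = W.copy()
    # np.add.at accumulates contributions from repeated indices (assumed
    # here to be the desired behaviour); a plain `out[token_ids] -= ...`
    # would apply only one contribution per repeated row.
    np.add.at(out, token_ids, -learning_rate * grads)
    return out

# Hypothetical toy data: 5 words, 3-dimensional embeddings.
W = np.zeros((5, 3))
token_ids = np.array([1, 3, 1])   # row 1 appears twice in the batch
grads = np.ones((3, 3))
new_W = sparse_embedding_step(W, token_ids, grads, learning_rate=0.5)
```

Rows 0, 2 and 4 never appear in `token_ids`, so they are left untouched, which is the point of the sparse update.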

Hi Magic, I've implemented inc_subtensor in 7d30be8.
The syntax is a little different from Python's; you have to write
inc_subtensor(x, slices, y)
which implements
x[slices] += y
See cgt/tests/test_inc_subtensor.py for an example of how to use inc_subtensor with three different types of indexing.

Your example also required a type of indexing that previously wasn't implemented, in which you need to index using an integer array along one dimension. I implemented this type of indexing in a43259d.

Let me know if these changes do what you need.

Cool, but what does your inc_subtensor return then? The docstring is a bit scarce... Theano's inc_subtensor(x[slice], y) returns an expression for x with x[slice] incremented by y. That's required so you can use it in an update dictionary as in @avostryakov's example. inc_subtensor(x, slice, y) would be fine with respect to the syntax (I even find it easier to understand), but it would still need to return something that represents the full x with a slice of it changed. Is that what it does?

Indeed, it returns a tensor variable where the slice has been incremented. I improved the docstring.

def inc_subtensor(x, slis, y):
    """
    Returns the array that is obtained by incrementing x[slis] by y
    This function corresponds to the following numpy code:
        out = x.copy()
        out[slis] += y
        return out
    Note that due to an in-place optimization, the copy operation is
    usually not performed.

    See subtensor docstring for a list of appropriate formats for `slis`
    Only formats 2-4 are allowed for inc_subtensor
    """

I'll close this issue, since I think it's resolved.

Sorry for the late response. I was busy, and installing cgt is not trivial at the moment :) But I installed it and checked the new inc_subtensor. It works as expected. Thank you!

While testing it, I discovered that not all of the matrix indexing I need is supported yet. I'll create new issues.

idx=theano.tensor.ivector()
word_embedding=.... #a float matrix, theano shared variable
subset = word_embedding[idx]

g = T.grad(cost, subset)
updates[word_embedding] = T.inc_subtensor(x, g *lr)


I have a question: does 'idx' need to be a vector? What if 'idx' is a matrix?
For example, when training with mini-batches, every training case has m words and every batch has n cases, so 'idx' should be an n * m matrix.

I get an error when I use 'idx' as a matrix:
`ValueError: array is not broadcastable to correct shape
Apply node that caused the error: AdvancedIncSubtensor1{no_inplace,inc}(<TensorType(float64, matrix)>, Reshape{2}.0, Reshape{1}.0)
Toposort index: 18
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix), TensorType(int32, vector)]
Inputs shapes: [(40000, 50), (2, 100), (4,)]
Inputs strides: [(400, 8), (800, 8), (4,)]
Inputs values: ['not shown', 'not shown', array([1, 2, 0, 3], dtype=int32)]
Outputs clients: [['output']]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
File "/usr/local/anaconda2/lib/python2.7/site-packages/traitlets/config/application.py", line 596, in launch_instance
app.start()
File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/terminal/ipapp.py", line 345, in start
self.shell.mainloop()
File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/terminal/interactiveshell.py", line 548, in mainloop
self.interact(display_banner=display_banner)
File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/terminal/interactiveshell.py", line 672, in interact
self.run_cell(source_raw, store_history=True)
File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2723, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2825, in run_ast_nodes
if self.run_code(code, result):
File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2885, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
upda = T.inc_subtensor(x, gg)

`

@myexceptions: Sorry, you're in the wrong place here... this is the Issue tracker of CGT, a possible alternative library to Theano. Try on theano-users. And include your definition of x when doing so, otherwise it's not clear what you were doing.
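
As a side note on the shapes in the traceback above: the index vector has shape (4,), while the increment has shape (2, 100), so the two do not pair up row by row against a (40000, 50) matrix. The NumPy sketch below shows one way to line up a matrix of indices with its gradients by flattening both consistently. It is only an illustration under assumed shapes (n = 2, m = 2, a gradient of shape (n, m, dim), a learning rate of 0.1); the poster's actual definition of x is not known, and this is not Theano or CGT code.

```python
import numpy as np

vocab_size, dim = 40000, 50
n, m = 2, 2                                   # n cases per batch, m words per case

W = np.zeros((vocab_size, dim))
idx = np.array([[1, 2], [0, 3]], dtype=np.int32)   # (n, m) matrix of word ids
grads = np.ones((n, m, dim))                       # assumed: one gradient row per word

# Flatten indices and gradients the same way so row i of the
# gradient belongs to index i.
flat_idx = idx.reshape(n * m)                 # shape (4,)
flat_grads = grads.reshape(n * m, dim)        # shape (4, 50)
target_shape = W[flat_idx].shape              # shape of the rows being incremented

updated = W.copy()
np.add.at(updated, flat_idx, -0.1 * flat_grads)

# Reshaping to (n, m * dim) keeps the element count but loses the
# one-row-per-index pairing.
bad_shape = grads.reshape(n, m * dim).shape
```

Here `flat_grads.shape` equals `target_shape`, whereas `bad_shape` is (2, 100), the same increment shape that appears in the error message.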