hughperkins / pytorch

Python wrappers for torch and lua

Pushing type 'numpy.ndarray' failing on OSX

cjmcmurtrie opened this issue · comments

commented

I managed to build pytorch on OSX and all the tests passed. Running python pybit.py also worked without a hitch.

However, I cannot get the more complex example from the front-page to run (the convolutional network on MNIST). Both versions of this (from the examples folder and from the frontpage) throw the same error. I'm in a Python 2.71 environment, no CUDA GPU on this machine. This is the traceback:

$ python run_torch_model.py
Traceback (most recent call last):
  File "run_torch_model.py", line 35, in <module>
    lab_batch)
  File "build/bdist.macosx-10.5-x86_64/egg/PyTorchAug.py", line 147, in mymethod
  File "build/bdist.macosx-10.5-x86_64/egg/PyTorchAug.py", line 109, in pushSomething
Exception: ("pushing type <type 'numpy.ndarray'> not implemented, value ", array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ..., 
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=uint8))

This is the Python script I'm running (I grabbed the MNIST data from Sklearn but the error is the same using the python-mnist package):

from __future__ import print_function, division
import PyTorch
import PyTorchHelpers
import numpy as np
from sklearn.datasets import fetch_mldata


batchSize = 32
numEpochs = 2
learningRate = 0.02

TorchModel = PyTorchHelpers.load_lua_class('torch_model.lua', 'TorchModel')
torchModel = TorchModel('cpu', 28, 10)

mnist = fetch_mldata('MNIST original', '')

images = mnist.data
labels = mnist.target

labels += 1  # since torch/lua labels are 1-based
N = labels.shape[0]

numBatches = N // batchSize
for epoch in range(numEpochs):
  epochLoss = 0
  epochNumRight = 0
  for b in range(numBatches):

    im_batch = images[b * batchSize:(b+1) * batchSize]
    lab_batch = labels[b * batchSize:(b+1) * batchSize]

    res = torchModel.trainBatch(
      learningRate,
      im_batch,
      lab_batch)

    numRight = res['numRight']
    epochNumRight += numRight
  print('epoch ' + str(epoch) + ' accuracy: ' + str(epochNumRight * 100.0 / N) + '%')

This is the Torch script:

require 'torch'
require 'nn'

local TorchModel = torch.class('TorchModel')

function TorchModel:__init(backend, imageSize, numClasses)
  self:buildModel(backend, imageSize, numClasses)
  self.imageSize = imageSize
  self.numClasses = numClasses
  self.backend = backend
end

function TorchModel:buildModel(backend, imageSize, numClasses)
  self.net = nn.Sequential()
  local net = self.net

  net:add(nn.SpatialConvolutionMM(1, 16, 5, 5, 1, 1, 2, 2))
  net:add(nn.ReLU())
  net:add(nn.SpatialMaxPooling(3, 3, 3, 3))
  net:add(nn.SpatialConvolutionMM(16, 32, 3, 3, 1, 1, 1, 1))
  net:add(nn.ReLU())
  net:add(nn.SpatialMaxPooling(2, 2, 2, 2))
  net:add(nn.Reshape(32 * 4 * 4))
  net:add(nn.Linear(32 * 4 * 4, 150))
  net:add(nn.Tanh())
  net:add(nn.Linear(150, numClasses))
  net:add(nn.LogSoftMax())

  self.crit = nn.ClassNLLCriterion()

  self.net:float()
  self.crit:float()
end

function TorchModel:trainBatch(learningRate, input, labels)
  self.net:zeroGradParameters()

  local output = self.net:forward(input)
  local loss = self.crit:forward(output, labels)
  local gradOutput = self.crit:backward(output, labels)
  self.net:backward(input, gradOutput)
  self.net:updateParameters(learningRate)

  local _, prediction = output:max(2)
  local numRight = labels:int():eq(prediction:int()):sum()
  return {loss=loss, numRight=numRight}  -- you can return a table, it will become a python dictionary
end

function TorchModel:predict(input)
  local output = self.net:forward(input)
  local _, prediction = output:max(2)
  return prediction:byte()
end

Oh, it's because it currently only works with np.float32, np.float64 or np.uint8. I should make it give a more meaningful error message. The relevant current code is:

    typestring = str(type(something))
    if typestring == "<class 'numpy.ndarray'>":
      dtypestr = str(something.dtype)
      if dtypestr == 'float32':
        pushSomething(lua, PyTorch._asFloatTensor(something))
        return
      if dtypestr == 'float64':
        pushSomething(lua, PyTorch._asDoubleTensor(something))
        return
      if dtypestr == 'uint8':
        pushSomething(lua, PyTorch._asByteTensor(something))
        return

    raise Exception('pushing type ' + str(type(something)) + ' not implemented, value ', something)

You can see that if dtypestr isn't one of those 3 types, it falls off the end and gives a generic exception. I'll change that now.

The generic error message issue is addressed in c326c5e
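For illustration, a dtype check that names the unsupported dtype in its error could look roughly like the sketch below. This is not the code from c326c5e; the `SUPPORTED_DTYPES` table and the `tensor_kind_for` helper are hypothetical names made up for this example, and only the three dtypes listed above are assumed to be supported.

```python
import numpy as np

# Hypothetical lookup table, for illustration only: the three dtypes
# mentioned above, mapped to a descriptive tensor-kind label.
SUPPORTED_DTYPES = {
    'float32': 'FloatTensor',
    'float64': 'DoubleTensor',
    'uint8': 'ByteTensor',
}


def tensor_kind_for(array):
    """Return the tensor-kind label for a numpy array, or raise a
    TypeError that states which dtype was rejected and which are allowed."""
    dtypestr = str(array.dtype)
    if dtypestr not in SUPPORTED_DTYPES:
        raise TypeError(
            'pushing numpy.ndarray with dtype %s not implemented; '
            'supported dtypes: %s'
            % (dtypestr, ', '.join(sorted(SUPPORTED_DTYPES))))
    return SUPPORTED_DTYPES[dtypestr]
```

With a check like this, an int64 array fails with a message that mentions int64 and the allowed dtypes, instead of dumping the array contents.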

commented

But in the code above, all dtypes were of the allowed types no? float32 and uint8? The array that throws the error is uint8. This is the code from the examples folder/readme, it isn't my own code.

Hmmmm, can you pull down the latest version of pytorch, rebuild, and check what the new error message is?

commented

I'll have to wait til I'm at work tomorrow to do that - will do it first thing, I'm on UK time :) will you be around tomorrow too?

Thanks for your help so far!

Yeah, I'll be around during UK day-time

commented

Great, let's have a look tomorrow

Oh... it's because I convert the type to a string and compare it with "<class 'numpy.ndarray'>", but on Mac apparently it is "<type 'numpy.ndarray'>" (or maybe that's in py2.7; anyway, I will address this now).

Addressed in 9f017aa
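As a general note on this kind of bug: `str(type(x))` gives "<type '...'>" on Python 2 and "<class '...'>" on Python 3, so comparing against one fixed string only works on one of them. A minimal sketch of a version-independent check follows. It is not necessarily what 9f017aa does, and both helper names are hypothetical.

```python
import numpy as np


def is_ndarray_by_string(something):
    # Fragile: this literal only matches the Python 3 spelling.
    # Python 2 reports "<type 'numpy.ndarray'>" instead.
    return str(type(something)) == "<class 'numpy.ndarray'>"


def is_ndarray(something):
    # isinstance does not depend on how the type is printed,
    # so it behaves the same on Python 2 and Python 3.
    return isinstance(something, np.ndarray)
```

`is_ndarray` also accepts subclasses of `np.ndarray`, which the string comparison would reject.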

commented

Great! I'll try it in the morning and post my results

commented

Worked perfectly Hugh! I have not run it on a CUDA machine yet but for now this is great!

Fantastic piece of software, great utility!

k, cool :-)

Can't find your question, but I remember a question asking about CudaTensor. So, it is actually possible to create CudaTensors from Python using https://github.com/hughperkins/pycudatorch . However, I assume your data is coming from main memory, via Python, e.g. from files on disk, or from a website, or something similar? So, I think the easiest is to leave the data in main memory until it gets to the Lua side, and then convert it to CudaTensors in the Lua code. A (rather overly verbose) example is here:

https://github.com/hughperkins/pytorch/blob/master/examples/luamodel/torch_model.lua#L85-L97