Pushing type 'numpy.ndarray' failing on OSX

Question

cjmcmurtrie opened this issue 8 years ago

I managed to build pytorch on OSX and all the tests passed. Running python pybit.py also worked without a hitch.

However, I cannot get the more complex example from the front page to run (the convolutional network on MNIST). Both versions of this (from the examples folder and from the front page) throw the same error. I'm in a Python 2.71 environment, with no CUDA GPU on this machine. This is the traceback:

$ python run_torch_model.py
Traceback (most recent call last):
  File "run_torch_model.py", line 35, in <module>
    lab_batch)
  File "build/bdist.macosx-10.5-x86_64/egg/PyTorchAug.py", line 147, in mymethod
  File "build/bdist.macosx-10.5-x86_64/egg/PyTorchAug.py", line 109, in pushSomething
Exception: ("pushing type <type 'numpy.ndarray'> not implemented, value ", array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ..., 
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=uint8))

This is the Python script I'm running (I grabbed the MNIST data from Sklearn but the error is the same using the python-mnist package):

from __future__ import print_function, division
import PyTorch
import PyTorchHelpers
import numpy as np
from sklearn.datasets import fetch_mldata


batchSize = 32
numEpochs = 2
learningRate = 0.02

TorchModel = PyTorchHelpers.load_lua_class('torch_model.lua', 'TorchModel')
torchModel = TorchModel('cpu', 28, 10)

mnist = fetch_mldata('MNIST original', '')

images = mnist.data
labels = mnist.target

labels += 1  # since torch/lua labels are 1-based
N = labels.shape[0]

numBatches = N // batchSize
for epoch in range(numEpochs):
  epochLoss = 0
  epochNumRight = 0
  for b in range(numBatches):

    im_batch = images[b * batchSize:(b+1) * batchSize]
    lab_batch = labels[b * batchSize:(b+1) * batchSize]

    res = torchModel.trainBatch(
      learningRate,
      im_batch,
      lab_batch)

    numRight = res['numRight']
    epochNumRight += numRight
  print('epoch ' + str(epoch) + ' accuracy: ' + str(epochNumRight * 100.0 / N) + '%')

This is the Torch script:

require 'torch'
require 'nn'

local TorchModel = torch.class('TorchModel')

function TorchModel:__init(backend, imageSize, numClasses)
  self:buildModel(backend, imageSize, numClasses)
  self.imageSize = imageSize
  self.numClasses = numClasses
  self.backend = backend
end

function TorchModel:buildModel(backend, imageSize, numClasses)
  self.net = nn.Sequential()
  local net = self.net

  net:add(nn.SpatialConvolutionMM(1, 16, 5, 5, 1, 1, 2, 2))
  net:add(nn.ReLU())
  net:add(nn.SpatialMaxPooling(3, 3, 3, 3))
  net:add(nn.SpatialConvolutionMM(16, 32, 3, 3, 1, 1, 1, 1))
  net:add(nn.ReLU())
  net:add(nn.SpatialMaxPooling(2, 2, 2, 2))
  net:add(nn.Reshape(32 * 4 * 4))
  net:add(nn.Linear(32 * 4 * 4, 150))
  net:add(nn.Tanh())
  net:add(nn.Linear(150, numClasses))
  net:add(nn.LogSoftMax())

  self.crit = nn.ClassNLLCriterion()

  self.net:float()
  self.crit:float()
end

function TorchModel:trainBatch(learningRate, input, labels)
  self.net:zeroGradParameters()

  local output = self.net:forward(input)
  local loss = self.crit:forward(output, labels)
  local gradOutput = self.crit:backward(output, labels)
  self.net:backward(input, gradOutput)
  self.net:updateParameters(learningRate)

  local _, prediction = output:max(2)
  local numRight = labels:int():eq(prediction:int()):sum()
  return {loss=loss, numRight=numRight}  -- you can return a table, it will become a python dictionary
end

function TorchModel:predict(input)
  local output = self.net:forward(input)
  local _, prediction = output:max(2)
  return prediction:byte()
end

Hugh Perkins · Answer 1 · Fri Apr 08 2016 07:32:01 GMT+0800 (China Standard Time)

Oh, it's because it currently only works with np.float32, np.float64 or np.uint8. I should make it have a more meaningful error message. The relevant current code is:

    typestring = str(type(something))
    if typestring == "<class 'numpy.ndarray'>":
      dtypestr = str(something.dtype)
      if dtypestr == 'float32':
        pushSomething(lua, PyTorch._asFloatTensor(something))
        return
      if dtypestr == 'float64':
        pushSomething(lua, PyTorch._asDoubleTensor(something))
        return
      if dtypestr == 'uint8':
        pushSomething(lua, PyTorch._asByteTensor(something))
        return

    raise Exception('pushing type ' + str(type(something)) + ' not implemented, value ', something)

You can see that if dtypestr isn't one of those 3 types, it falls off the end and gives a generic exception. I'll change that now.
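
A minimal sketch of how the dtype dispatch could name the unsupported dtype instead of falling through to the generic exception. The names `SUPPORTED_DTYPES` and `converter_name_for` are hypothetical and not part of pytorch. Only the three dtype strings and converter names are taken from the code above, and the actual fix may look different.

```python
# Hypothetical helper, for illustration only: maps the dtype strings handled
# by the snippet above to the converter names that snippet calls.
SUPPORTED_DTYPES = {
    'float32': '_asFloatTensor',
    'float64': '_asDoubleTensor',
    'uint8': '_asByteTensor',
}


def converter_name_for(dtype_name):
    """Return the converter name for a dtype string, or raise a specific error."""
    try:
        return SUPPORTED_DTYPES[dtype_name]
    except KeyError:
        # Name the offending dtype and list the supported ones, so the
        # message points at the dtype rather than at the ndarray type.
        supported = ', '.join(sorted(SUPPORTED_DTYPES))
        raise TypeError(
            'pushing numpy.ndarray with dtype %s not implemented; '
            'supported dtypes: %s' % (dtype_name, supported))
```

With an array in hand, the lookup would be called as `converter_name_for(str(something.dtype))`, mirroring how `dtypestr` is computed above.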

Hugh Perkins · Answer 2 · Fri Apr 08 2016 07:35:24 GMT+0800 (China Standard Time)

The generic error message issue is addressed in c326c5e

Conan · Answer 3 · Fri Apr 08 2016 07:37:10 GMT+0800 (China Standard Time)

But in the code above, all dtypes were of the allowed types, no? float32 and uint8? The array that throws the error is uint8. This is the code from the examples folder/readme; it isn't my own code.

Hugh Perkins · Answer 4 · Fri Apr 08 2016 07:37:56 GMT+0800 (China Standard Time)

Hmmmm, can you pull down the latest version of pytorch, rebuild, and check what the new error message is?

Conan · Answer 5 · Fri Apr 08 2016 07:41:43 GMT+0800 (China Standard Time)

I'll have to wait til I'm at work tomorrow to do that - will do it first thing, I'm on UK time :) will you be around tomorrow too?

Thanks for your help so far!

Hugh Perkins · Answer 6 · Fri Apr 08 2016 07:42:45 GMT+0800 (China Standard Time)

Yeah, I'll be around during UK day-time

Conan · Answer 7 · Fri Apr 08 2016 07:43:58 GMT+0800 (China Standard Time)

Great, let's have a look tomorrow

Hugh Perkins · Answer 8 · Fri Apr 08 2016 07:49:43 GMT+0800 (China Standard Time)

Oh... it's because I convert the type to a string and compare it with "<class 'numpy.ndarray'>", but on Mac apparently it is "<type 'numpy.ndarray'>" (or maybe in py2.7; anyway, I will address this now)
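
For reference, `str(type(x))` prints `<type '...'>` on Python 2 and `<class '...'>` on Python 3, so a check based on that string depends on the interpreter version. The sketch below shows one way to test for an ndarray without parsing that string. The helper name `is_numpy_ndarray` is hypothetical, and this is not necessarily what the actual fix does. When numpy is already imported, `isinstance(something, numpy.ndarray)` is the usual check; this version only avoids the import so that it runs standalone.

```python
def is_numpy_ndarray(value):
    """Return True if value's class, or one of its base classes, is numpy.ndarray.

    Hypothetical helper for illustration. It compares the class's module and
    name attributes instead of the text produced by str(type(value)), which
    differs between Python 2 and Python 3.
    """
    return any(
        cls.__module__ == 'numpy' and cls.__name__ == 'ndarray'
        for cls in type(value).__mro__
    )
```

Because it walks the method resolution order, subclasses of ndarray are accepted as well, which matches what `isinstance` would do.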

Hugh Perkins · Answer 9 · Fri Apr 08 2016 07:51:22 GMT+0800 (China Standard Time)

Addressed in 9f017aa

Conan · Answer 10 · Fri Apr 08 2016 07:57:19 GMT+0800 (China Standard Time)

Great! I'll try it in the morning and post my results

Conan · Answer 11 · Fri Apr 08 2016 17:55:11 GMT+0800 (China Standard Time)

Worked perfectly Hugh! I have not run it on a CUDA machine yet but for now this is great!

Fantastic piece of software, great utility!

Hugh Perkins · Answer 12 · Fri Apr 08 2016 20:11:25 GMT+0800 (China Standard Time)

k, cool :-)

Hugh Perkins · Answer 13 · Fri Apr 08 2016 20:14:54 GMT+0800 (China Standard Time)

Can't find your question, but I remember a question asking about CudaTensor. So, it is actually possible to create CudaTensors from python using pycudatorch (https://github.com/hughperkins/pycudatorch). However, I assume your data is coming from main memory, via python, e.g. from files on disk, or from a website, or something similar? So, I think the easiest is to leave them in main memory until they get to the lua side, and then convert them in the lua. A (rather overly verbose) example is here:

https://github.com/hughperkins/pytorch/blob/master/examples/luamodel/torch_model.lua#L85-L97