Add Tensor#toSeq and Tensor#toArray methods
Atry opened this issue
We need `Tensor#toSeq` and `Tensor#toArray` methods for creating n-dimensional `scala.collection.Seq` or `scala.Array` values, as the reverse conversion of `Tensor.apply`.

These methods can be implemented from the existing `Tensor#flatArray` and `Tensor#shape`.
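A rough sketch of the intended round trip, assuming the nested-`Seq` form of `Tensor.apply` (hypothetical usage, not a final design):

```scala
// val tensor = Tensor(Seq(Seq(1.0f, 2.0f), Seq(3.0f, 4.0f)))
// tensor.toSeq   // would yield Seq(Seq(1.0f, 2.0f), Seq(3.0f, 4.0f))
// tensor.toArray // would yield Array(Array(1.0f, 2.0f), Array(3.0f, 4.0f))
```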
Hey! Taking a look at this as a first issue.
For the `toArray` method, we could do something like:
```scala
def f(flatArray: Array[A], shape: Array[Int]): Array[B] = {
  // if the desired shape is 1-dimensional, we're done
  if (shape.length == 1) {
    flatArray
  } else {
    // the desired shape must match the number of elements
    if (shape.product != flatArray.length) {
      throw new IllegalArgumentException
    }
    // pick off the last dimension, partitioning into slices
    val oneReduced = (0 until shape.product by shape(shape.length - 1)).map { i =>
      flatArray.slice(i, i + shape(shape.length - 1))
    }
    f(oneReduced.toArray, shape.slice(0, shape.length - 1))
  }
}
```
(with the appropriate types `A` and `B` worked out)
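For instance, on the flat data of a 2×3 tensor, the intended behaviour of the sketch above would be (types glossed over, as noted):

```scala
// f(Array(1f, 2f, 3f, 4f, 5f, 6f), Array(2, 3))
//   == Array(Array(1f, 2f, 3f), Array(4f, 5f, 6f))
```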
But in https://github.com/ThoughtWorksInc/Compute.scala/blob/0.4.x/Tensors/src/main/scala/com/thoughtworks/compute/Tensors.scala#L1105 the output of `flatArray` is a `Future`. So do we want to ensure that the result is computed before being passed into this helper function, or should this function also deal with and return a `Future`?
Welcome!
I think it should be a `Future`, since it is a slow action. For now, all slow actions are `Future` or `Do`, except `toString`, because `toString` is an overridden method.
But there are other considerations:

- Since `toArray` or `toSeq` in the Scala collection library is not asynchronous, the name `toArray` will surprise people if it returns a `Future`.
- What is the type of `B`? How do we check the type?
I see. One option is to have two different methods. It would return an `Array` of `Float`s, or possibly an `Array` of `Array`s (of either `Array`s or `Float`s). So we could define a custom type or just use `Either`.
Given that there are too many possible dimensions, it is hard to represent them with `Either`.

```scala
def readScalar: Future[Float]
def read1DArray: Future[Array[Float]]
def read2DArray: Future[Array[Array[Float]]]
def read3DArray: Future[Array[Array[Array[Float]]]]
```
We probably want the ability to work with n-dimensional `Tensor`s / `Array`s, right?
That's the purpose of this issue
Yeah - I was trying to say that we can't explicitly give `read2DArray`, `read3DArray`, etc., since we want the ability to work with any number of dimensions.
If we want to avoid `read2DArray`, `read3DArray`, then a type class for arbitrary dimensions is required.

```scala
def read[Out](implicit tensorReader: TensorReader[Out]): Future[Out]

// Usage
tensor1.read[Float]
tensor2.read[Seq[Array[Float]]]
tensor3.read[Vector[List[Array[Float]]]]
```
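For illustration, a minimal, self-contained sketch of what such a type class could look like. The `FlatReader` name and its flat-data-plus-shape signature are made up here, and it only covers nested `Array`s; the proposed `TensorReader` would also need instances for `Seq`, `Vector`, `List`, etc., and would be wired to `flatArray` and `shape`:

```scala
import scala.reflect.ClassTag

// Rebuilds a nested structure from row-major data plus the remaining shape.
trait FlatReader[Out] {
  def read(flat: Array[Float], shape: List[Int]): Out
}

object FlatReader {
  // 0-dimensional case: a single scalar
  implicit val scalarReader: FlatReader[Float] = new FlatReader[Float] {
    def read(flat: Array[Float], shape: List[Int]): Float = flat.head
  }

  // Inductive case: peel off the outermost dimension and read each slice
  implicit def arrayReader[Inner](
      implicit inner: FlatReader[Inner],
      tag: ClassTag[Inner]
  ): FlatReader[Array[Inner]] = new FlatReader[Array[Inner]] {
    def read(flat: Array[Float], shape: List[Int]): Array[Inner] = {
      val outer = shape.head
      val sliceSize = flat.length / outer
      Array.tabulate(outer) { i =>
        inner.read(flat.slice(i * sliceSize, (i + 1) * sliceSize), shape.tail)
      }
    }
  }
}

// Example: read a 2x3 tensor's flat data as Array[Array[Float]]
// implicitly[FlatReader[Array[Array[Float]]]]
//   .read(Array(1f, 2f, 3f, 4f, 5f, 6f), List(2, 3))
```

Resolution is driven entirely by the requested result type, which mirrors the proposed `tensor.read[Seq[Array[Float]]]` usage.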
Okay I'll give that a try!
Another option is returning an `Any`. It is not type-safe on dimensions, but that is understandable since `Tensor` is not type-safe on dimensions either.
Yeah, I was originally thinking of something like `Array[Either[Float, Array[A]]] forSome { type A }`.
`Either` has to be recursive:

```scala
def read: T forSome { type T <: Either[Float, Array[T]] }
```

However, it is very inefficient to create an `Array[Left[Float]]`.
So if the user is calling `toArray`, they probably want a result of type `Array[Array[...Array[Float]...]]`, right? So any use of an `Either` type or a custom class doesn't really solve the problem. And if we return `Any`, that's also no good:
```scala
def f: Any = { Array(2, 3) }
var y = f
y(0)
// error: scala.this.Any does not take parameters
```
and we have a similar problem if we use `Array[Any]`.
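For example, with an `Any` (or `Array[Any]`) result, the caller ends up casting and loses the compiler's help (a sketch building on the `f` above):

```scala
val y = f.asInstanceOf[Array[Int]] // caller must already know the dimensions and element type
y(0)                               // compiles only after the cast
```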
I see why they only defined `Array.ofDim` for up to 5 dimensions.
relevant: https://stackoverflow.com/questions/30623062/6-or-more-dimensional-arrays-in-scala
With this code:
```scala
def reshapeArray(a: Any, b: Array[Int]): Any = {
  if (b.length == 1) {
    a.asInstanceOf[Array[Any]]
  } else {
    val last = b(b.length - 1)
    val oneReduced = Array.tabulate(last) { i =>
      a.asInstanceOf[Array[Any]].slice(i * last, (i + 1) * last)
    }
    reshapeArray(oneReduced, b.slice(0, b.length - 1))
  }
}

def toArray: Future[Any] = {
  flatArray.flatMap { z => Future { reshapeArray(z, shape) } }
}
```
I'm running into problems with:

```
[error] found   : Any
[error] required: com.thoughtworks.tryt.covariant.TryT[com.thoughtworks.continuation.UnitContinuation,Any]
```
Is this related to your version of `Future` instead of `scala.concurrent.Future`?
```scala
flatArray.map { z => reshapeArray(z, shape) }
```

Try `map`.
Thanks - that indeed does compile. Now to write tests (and actually have them pass) 😄
Hint: you can use `grouped.toArray` / `grouped.toSeq` instead of `tabulate` and `slice`.
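A rough sketch of how the reshape could look with `grouped` (the helper name is made up, and it uses `Seq` to sidestep `Array`'s runtime element types):

```scala
def reshapeSeq(flat: Seq[Any], shape: Seq[Int]): Seq[Any] =
  if (shape.length <= 1) flat
  else {
    // grouped(n) yields consecutive chunks of length n
    val last = shape.last
    reshapeSeq(flat.grouped(last).toSeq, shape.init)
  }

// reshapeSeq(Seq(1f, 2f, 3f, 4f, 5f, 6f), Seq(2, 3))
//   == Seq(Seq(1f, 2f, 3f), Seq(4f, 5f, 6f))
```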
Noted about `grouped`.
While writing some tests I ran into some runtime problems regarding types. I've fixed those, but I'm running into the same problem again: `flatArray` returns a ThoughtWorks `Future`, and I can't pass it as an argument to `reshapeArray`, nor can I write a callback on it; `map` and `flatMap` don't seem to work.
I'm getting:

```
found : com.thoughtworks.future.Future[Array[_]]
[error] (which expands to) com.thoughtworks.future.opacityTypes.Future[Array[_]]
[error] required: Array[_]
```
`map` works with `scala.concurrent.Future`. I can't see any relevant examples in the documentation.
That is already imported in `Tensors.scala`.

The error message looks like you are calling a function that accepts an `Array` while you provide a `Future[Array[_]]`. Try asking the question on StackOverflow with a minimal reproducible example.
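To illustrate the diagnosis (a sketch, not the thread's actual code): the value coming out of `flatArray` is still wrapped, so it has to be transformed inside the `Future` rather than passed directly:

```scala
// Assuming `flat` is what flatArray returns:
// reshapeArray(flat, shape)                // rejected: reshapeArray expects the Array itself
// flat.map { a => reshapeArray(a, shape) } // ok: reshape inside the Future, as suggested above
```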