hollance / CoreMLHelpers

Types and functions that make it a little easier to work with Core ML in Swift.

MultiArray transposed + reshaped

axmav opened this issue

Hello!
a: MultiArray
Shape: 1 x 1 x 48 x 17 x 27 Strides: 22032 x 22032 x 459 x 27 x 1 - INPUT OK
b = a.transposed([0, 1, 3, 4, 2])
Shape: 1 x 1 x 17 x 27 x 48 Strides: 22032 x 22032 x 27 x 1 x 459 - OK
c = b.reshaped([5508, 4])
Shape: 5508 x 4 Strides: 4 x 1 - NOT OK

What do you mean by NOT OK?

@hollance all floats in variable "c" are in the wrong order. Only the first float is OK.
c[0,0] - OK
c[0,1] - wrong float. The computed offset is 0x4 + 1x1 = 1, but the float that belongs here is at offset 459.
c[0,2] - should come from offset 459 x 2 = 918.
I tried changing the strides of "c" to [1, 459]. The first row of the output is then OK, but the other rows are still wrong.
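
The offsets quoted above can be checked with a few lines of plain Swift. This is only a sketch of the stride arithmetic, not CoreMLHelpers code: `contiguousStrides(for:)` and `offset(of:strides:)` are hypothetical helpers written for this illustration, and the arrays are assumed to be row-major.

```swift
// Hypothetical helper: row-major strides for a contiguous array of this shape.
func contiguousStrides(for shape: [Int]) -> [Int] {
    var strides = [Int](repeating: 1, count: shape.count)
    var running = 1
    for axis in stride(from: shape.count - 1, through: 0, by: -1) {
        strides[axis] = running
        running *= shape[axis]
    }
    return strides
}

// Hypothetical helper: memory offset of an element, given its indices and strides.
func offset(of indices: [Int], strides: [Int]) -> Int {
    return zip(indices, strides).reduce(0) { $0 + $1.0 * $1.1 }
}

let aShape = [1, 1, 48, 17, 27]
let aStrides = contiguousStrides(for: aShape)   // 22032 x 22032 x 459 x 27 x 1

// Transposing only permutes the shape and the strides; no data moves.
let order = [0, 1, 3, 4, 2]
let bShape = order.map { aShape[$0] }           // 1 x 1 x 17 x 27 x 48
let bStrides = order.map { aStrides[$0] }       // 22032 x 22032 x 27 x 1 x 459

// c[0,1] with strides [4, 1] lands on offset 1 in the original memory...
let naiveOffset = offset(of: [0, 1], strides: [4, 1])
// ...but the second element of b in logical order is b[0,0,0,0,1].
let expectedOffset = offset(of: [0, 0, 0, 0, 1], strides: bStrides)
```

Here `naiveOffset` is 1 and `expectedOffset` is 459, which matches the mismatch reported for c[0,1].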

Can you share an example that reproduces this?

@hollance thank you for your CoreMLHelpers! Yes, I can.

let bbox_deltas = MultiArray<Double>(prediction.rpn_bbox_pred)
// bbox_deltas: 1x1x48x17x27
let bbox_deltas_tr = bbox_deltas.transposed([0, 1, 3, 4, 2])
// bbox_deltas_tr: 1x1x17x27x48
let bbox_deltas_reshaped = bbox_deltas_tr.reshaped([5508, 4])
// bbox_deltas_reshaped: 5508x4

bbox_deltas output (from [0,0,0,0,0] to [0,0,0,0,9]):
-0.3277042210 -0.1190757155 +0.2304603755 +0.2117528319 +0.0469222963 +0.1395139992 +0.2265842110 +0.2274794430 +0.2227494419 +0.2037462592
bbox_deltas_tr output (from [0,0,0,0,0] to [0,0,0,0,9]):
-0.3277042210 -0.0869039297 +0.9364890456 -0.1947032362 -1.0869287252 -0.3563345075 +1.3238990307 -1.4743229151 -0.2934076190 -0.2424605638
bbox_deltas_reshaped output (from [0,0] to [3,3]):
-0.3277042210 -0.1190757155 +0.2304603755 +0.2117528319
+0.0469222963 +0.1395139992 +0.2265842110 +0.2274794430
+0.2227494419 +0.2037462592 +0.1262510419 +0.0418393202
-0.0091561433 +0.1211979464 +0.1764091402 +0.1841013283
bbox_deltas_reshaped expected output (from [0,0] to [3,3]): (these numbers come from another platform, so -0.3280062079 corresponds to my -0.3277042210)
-0.3280062079 -0.0861499831 +0.9359490871 -0.1947806776
-1.0853735209 -0.3559190333 +1.3249928951 -1.4733866453
-0.2930260003 -0.2428345680 +1.4895144701 -1.6780786514
-1.0336806774 -0.1753887385 +1.7062969208 -2.4873535633

I verified that this goes wrong indeed. I'll look into it shortly.

OK, here's what happens. The strides for array c should be [4, 1], so that part is correct, but those strides are relative to b's view of the memory. What currently happens is that c looks up memory using a's view of it, while it should interpret the memory as if it comes from b.

So instead, MultiArray should be a little smarter and tell c how it should convert to b's indices and then to a's indices. So in addition to keeping track of the strides, it needs to know how to transform these to the strides of the original MLMultiArray.
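
One way to picture the conversion described above, as a minimal sketch in plain Swift rather than the actual MultiArray implementation: take each logical position in c, turn it into b's indices, then apply b's strides to find the element in a's memory. The names `unravel` and `reshapedCopy` are hypothetical, and this version copies the data into a new contiguous buffer instead of tracking a chain of stride transforms.

```swift
// Hypothetical helper: convert a flat row-major index into per-axis indices.
func unravel(_ flatIndex: Int, shape: [Int]) -> [Int] {
    var remainder = flatIndex
    var indices = [Int](repeating: 0, count: shape.count)
    for axis in stride(from: shape.count - 1, through: 0, by: -1) {
        indices[axis] = remainder % shape[axis]
        remainder /= shape[axis]
    }
    return indices
}

// Hypothetical helper: read a strided view (such as b) in its own logical
// order and return the elements as a contiguous row-major buffer.
// Reshaping that buffer afterwards only needs fresh contiguous strides.
func reshapedCopy(storage: [Double], viewShape: [Int], viewStrides: [Int]) -> [Double] {
    let count = viewShape.reduce(1, *)
    return (0..<count).map { (flat: Int) -> Double in
        // Position in c (flat) -> indices in b -> offset in a's memory.
        let indices = unravel(flat, shape: viewShape)
        let memoryOffset = zip(indices, viewStrides).reduce(0) { $0 + $1.0 * $1.1 }
        return storage[memoryOffset]
    }
}

// Small stand-in for the arrays in this issue: a is 2 x 3,
// b = a transposed is 3 x 2 with strides [1, 3].
let storage: [Double] = [0, 1, 2, 3, 4, 5]
let flat = reshapedCopy(storage: storage, viewShape: [3, 2], viewStrides: [1, 3])
```

With this toy input, `flat` holds b's elements in b's logical order, so viewing it as 2 x 3 with strides [3, 1] gives the expected reshaped result. The cost is one full copy per reshape of a non-contiguous view.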

I don't have time to do this right now, but pull requests are welcome. ;-) I've added a failing test case for this issue.

I've downgraded MultiArray to the status of "experimental code". :-) I've also added reshaped() and transposed() methods to MLMultiArray, which is a more robust solution.