patrickroocks / listcompr

An R package for list comprehension

gen.matrix

ggrothendieck opened this issue

This works but it might be more convenient if a gen.matrix existed.

 as.matrix(gen.data.frame(gen.vector(i+j, i = 1:4), j = 1:4))
##      V1 V2 V3 V4
## [1,]  2  3  4  5
## [2,]  3  4  5  6
## [3,]  4  5  6  7
## [4,]  5  6  7  8
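For comparison, the same table of sums can be built in base R alone. This is only a sketch for checking the expected values; it uses `outer` and does not call anything from listcompr.

```r
# Base R only: m[i, j] equals i + j for i, j in 1:4.
m <- outer(1:4, 1:4, "+")

# No dimnames are attached, so the columns print as [,1] ... [,4]
# rather than V1 ... V4.
print(m)
```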

That's an excellent idea! I just added this functionality in my last commit (v 0.2.1).

However, gen.matrix is not simply a short-cut for as.matrix(gen.data.frame(...)): auto-generated column names are dropped from the matrix, while explicitly defined column names are kept:

> gen.matrix(gen.vector(i+j, i = 1:4), j = 1:4)
     [,1] [,2] [,3] [,4]
[1,]    2    3    4    5
[2,]    3    4    5    6
[3,]    4    5    6    7
[4,]    5    6    7    8
> gen.matrix(gen.named.vector('col{i}', i+j, i = 1:4), j = 1:4)
     col1 col2 col3 col4
[1,]    2    3    4    5
[2,]    3    4    5    6
[3,]    4    5    6    7
[4,]    5    6    7    8

Given that we know the result has 2 dimensions, would it be possible to eliminate the need for gen.vector, so that one could write something like:

gen.matrix(i+j, i=1:4, j=1:4)

gen.matrix(+(i == j), i=1:4, j=1:4) # diagonal matrix

gen.matrix(+(i == j + 1), i=1:4, j=1:4) 
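To make the intended results of the two indicator examples concrete, here is a base R sketch using `outer`. It assumes that `i` indexes rows and `j` indexes columns; the names `idx`, `identity_like`, and `subdiag` are illustrative only.

```r
n <- 4
idx <- seq_len(n)

# Unary plus coerces the logical comparison to 0/1 integers.
identity_like <- outer(idx, idx, function(i, j) +(i == j))      # ones on the diagonal
subdiag       <- outer(idx, idx, function(i, j) +(i == j + 1))  # ones just below it

print(identity_like)
print(subdiag)
```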

Another nice idea!

Everything implemented with the last commit.

I would have expected the first to give a column matrix and the second to give a row matrix, but it is the other way around.

> gen.matrix(i, i = 1:4, j = 1)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
> gen.matrix(i, i = 1, j = 1:4)
     [,1]
[1,]    1
[2,]    1
[3,]    1
[4,]    1

I had in mind that gen.matrix(i+j, i = 1:4, j = 1:3) is a short-cut for gen.matrix(gen.vector(i+j, i = 1:4), j = 1:3), and thus the order was gen.matrix(expr, col_var, row_var).

But I totally agree with you that specifying the columns before the rows reads very counter-intuitively. By mathematical convention, the row index precedes the column index.

I decided to change it according to your suggestion. Fixed with the last commit.

You mean the byrow parameter?
To be honest, I don't really like the idea of additional parameters in the gen... functions of this package. The parametrization should be as lightweight as possible.

Instead of writing something like gen.matrix(i+j, i=1:3, j=1:2, byrow = FALSE), I suggest considering t(gen.matrix(i+j, i=1:3, j=1:2)) (using the transpose function t(...) from base R) as the canonical alternative.

There are also crossprod and tcrossprod as another way of doing this. They eliminate the cost of the transpose by generating the result directly.
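The equivalence behind this suggestion can be checked in base R: filling a matrix row by row gives the same result as filling the transposed shape column by column and then applying `t`. The sketch below uses plain `matrix` with an arbitrary vector `v`, not gen.matrix itself.

```r
v <- 1:6

# Fill a 3 x 2 matrix row by row ...
by_row <- matrix(v, nrow = 3, ncol = 2, byrow = TRUE)

# ... or fill a 2 x 3 matrix column by column (the default) and transpose it.
via_t <- t(matrix(v, nrow = 2, ncol = 3))

stopifnot(all(by_row == via_t))
```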

The bycol parameter (cf. #4) solves the issue:

> gen.matrix(i+j, i=1:3, j=1:2, bycol = TRUE)
     [,1] [,2] [,3]
[1,]    2    3    4
[2,]    3    4    5

That's great, but shouldn't the default be TRUE, since the main data structures in R, matrices and data frames, are stored column by column?

I renamed bycol to byrow. But byrow = TRUE still means that the inner index refers to the rows.

Anyway, I think it is more "canonical" because it matches exactly how the analogous gen.vector result is converted to a matrix:

> matrix(gen.vector(i+j, i=1:3, j=1:2), ncol = 3, byrow = TRUE)
     [,1] [,2] [,3]
[1,]    2    3    4
[2,]    3    4    5


> gen.matrix(i+j, i=1:3, j=1:2, byrow = TRUE)
     [,1] [,2] [,3]
[1,]    2    3    4
[2,]    3    4    5
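The same fill order can be reproduced without listcompr. The sketch below assumes, as the output above implies, that the first index `i` varies fastest; `expand.grid` enumerates its first argument fastest as well, so it serves as a stand-in for gen.vector here.

```r
# Enumerate (i, j) pairs with i varying fastest.
grid <- expand.grid(i = 1:3, j = 1:2)
v <- grid$i + grid$j

# byrow = TRUE lays consecutive values along a row, so each row holds one j.
by_row <- matrix(v, ncol = 3, byrow = TRUE)

# The default (byrow = FALSE) fills column by column and mixes the two j values.
by_col <- matrix(v, ncol = 3)

print(by_row)
print(by_col)
```

With `byrow = TRUE` the rows correspond to `j = 1` and `j = 2`, matching the two outputs above.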