ELToulemonde / dataPreparation

Data preparation for data science projects.

Using fastHandleNa with duplicated column names

d-ghale opened this issue

It looks like fastHandleNa does not work when there are duplicated column names. It might be worth either handling NAs in that case or showing a warning that asks the user to rename the columns. If this is only an issue on my computer, please let me know so that I can figure out what is happening. Thanks!

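library(data.table)

# Note: each D0x variable is assigned twice, so only the second value is kept
# and the repeated columns below hold identical values.
site <- c("A", "B", "C", "D", "E", "B")
D01 <- c(1, 0, 0, 0, 1, 0)
D01 <- c(1, 1, 0, 1, 1, 1)
D02 <- c(1, 0, 1, 0, 1, 0)
D02 <- c(0, 1, 0, 0, 1, 1)
D03 <- c(1, 1, 0, 0, 0, 1)
D03 <- c(0, 1, 0, 0, 1, 1)
D04 <- c(NA, NA, 0, 1, 1, NA)
D04 <- c(NA, 0, NA, 1, 1, 0)

# df1 keeps the duplicated column names; df2 gets unique names via check.names = TRUE
df1 <- data.table(site, D01, D01, D02, D02, D03, D03, D04, D04, check.names = FALSE)
df2 <- data.table(site, D01, D01, D02, D02, D03, D03, D04, D04, check.names = TRUE)

# Replace NAs in numeric columns with 0
df1 <- dataPreparation::fastHandleNa(df1, set_num = 0)
df2 <- dataPreparation::fastHandleNa(df2, set_num = 0)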

Hi,

Sorry for the delay.

Thanks for reporting this issue.

It has been fixed in #49.

But please note that it is not recommended to use dataPreparation with duplicated column names.

The fix should be on CRAN soon in version 0.3.10.
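
For readers who want to follow the recommendation about avoiding duplicated column names, one possible approach is to make the names unique before calling any dataPreparation function. The following is only a minimal sketch, not something taken from the package or from the fix in #49: the toy table `dt` is hypothetical, and it relies on base R's `make.unique()` together with data.table's `setnames()`.

```r
library(data.table)

# Hypothetical toy table with a duplicated column name (D01 appears twice)
dt <- data.table(site = c("A", "B"), D01 = c(1, NA), D01 = c(NA, 0),
                 check.names = FALSE)

# anyDuplicated() returns 0 when every name is unique
if (anyDuplicated(names(dt)) > 0) {
  # make.unique() appends .1, .2, ... to repeated names;
  # setnames() renames the columns by reference (no copy of the table)
  setnames(dt, make.unique(names(dt)))
}

names(dt)
```

After this step the table can be passed to `dataPreparation::fastHandleNa(dt, set_num = 0)` as in the report above. Creating the table with `check.names = TRUE`, as was done for `df2`, is another way to avoid duplicated names from the start.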