RevolutionAnalytics / RHadoop

Home Page:https://github.com/RevolutionAnalytics/RHadoop/wiki

rmr2 rhadoop - outofmemory exception

ssetty opened this issue

Hello, we are trying the rmr2 package from RHadoop and get an out-of-memory exception.
Below is sample code that aggregates sales quarterly.
What are the signatures of the map, reduce, and keyval functions?

Some documents say:
"key must be a matrix with a column and the same number of rows as the value." Kindly explain.

Thanks

`

sample input data

time, sales
1,206
2,245
3,185
4,169
5,162
6,177
7,207
8,216
9,193
10,230
11,212
12,192

library(rmr2)

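map <- function(key, value)
{
  # With input.format="text", value is a character vector of lines, not one line.
  # Log to stderr rather than stdout, which Hadoop streaming uses for data.
  cat("sales are", value, "\n", file = stderr())

  fields <- strsplit(value, ",")
  # Convert to numeric; the "time, sales" header and blank lines become NA.
  month <- suppressWarnings(as.numeric(vapply(fields, function(f) f[1], character(1))))
  sales <- suppressWarnings(as.numeric(vapply(fields, function(f) f[2], character(1))))
  ok <- !is.na(month) & !is.na(sales)

  # Months 1-4 map to 4, months 5-8 map to 8, everything else maps to 12.
  period <- ifelse(month[ok] <= 4, 4, ifelse(month[ok] <= 8, 8, 12))

  # One key per value: both vectors have the same length.
  keyval(period, sales[ok])
}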

reduce <- function(key,values)
{
keyval(key, sum(values))
}

forecast <- function(input,output=NULL)
{
mapreduce(input=input, input.format="text", map=map, reduce=reduce, output=output)
}

hdfs.root <- "forecast"
hdfs.data <- file.path(hdfs.root,"input")
hdfs.out <- file.path(hdfs.root,"output")

out <- forecast(hdfs.data,hdfs.out)
`
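
To check the grouping logic outside Hadoop, here is a minimal base-R sketch. It does not use rmr2 at all: it assumes the map step receives the input lines as a character vector, and it uses `tapply` as a stand-in for the shuffle and reduce steps. `parse_sales` is a hypothetical helper written for this sketch, not part of any package.

```r
# Base-R sketch only: no rmr2 or Hadoop involved.
# These lines mimic what a text input format would pass to map as its value.
lines <- c("time, sales", "1,206", "2,245", "3,185", "4,169", "5,162", "6,177",
           "7,207", "8,216", "9,193", "10,230", "11,212", "12,192")

# Hypothetical helper: parse a character vector of lines into numeric columns.
parse_sales <- function(lines) {
  fields <- strsplit(lines, ",")
  month <- suppressWarnings(as.numeric(vapply(fields, function(f) f[1], character(1))))
  sales <- suppressWarnings(as.numeric(vapply(fields, function(f) f[2], character(1))))
  ok <- !is.na(month) & !is.na(sales)  # drops the header and any blank lines
  data.frame(month = month[ok], sales = sales[ok])
}

parsed <- parse_sales(lines)

# Same bucketing as the sample code: months 1-4 -> 4, 5-8 -> 8, the rest -> 12.
period <- ifelse(parsed$month <= 4, 4, ifelse(parsed$month <= 8, 8, 12))

# Grouped sum, standing in for the shuffle plus reduce step.
totals <- tapply(parsed$sales, period, sum)
print(totals)
```

If the totals per period look right here, the remaining problems are more likely in how the job is wired to Hadoop than in the aggregation itself.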