covars_make_all returns NAs for baselines

Question

kmunger opened this issue 7 years ago

When I run the function covars_make_all on Hansard speeches, 29 of the 33 measures are returned correctly, but the 4 measures related to word rarity come back as NA.

However, when I run covars_make_baselines on the same corpus, those 4 measures are computed correctly.

library(dplyr)           # filter()
library(quanteda)        # corpus(), ntoken()
library(sophistication)  # assumed source of covars_make_all()

setwd("C:/Users/kevin/Dropbox/Benoit_Spirling_Readability/hansard_data/")
files <- list.files()

## initialize
all_files <- read.csv(files[2], stringsAsFactors = FALSE)
restricted <- filter(all_files, party == "Conservative" | party == "Labour")

# keep only speakers with more than 10 speeches (counted over all parties)
speakers <- all_files$speaker
tab <- table(speakers)
speakers_morethan10 <- names(tab[tab > 10])
restricted <- filter(restricted, speaker %in% speakers_morethan10)

# drop speeches of 10 tokens or fewer
restricted <- restricted[which(ntoken(restricted$text) > 10), ]

data_corpus_speeches66 <- corpus(restricted)

pos <- covars_make_all(data_corpus_speeches66, dependency = FALSE)

> pos$google_min_2000[100]
[1] NA

> pos$brown_mean[1000]
[1] NA
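
To see which measures are affected across the whole result rather than spot-checking single rows, something like the following may help. It is only a sketch in base R: it assumes pos is a data frame, and the toy pos below (including the column name meanWordChars and all values) is made up for illustration, not actual output.

```r
# Toy stand-in for the data frame returned by covars_make_all();
# column meanWordChars and all values are hypothetical.
pos <- data.frame(
  meanWordChars   = c(4.2, 5.1, 3.9),
  google_min_2000 = NA_real_,
  brown_mean      = NA_real_
)

# Names of the columns in which every row is NA
all_na_cols <- names(pos)[colSums(is.na(pos)) == nrow(pos)]
print(all_na_cols)
```

Run against the real pos, this should list the word-rarity columns if they are NA for every document, and fewer columns if only some rows are affected.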

Kenneth Benoit · Answer 1 · Wed Oct 04 2017 23:50:07 GMT+0800 (China Standard Time)

@kmunger is this still a concern, or just an issue to fix (eventually) in the software?

Kevin Munger · Answer 2 · Thu Oct 05 2017 00:15:18 GMT+0800 (China Standard Time)

@kbenoit Not an immediate concern; there's an easy workaround. Just something to fix at some point.
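
The workaround is not spelled out in the thread. Since covars_make_baselines works on the same corpus, one plausible version is to run it separately and copy its columns over the NA ones. The sketch below uses toy data frames in place of the two function results, and it assumes (unverified) that both results have one row per document in the same order and that the baseline columns share names.

```r
# Toy stand-ins; in practice these would be the results of
# covars_make_all() and covars_make_baselines() on the same corpus.
# All values are hypothetical.
pos <- data.frame(
  doc             = 1:3,
  brown_mean      = NA_real_,
  google_min_2000 = NA_real_
)
base <- data.frame(
  brown_mean      = c(0.1, 0.2, 0.3),
  google_min_2000 = c(1e-6, 2e-6, 3e-6)
)

# Assumption: rows line up one-to-one and in the same order
stopifnot(nrow(pos) == nrow(base))

# Overwrite the columns the two results have in common
shared <- intersect(names(pos), names(base))
pos[shared] <- base[shared]
```

If the row order or the column names differ between the two results, a join on a document identifier would be safer than positional assignment.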