EmilHvitfeldt / textdata

Download, parse, store, and load text datasets instead of storing them in packages

Home Page: https://emilhvitfeldt.github.io/textdata/

afinn dataset has improperly labelled columns

ebridge2 opened this issue

For some reason, the afinn dataset seems to have improperly named columns on my local Mac installation.

The columns are "word" and "value", rather than "word" and "sentiment" as the documentation suggests. A previous version of the tidytext package reflects a third specification, "word" and "score".

For reference:

> afinn=lexicon_afinn()
> names(afinn)
[1] "word"    "value"
> sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] textdata_0.3.0

loaded via a namespace (and not attached):
 [1] readr_1.3.1     compiler_3.5.2  R6_2.4.0        hms_0.4.2       tools_3.5.2     pillar_1.3.1    fs_1.2.7        rstudioapi_0.10
 [9] rappdirs_0.3.1  tibble_2.1.1    yaml_2.2.0      crayon_1.3.4    Rcpp_1.0.1      pkgconfig_2.0.2 rlang_0.3.4    

The column specification value = col_double() is the cause of the issue.

Hello @ebridge2!

Thank you for catching this error. I have fixed it to be score, which is consistent with other functions. score is used for numerical vectors in lexicons and sentiment is used for categorical vectors.
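
For anyone whose code has to run against more than one installed version, the three column names mentioned in this thread ("value", "sentiment", "score") can be mapped onto a single name before use. The following is only a minimal sketch in base R: normalize_afinn_names() is a hypothetical helper, not part of textdata or tidytext, and the toy data frame stands in for the result of lexicon_afinn() with made-up rows so the example runs without downloading anything.

```r
# Hypothetical helper (NOT part of textdata): rename whichever of the
# numeric column names discussed in this issue is present to `target`.
normalize_afinn_names <- function(lexicon, target = "score") {
  candidates <- c("value", "sentiment", "score")
  hit <- intersect(candidates, names(lexicon))
  if (length(hit) != 1) {
    stop("Expected exactly one of: ", paste(candidates, collapse = ", "))
  }
  names(lexicon)[names(lexicon) == hit] <- target
  lexicon
}

# Toy stand-in for lexicon_afinn(); the rows are illustrative only.
toy <- data.frame(
  word = c("good", "bad"),
  value = c(3, -3),
  stringsAsFactors = FALSE
)

normalized <- normalize_afinn_names(toy)
names(normalized)
```

The helper stops with an error when none, or more than one, of the candidate names is present, so a silent mismatch like the one reported here would surface immediately instead of propagating into later joins.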