eddelbuettel / rcppsimdjson

Rcpp Bindings for the 'simdjson' Header Library

Add a vignette and pkgdown web page

melsiddieg opened this issue

I would like to volunteer a vignette and pkgdown site to highlight this amazing package.

We have discussed the need, or lack thereof, for pkgdown and don't think we're there yet.

There is also "nothing to do" (in the sense of cookie-cutter-all-the-same sites) and a lot to do in terms of nicer css.

But we can talk about a vignette, and in fact, @knapply and I have. What did you have in mind?

The documentation is already really good. However, I think it would help to format it as a vignette with longer explanations and better-formatted examples. I have the CORD-19 dataset from Kaggle in mind as a use case: it contains about 150,000 JSON files of scientific papers about coronavirus, made available for NLP analysis. I think it would be a compelling use case. I found that I can extract the paper abstracts from all of those files in about 1 minute using RcppSimdJson.


"In principle" a vignette is meant to be re-run at each package build. Downloading and parsing 150k files each time would be madness.

"In practice" one can also do static vignettes, and I often do so. All that said, we don't even have to put it into the package. This sounds like it would also be a nice use case for the Rcpp Gallery, which is also Markdown-based. Maybe you would want to write a post there?

@eddelbuettel that is true, and it would be better as a blog post. However, I still think that the package examples would make for a good vignette.

Closing this for lack of follow-up. We'll write a vignette one day; I just did so for another years-old package...