puddingR
is designed to streamline R data analysis and open data
release for the team at The Pudding. Check out the full
pkgdown
-generated site
here.
Learn more by searching vignette("puddingR")
in your R or RStudio
console.
At the moment, there are no plans for puddingR
to appear on CRAN. To
download:
# Install development version from GitHub
devtools::install_github("the-pudding/puddingR")
# Load into your R environment
library(puddingR)
The puddingR
package currently has a few main features:
For our front-end work, the team at The Pudding, uses a starter template which sets up anything that could be useful for our work and creates a file directory system. The directory helps the team (and the author) know exactly where to find everything they need for a project.
The create_project()
function does the same thing for R projects. It
creates a single folder, by default called analysis
, creates a new R
project and several directories. The resulting structure looks like
this:
analysis
|-- assets
|-- data
|-- open_data
|-- processed_data
|-- raw_data
|-- packrat
|-- plots
|-- R
|-- reports
|-- rmds
|-- analysis.Rproj
Many of the remaining functions in this package use default values that assume the above folder structure is present.
puddingR::create_project(title = "My Cool Project")
While RStudio comes with several themes to use for Rmarkdown reports, I have created one specifically for use with Pudding projects. It contains our logo and lots of tips for how to import and display data.
To use it, in RStudio, navigate to File
> New File
> Rmarkdown
A window will pop up with a list of options on the left-hand side.
Select From Template
from that list, and navigate to Pudding Report
.
This will launch open a new Rmd file filled with tips that can be
deleted if you don’t need them. If you’re using the above file
structure, save this into the rmds
folder.
Often in our work we need to write data that we have analyzed and save
it in another location. In this workflow, most processed data belongs in
assets/data/processed_data/
. The easiest way to get it there is to use
export_data()
. This function has several arguments, included in them
is location
. If you are using the default file structure from
puddingR
, you can simply use open
to save it to the open_data
directory, processed
to save it to the processed_data
directory, or,
if this analysis
folder exists in the same working directory as the
front-end Pudding starter template, you can use js
(which will save
the file in src/assets/data/
for use in front-end work).
# To export mtcars dataset to the processed_data directory
puddingR::export_data(mtcars, "cars_data", location = "processed", directory = "auto", format = "csv")
If you want to export codebooks and the scripts used to generate data at
the same time the data is generated, you can use export_all()
instead.
# To export mtcars dataset to the open_data directory
puddingR::export_all(mtcars, "cars_data",
location = "open",
directory = "auto",
format = "csv",
# Also export a codebook to accompany the data
codebook = TRUE,
# Export the code chunks from an Rmd file used to generate the data
scripts = c("load_data", "analyze_data", "graph_data"),
scriptFile = here::here("rmds/analysis.Rmd")
)
These pieces can be accomplished separately using export_data()
,
create_codebook()
and export_code()
.
The default directory for a codebook is in open_data/intermediates
.
These files should be opened and manually edited to provide the correct
metadata to accompany an open data file.
Once all of the intermediate codebooks have been edited, they can all be
combined into a single file using knit_data_readme()
.
puddingR::knit_data_readme()
Find more information on how to use this package in the vignettes or on
the pkgdown
site.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.