Security footgun with Rmd generation
jcheng5 opened this issue
The `{{`/`}}` `knit_expand` mechanism is a bit too low-level; since it works on a purely textual level, it's easy for values intended to be text to be interpreted as code instead.
E.g. an Rmd template that contains this snippet (not in a code block):
The variable we'll be focusing on in this report is {{col_name}}.
And then `buildRmdBundle(..., vars = list(col_name = input$col_name))`.
A malicious client could easily send a `col_name` value of "\n```{r}\nunlink(\"whatever\", recursive=TRUE)\n```\n", which would expand into the Rmd template as:
The variable we'll be focusing on in this report is
```{r}
unlink("whatever", recursive=TRUE)
```
.
Some ideas:
- (Won't work, see my next comment) We could add the values in `vars` to the knitr environment, then tell people to do `r col_name` instead of `{{col_name}}`.
- (Won't work, see my next comment) We could recommend people pass non-code values through `params`. That seems like a higher-overhead version of the previous option though.
- We could error on suspicious `vars` or `{{`/`}}` substitutions. Yuck, security heuristics.
- We could stop using `knit_expand` altogether, and use a different chunk type to insert code dynamically. This would probably be a considerable amount of knitr hacking; I think I looked at this last time and couldn't find an obvious way to preprocess a chunk and then have knitr treat it as code.
> We could add the values in `vars` to the knitr environment, then tell people to do `r col_name` instead of `{{col_name}}`.
It seems like this could get complicated quickly (from a user point of view). Considering that we currently recommend the `params` approach when generating Rmd reports from Shiny, at least my initial feeling is we should go that route and document it better in our vignettes.
Shoot, neither `r col_name` nor `params` will work in this case, as it's not reproducible (unless we also provide them with a .R script that invokes `rmarkdown::render`). Meaning, the report PDF/HTML generated using `buildRmdBundle(render = TRUE)` will be correct, but they won't get the same results by just knitting the .Rmd, which is the whole point.
🤔
We could use our own mustache variant that forces you to indicate whether the thing you're rendering is "text" or "code". If text, then we coerce the result to character and escape any character that has special meaning (for Rmd that'd be backtick -> `&#96;`, I suppose?). If code, then the value would be deparsed.
(Hmmm, we might already have a bug here, if you put a {{placeholder}} in a code block and the value is a string I think it'll go into the Rmd verbatim rather than being deparsed.)
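To make the text-vs-code idea concrete, here is a minimal sketch in base R. Both helper names (`render_text`, `render_code`) are hypothetical, and using the HTML entity `&#96;` as the escaped form of a backtick is an assumption, not something settled in this thread.

```r
# Hypothetical helpers, not part of any existing package.

# Text: coerce to character, then replace every backtick so the value
# cannot open a code chunk or an inline code span.
# The `&#96;` entity is an assumed escape form.
render_text <- function(x) {
  gsub("`", "&#96;", as.character(x), fixed = TRUE)
}

# Code: deparse the value, so a string becomes a quoted R string literal
# instead of being pasted into the chunk verbatim.
render_code <- function(x) {
  paste(deparse(x), collapse = "\n")
}

evil <- "\n```{r}\nunlink('whatever', recursive=TRUE)\n```\n"
cat(render_text(evil))          # no backticks survive in the output
cat(render_code("mpg"), "\n")   # the string is emitted with its quotes
```

With something like this, a placeholder in prose would go through `render_text()` and a placeholder inside a chunk through `render_code()`, which would also address the "string goes in verbatim" concern above.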
If we're willing to live with heuristics, we could do something like:
Throws on `` \n```{ `` and `` `r ``:
buildRmdBundle(..., vars = list(col_name = input$col_name))
Doesn't throw:
buildRmdBundle(..., vars = list(col_name = input$col_name), allow_unsafe_values = TRUE)
Throws only for `col_name`:
buildRmdBundle(..., vars = list(col_name = input$col_name, code_stuff = allow_unsafe(input$foo)), allow_unsafe_values = TRUE)
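A rough sketch of how such a check might look, assuming a hypothetical `check_vars()` that `buildRmdBundle()` would call on `vars` before expansion. Neither `check_vars()` nor the `"allow_unsafe"` class exists today; in this reading, the global flag skips all checks and `allow_unsafe()` exempts a single value.

```r
# Hypothetical marker: tag a value as intentionally containing code.
allow_unsafe <- function(x) {
  structure(x, class = c("allow_unsafe", class(x)))
}

# Hypothetical check: stop if an unmarked value contains one of the two
# patterns named above (a newline followed by a chunk fence, or "`r").
check_vars <- function(vars, allow_unsafe_values = FALSE) {
  if (allow_unsafe_values) return(invisible(vars))
  for (name in names(vars)) {
    value <- vars[[name]]
    if (inherits(value, "allow_unsafe")) next
    text <- paste(as.character(value), collapse = "\n")
    suspicious <- grepl("\n```{", text, fixed = TRUE) ||
      grepl("`r", text, fixed = TRUE)
    if (suspicious) {
      stop("Suspicious value for '", name, "'; wrap it in allow_unsafe() if intended")
    }
  }
  invisible(vars)
}

# A plain column name passes; an injected chunk is rejected.
check_vars(list(col_name = "mpg"))
try(check_vars(list(col_name = "\n```{r}\nunlink('whatever')\n```\n")))
```

Being a heuristic, this will also reject harmless values that happen to contain a backtick followed by `r`, which is part of why it feels yucky.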
Now also exploring an approach where we parse the .Rmd before and after `knit_expand`, and if the number of chunks has changed, we fail by default.
Update 2021-01-28: https://github.com/rundel/parsermd exists. I don't know if it shows us inline code chunks though, which we would need.
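A minimal sketch of the before/after comparison, using a regex count rather than a real Rmd parser and a plain `gsub()` substitution as a stand-in for `knit_expand`. The names `count_code()` and `safe_expand()` are made up for illustration, and the regexes are only an approximation of what knitr recognizes.

```r
# Hypothetical helper: count chunk-opening fences plus inline `r ...` spans
# in a single Rmd string. A regex approximation, not a real parser.
count_code <- function(text) {
  lines <- unlist(strsplit(text, "\n", fixed = TRUE))
  n_chunks <- sum(grepl("^\\s*```+\\s*\\{", lines))
  inline <- gregexpr("`r\\s", text)[[1]]
  n_inline <- sum(inline > 0)  # gregexpr returns -1 when nothing matches
  n_chunks + n_inline
}

# Hypothetical wrapper: substitute {{name}} placeholders, then fail if the
# amount of code in the document changed as a result.
safe_expand <- function(template, vars) {
  expanded <- template
  for (name in names(vars)) {
    # fixed = TRUE matches the placeholder literally
    expanded <- gsub(paste0("{{", name, "}}"), as.character(vars[[name]]),
                     expanded, fixed = TRUE)
  }
  if (count_code(expanded) != count_code(template)) {
    stop("Substitution changed the number of code chunks")
  }
  expanded
}

template <- "The variable we'll be focusing on in this report is {{col_name}}."
safe_expand(template, list(col_name = "mpg"))
try(safe_expand(template, list(col_name = "\n```{r}\nunlink('whatever')\n```\n")))
```

This catches injected inline code as well as injected chunks, which is the part a chunk-only parser might miss.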
Advice from Yihui, circa June 2020:
`knit_expand()` evaluates the expression `{{code}}` by `knitr::knit_hooks$get('evaluate.inline')`. If you have security concerns, it is probably a better idea to define the `evaluate.inline` hook and examine the code before evaluating it, e.g., (not tested)

    library(knitr)
    eval_inline = knit_hooks$get('evaluate.inline')  # original hook
    knit_hooks$set(evaluate.inline = function(code, envir) {
      code = xfun::split_lines(code)
      if (any(grepl(all_patterns$md$chunk.begin, code))) stop(
        'You are not allowed to include a code chunk in the variable'
      )
      eval_inline(code, envir)
    })

We'll also need some way to hook into multi-line code chunks (not just inline). More specifically, note how an error isn't raised here:
---
title: "Untitled"
output: html_document
---
```{r setup, include=FALSE}
library(knitr)
eval_inline = knit_hooks$get('evaluate.inline') # original hook
knit_hooks$set(evaluate.inline = function(code, envir) {
  code = xfun::split_lines(code)
  if (any(grepl(all_patterns$md$chunk.begin, code))) stop(
    'You are not allowed to include a code chunk in the variable'
  )
  eval_inline(code, envir)
})
```
## R Markdown
```{r cars}
summary(cars)
```
Fixed by #92