rstudio / shinymeta

Record and expose Shiny app logic using metaprogramming

Home Page: https://rstudio.github.io/shinymeta

Security footgun with Rmd generation

jcheng5 opened this issue

The {{ / }} knit_expand mechanism is a bit too low-level; since it works on a purely textual level, it's easy for values intended to be text to be interpreted as code instead.

e.g. an Rmd template that contains this snippet (not in a code block):

The variable we'll be focusing on in this report is {{col_name}}.

And then buildRmdBundle(..., vars = list(col_name = input$col_name)).

A malicious client could easily send a col_name value of "\n```{r}\nunlink(\"whatever\", recursive=TRUE)\n```\n", which would expand into the Rmd template as

The variable we'll be focusing on in this report is 
```{r}
unlink("whatever", recursive=TRUE)
```
.
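The expansion above can be reproduced with knitr::knit_expand directly. This is a minimal sketch, assuming knitr is installed; the template and the malicious value are the ones from this issue, and nothing is executed here because the result is only printed, not knitted.

```r
library(knitr)

# Template text outside any code chunk, as in the example above.
template <- "The variable we'll be focusing on in this report is {{col_name}}."

# Value a malicious client could send in place of a column name.
malicious <- "\n```{r}\nunlink(\"whatever\", recursive=TRUE)\n```\n"

# knit_expand() substitutes purely textually, so the value lands in the
# document as a brand-new R chunk.
expanded <- knit_expand(text = template, col_name = malicious)
cat(expanded)
```

Knitting the expanded text would run the injected chunk, which is the footgun described here.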

Some ideas:

  • We could add the values in vars to the knitr environment, then tell people to do `r col_name` instead of {{col_name}}. (Won't work, see my next comment)
  • We could recommend people pass non-code values through params. That seems like a higher-overhead version of the previous option though. (Won't work, see my next comment)
  • We could error on suspicious vars or {{/}} substitutions. Yuck, security heuristics.
  • We could stop using knit_expand altogether, and use a different chunk type to insert code dynamically. This would probably be a considerable amount of knitr hacking; I think I looked at this last time and couldn't find an obvious way to preprocess a chunk and then have knitr treat it as code.

> We could add the values in vars to the knitr environment, then tell people to do `r col_name` instead of {{col_name}}.

It seems like this could get complicated quickly (from a user point of view). Considering that we currently recommend the params approach when generating Rmd reports from Shiny, at least my initial feeling is we should go that route and document it better in our vignettes.

Shoot, neither `r col_name` nor params will work in this case, as it's not reproducible (unless we also provide them with a .R script that invokes rmarkdown::render). Meaning, the report PDF/HTML generated using buildRmdBundle(render = TRUE) will be correct, but they won't get the same results by just knitting the .Rmd, which is the whole point.

🤔

We could use our own mustache variant that forces you to indicate whether the thing you're rendering is "text" or "code". If text, we coerce the result to character and then escape any character that has special meaning (for Rmd that'd be backtick -> &#96;, I suppose?). If code, the value would be deparsed.
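A rough sketch of the text/code split in base R. The function names (render_text, render_code, render_placeholder) are hypothetical, and replacing the backtick with the HTML entity &#96; is an assumption about what a sufficient escape would look like, not something shinymeta does.

```r
# Hypothetical helpers; none of these exist in shinymeta or knitr.

# "text": coerce to character, then neutralise backticks so the value
# cannot open a chunk fence or an inline `r ...` expression.
render_text <- function(value) {
  text <- paste(as.character(value), collapse = " ")
  gsub("`", "&#96;", text, fixed = TRUE)
}

# "code": deparse, so a string becomes a quoted R string literal
# rather than being pasted into the chunk verbatim.
render_code <- function(value) {
  paste(deparse(value), collapse = "\n")
}

render_placeholder <- function(value, kind = c("text", "code")) {
  kind <- match.arg(kind)
  switch(kind, text = render_text(value), code = render_code(value))
}

malicious <- "\n```{r}\nunlink(\"whatever\", recursive=TRUE)\n```\n"
cat(render_placeholder(malicious, "text"), "\n")
cat(render_placeholder("Sepal.Length", "code"), "\n")
```

With this split the template author, not the client, decides whether a value may ever be treated as code.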

(Hmmm, we might already have a bug here: if you put a {{placeholder}} in a code block and the value is a string, I think it'll go into the Rmd verbatim rather than being deparsed.)

If we're willing to live with heuristics, we could do something like:

Throws on \n```{ and `r :

buildRmdBundle(..., vars = list(col_name = input$col_name))

Doesn't throw:

buildRmdBundle(..., vars = list(col_name = input$col_name), allow_unsafe_values = TRUE)

Throws only for col_name:

buildRmdBundle(..., vars = list(col_name = input$col_name, code_stuff = allow_unsafe(input$foo)), allow_unsafe_values = TRUE)
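One way the heuristic check could look, as a sketch only. allow_unsafe is the wrapper proposed above; is_suspicious and check_vars are hypothetical names, and the regular expression is a simplified guess at "looks like a chunk header or inline R code", not knitr's own pattern.

```r
# Hypothetical sketch; these functions are not part of shinymeta.

# Mark a value as deliberately allowed to contain code-like text.
allow_unsafe <- function(x) {
  structure(x, class = c("unsafe_ok", class(x)))
}

# Heuristic: a fence followed by "{" at the start of a line, or inline `r code.
is_suspicious <- function(x) {
  is.character(x) && any(grepl("(^|\n)[\t >]*```+\\s*\\{|`r[ #]", x))
}

# Stop if any value not wrapped in allow_unsafe() looks like code.
check_vars <- function(vars) {
  bad <- vapply(
    vars,
    function(v) !inherits(v, "unsafe_ok") && is_suspicious(v),
    logical(1)
  )
  if (any(bad)) {
    stop("Suspicious value(s) in vars: ",
         paste(names(vars)[bad], collapse = ", "), call. = FALSE)
  }
  invisible(vars)
}

check_vars(list(col_name = "Sepal.Length"))           # passes silently
check_vars(list(code_stuff = allow_unsafe("`r x`")))  # passes, explicitly allowed
```

As noted above, this is a security heuristic, so it can only catch the patterns it knows about.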

Now also exploring an approach where we parse the .Rmd before and after knit_expand, and if the number of chunks has changed, we fail by default.
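A sketch of the before/after comparison, assuming knitr is installed. count_code and checked_expand are hypothetical names, and the chunk-header and inline-code patterns are simplified approximations rather than a real Rmd parser.

```r
library(knitr)

# Count chunk headers and inline `r ...` openers in a piece of Rmd text.
# The patterns are simplified approximations, not knitr's own.
count_code <- function(text) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  chunks <- sum(grepl("^[\t >]*```+\\s*\\{[a-zA-Z0-9_]+.*\\}\\s*$", lines))
  inline <- lengths(regmatches(text, gregexpr("`r[ #]", text)))
  c(chunks = chunks, inline = inline)
}

# Expand the template, then fail if the substitution changed the amount of code.
checked_expand <- function(template, vars) {
  expanded <- do.call(knit_expand, c(list(text = template), vars))
  if (!identical(count_code(template), count_code(expanded))) {
    stop("Expansion changed the number of code chunks or inline expressions",
         call. = FALSE)
  }
  expanded
}

template <- "Focus: {{col_name}}.\n\n```{r}\nsummary(cars)\n```\n"
cat(checked_expand(template, list(col_name = "Sepal.Length")))
```

Placeholders that sit inside an existing chunk leave both counts unchanged, so legitimate code substitution still passes this check.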


Update 2021-01-28: https://github.com/rundel/parsermd exists. I don't know if it shows us inline code chunks though, which we would need.

Advice from Yihui, circa June 2020:

knit_expand() evaluates the expression {{code}} by knitr::knit_hooks$get('evaluate.inline'). If you have security concerns, it is probably a better idea to define the evaluate.inline hook and examine the code before evaluating it, e.g., (not tested)

library(knitr)
eval_inline = knit_hooks$get('evaluate.inline')  # original hook
knit_hooks$set(evaluate.inline = function(code, envir) {
  code = xfun::split_lines(code)
  if (any(grepl(all_patterns$md$chunk.begin, code))) stop(
    'You are not allowed to include a code chunk in the variable'
  )
  eval_inline(code, envir)
})

We'll also need some way to hook into multi-line code chunks (not just inline). More specifically, note how an error isn't raised here:

---
title: "Untitled"
output: html_document
---

```{r setup, include=FALSE}
library(knitr)
eval_inline = knit_hooks$get('evaluate.inline')  # original hook
knit_hooks$set(evaluate.inline = function(code, envir) {
  code = xfun::split_lines(code)
  if (any(grepl(all_patterns$md$chunk.begin, code))) stop(
    'You are not allowed to include a code chunk in the variable'
  )
  eval_inline(code, envir)
})
```

## R Markdown

```{r cars}
summary(cars)
```

Fixed by #92