rstudio / renv

renv: Project environments for R.

Home Page: https://rstudio.github.io/renv/

lockfile contains more than just packages which appear in my project

johnForne opened this issue

Kia ora

I am new to renv:: and think this sounds like it could be really useful for helping make our code more reproducible.

However, I've run into an issue, and I don't know whether the problem is me or the code...

The situation is that I have an existing project that I've run 'renv::init()' within and I then checked the lockfile to see what packages it listed. The issue is that it seems to list every package that I've ever used in R - rather than only the packages within the project.

The renv reference material says:

(The default) Capture only packages which appear to be used in your project, as determined by renv::dependencies(). This ensures that only the packages actually required by your project will enter the lockfile; the downside is it might be slow if your project contains a large number of files. If speed becomes an issue, you might consider using .renvignore files to limit which files renv uses for dependency discovery, or switching to explicit mode, as described next.

I tested this by calling 'renv::dependencies()' and found that it listed the 43 packages I expected for my project. In contrast, 'renv::lockfile_read()' lists 254 packages!

library(renv)
library(tidyverse)
init()
d <- dependencies()
d %>% 
  distinct(Package) %>% 
  arrange(Package)

> d %>% 
+   distinct(Package) %>% 
+   arrange(Package)
           Package
1               DT
2              GMS
3       Polychrome
4     RColorBrewer
5           aws.s3
6             base
7             brms
8          dggridR
9            dplyr
10      er.helpers
11    er.templates
12         forcats
13         ggplot2
14            glue
15           haven
16         janitor
17           knitr
18         leaflet
19        magrittr
20            pals
21          plotly
22           purrr
23     rcartocolor
24           readr
25          readxl
26            renv
27           rlang
28       rmarkdown
29       rsconnect
30          scales
31              sf
32           sfdep
33           shiny
34    shinyWidgets
35 shinycssloaders
36       simplevis
37       snakecase
38         stringr
39           tidyr
40       tidyverse
41         viridis
42     wesanderson
43             zip
> 

l <- lockfile_read()
l$Packages %>% 
  names()

> l$Packages %>% 
+   names()
  [1] "BH"                "Brobdingnag"       "DBI"               "DT"                "GMS"
  [6] "KernSmooth"        "MASS"              "Matrix"            "Polychrome"        "QuickJSR"
 [11] "R6"                "RColorBrewer"      "Rcpp"              "RcppEigen"         "RcppParallel"
 [16] "StanHeaders"       "abind"             "anytime"           "askpass"           "aws.s3"
 [21] "aws.signature"     "backports"         "base64enc"         "bayesplot"         "bit"
 [26] "bit64"             "blob"              "boot"              "brew"              "bridgesampling"
 [31] "brio"              "brms"              "broom"             "bslib"             "cachem"
 [36] "callr"             "cellranger"        "checkmate"         "class"             "classInt"
 [41] "cli"               "clipr"             "coda"              "codetools"         "colorspace"
 [46] "colourpicker"      "commonmark"        "conflicted"        "cpp11"             "crayon"
 [51] "credentials"       "crosstalk"         "curl"              "data.table"        "dbplyr"
 [56] "deldir"            "desc"              "devtools"          "dggridR"           "dichromat"
 [61] "diffobj"           "digest"            "distributional"    "downlit"           "dplyr"
 [66] "dtplyr"            "dygraphs"          "e1071"             "ellipsis"          "er.helpers"
 [71] "er.templates"      "evaluate"          "extraDistr"        "fansi"             "farver"
 [76] "fastmap"           "fontawesome"       "forcats"           "fs"                "future"
 [81] "gargle"            "generics"          "geojsonsf"         "geometries"        "get"
 [86] "ggplot2"           "ggridges"          "gh"                "git2r"             "gitcreds"
 [91] "globals"           "glue"              "googledrive"       "googlesheets4"     "gridExtra"
 [96] "gtable"            "gtools"            "haven"             "highr"             "hms"
[101] "htmltools"         "htmlwidgets"       "httpuv"            "httr"              "httr2"
[106] "ids"               "igraph"            "ini"               "inline"            "isoband"
[111] "janitor"           "jquerylib"         "jsonify"           "jsonlite"          "knitr"
[116] "labeling"          "later"             "lattice"           "lazyeval"          "leafem"
[121] "leaflet"           "leaflet.providers" "leafpop"           "lifecycle"         "listenv"
[126] "loo"               "lubridate"         "lwgeom"            "magrittr"          "mapproj"
[131] "maps"              "markdown"          "matrixStats"       "memoise"           "mgcv"
[136] "mime"              "miniUI"            "modelr"            "munsell"           "mvtnorm"
[141] "networkD3"         "nleqslv"           "nlme"              "numDeriv"          "odbc"
[146] "openssl"           "packrat"           "pals"              "parallelly"        "pillar"
[151] "pkgbuild"          "pkgconfig"         "pkgdown"           "pkgload"           "plotly"
[156] "plyr"              "png"               "posterior"         "praise"            "prettyunits"
[161] "processx"          "profvis"           "progress"          "promises"          "proxy"
[166] "ps"                "purrr"             "ragg"              "rapidjsonr"        "rappdirs"
[171] "raster"            "rcartocolor"       "rcmdcheck"         "readr"             "readxl"
[176] "rematch"           "rematch2"          "remotes"           "renv"              "reprex"
[181] "reshape2"          "rgeos"             "rlang"             "rmarkdown"         "roxygen2"
[186] "rprojroot"         "rsconnect"         "rstan"             "rstantools"        "rstudioapi"
[191] "rversions"         "rvest"             "s2"                "sass"              "scales"
[196] "scatterplot3d"     "selectr"           "sessioninfo"       "sf"                "sfdep"
[201] "sfheaders"         "shiny"             "shinyWidgets"      "shinycssloaders"   "shinyjs"
[206] "shinystan"         "shinythemes"       "simplevis"         "snakecase"         "sourcetools"
[211] "sp"                "spData"            "spdep"             "stars"             "string"
[216] "stringr"           "svglite"           "sys"               "systemfonts"       "tensorA"
[221] "terra"             "testthat"          "textshaping"       "threejs"           "tibble"
[226] "tidyr"             "tidyselect"        "tidyverse"         "timechange"        "tinytex"
[231] "trend"             "tzdb"              "units"             "urlchecker"        "usethis"
[236] "utf8"              "uuid"              "vctrs"             "viridis"           "viridisLite"
[241] "vroom"             "waldo"             "wesanderson"       "whisker"           "withr"
[246] "wk"                "xfun"              "xml2"              "xopen"             "xtable"
[251] "xts"               "yaml"              "zip"               "zoo"

Interestingly, I then tested what happened if I set up a brand new project with only one 'test.R' file in it with the following code...

This time I found that 'd' had the two packages (renv:: + tidyverse::) that I expected. However, the lockfile contained 108 packages, which is far more than the 31 packages in tidyverse plus the 1 renv package.

library(renv)
library(tidyverse)
renv::init()

library(renv)
library(tidyverse)

d <- dependencies()
d %>% 
  distinct(Package) %>% 
  arrange(Package)

l <- lockfile_read()
l$Packages %>% 
  names()

tidyverse_packages()

Can you please let me know how to actually "Capture only packages which appear to be used in your project"?

Thanks in advance,

John

The lockfile captures both the top-level package dependencies and those packages' recursive dependencies. Could that explain why? For example, the tidyverse package has a large number of recursive dependencies:

> tools::package_dependencies("tidyverse", recursive = TRUE)[[1]]
  [1] "broom"         "conflicted"    "cli"           "dbplyr"
  [5] "dplyr"         "dtplyr"        "forcats"       "ggplot2"
  [9] "googledrive"   "googlesheets4" "haven"         "hms"
 [13] "httr"          "jsonlite"      "lubridate"     "magrittr"
 [17] "modelr"        "pillar"        "purrr"         "ragg"
 [21] "readr"         "readxl"        "reprex"        "rlang"
 [25] "rstudioapi"    "rvest"         "stringr"       "tibble"
 [29] "tidyr"         "xml2"          "backports"     "ellipsis"
 [33] "generics"      "glue"          "lifecycle"     "utils"
 [37] "memoise"       "blob"          "DBI"           "methods"
 [41] "R6"            "tidyselect"    "vctrs"         "withr"
 [45] "data.table"    "grDevices"     "grid"          "gtable"
 [49] "isoband"       "MASS"          "mgcv"          "scales"
 [53] "stats"         "gargle"        "uuid"          "cellranger"
 [57] "curl"          "ids"           "rematch2"      "cpp11"
 [61] "pkgconfig"     "mime"          "openssl"       "timechange"
 [65] "fansi"         "utf8"          "systemfonts"   "textshaping"
 [69] "clipr"         "crayon"        "vroom"         "tzdb"
 [73] "progress"      "callr"         "fs"            "knitr"
 [77] "rmarkdown"     "selectr"       "stringi"       "processx"
 [81] "rematch"       "rappdirs"      "evaluate"      "highr"
 [85] "tools"         "xfun"          "yaml"          "graphics"
 [89] "cachem"        "nlme"          "Matrix"        "splines"
 [93] "askpass"       "prettyunits"   "bslib"         "fontawesome"
 [97] "htmltools"     "jquerylib"     "tinytex"       "farver"
[101] "labeling"      "munsell"       "RColorBrewer"  "viridisLite"
[105] "bit64"         "sys"           "bit"           "base64enc"
[109] "sass"          "fastmap"       "digest"        "lattice"
[113] "colorspace"    "ps"
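As a minimal sketch of what "recursive dependencies" means here, the base R snippet below walks a small made-up dependency graph and collects everything reachable from the direct dependencies. The graph `toy_deps`, the package names in it, and the helper `recursive_deps()` are hypothetical and exist only for this illustration; they are not part of renv.

```r
# Hypothetical dependency graph: each name maps to the packages it needs directly.
toy_deps <- list(
  myproject = c("pkgA", "pkgB"),
  pkgA = c("pkgC"),
  pkgB = c("pkgC", "pkgD"),
  pkgC = character(),
  pkgD = c("pkgE"),
  pkgE = character()
)

# Collect every package reachable from `roots` by following direct dependencies.
recursive_deps <- function(roots, graph) {
  seen <- character()
  queue <- roots
  while (length(queue) > 0L) {
    pkg <- queue[[1L]]
    queue <- queue[-1L]
    if (pkg %in% seen) next          # already visited, skip
    seen <- c(seen, pkg)
    queue <- c(queue, graph[[pkg]])  # enqueue this package's own dependencies
  }
  sort(seen)
}

direct <- toy_deps[["myproject"]]         # what the project itself asks for
full <- recursive_deps(direct, toy_deps)  # what would need to be installed

print(direct)
print(full)
```

In this toy graph the project names two packages directly, but five packages are needed once their dependencies are followed. The same effect, at a larger scale, would account for a lockfile that is several times longer than the output of 'dependencies()'.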

Thanks Kevin - much appreciated.
That's good to know. I wonder if it is possible/makes sense to have an optional argument to limit the lockfile to include (direct) dependencies only? Like what 'dependencies()' returns?

But that would give you an incomplete lockfile -- if you tried to call renv::restore(), we wouldn't know what versions of those packages' dependencies you need, and so you'd risk issues due to a change in the R library state.
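For anyone who only wants to see the direct dependencies without changing what the lockfile records, one possible approach is to do the filtering in R after the fact. The sketch below is an assumption about how that could look: `split_lockfile()` is a hypothetical helper, not an renv function, and the package names in the example are a small subset taken from the output earlier in this thread.

```r
# Hypothetical helper: partition lockfile entries by whether the project
# references them directly. Both arguments are plain character vectors.
split_lockfile <- function(lockfile_pkgs, direct_pkgs) {
  list(
    direct   = intersect(lockfile_pkgs, direct_pkgs),  # locked and used directly
    indirect = setdiff(lockfile_pkgs, direct_pkgs),    # locked only as dependencies
    unlocked = setdiff(direct_pkgs, lockfile_pkgs)     # used directly, not in lockfile
  )
}

# Small illustrative inputs (a subset of the names shown in this thread).
lockfile_pkgs <- c("DT", "R6", "Rcpp", "dplyr", "rlang", "vctrs")
direct_pkgs <- c("DT", "base", "dplyr", "rlang")

parts <- split_lockfile(lockfile_pkgs, direct_pkgs)
print(parts)

# In a real renv project the inputs could come from renv itself (not run here):
# parts <- split_lockfile(
#   names(renv::lockfile_read()$Packages),
#   unique(renv::dependencies()$Package)
# )
```

This leaves the lockfile complete, so renv::restore() still knows every version, while giving a shorter list to read. In the example, "base" ends up in `unlocked`, which matches the output above where 'dependencies()' reports base but the lockfile does not list it.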

I wonder if it is possible/makes sense to have an optional argument to limit the lockfile to include (direct) dependencies only? Like what 'dependencies()' returns?

I think this would be useful for my purposes. I find the inclusion of recursive dependencies distracting, and it sometimes leads to dependency conflicts.

But that would give you an incomplete lockfile -- if you tried to call renv::restore(), we wouldn't know what versions of those packages' dependencies you need, and so you'd risk issues due to a change in the R library state.

Is there a way to just rely on the underlying package dependency specifications to identify these versions? Coming from Python, you can just add the main packages you need installed to the requirements.txt (or environment.yml if using conda), and pip will automatically install the dependencies of those packages as needed.

For example, if I have a project that requires pandas 2.2.1, pip/conda will also install numpy 1.26.4 as a dependency (based on pandas declaring numpy<2 in its package metadata) without my needing to pin that version of numpy in the dependency specs.
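The resolution behaviour described here, where the installer picks the newest version that satisfies a declared constraint, can be sketched in base R. This is only an illustration under stated assumptions: `resolve_below()` is a hypothetical helper, the version lists are made up for the example (including a later 1.27.0 release that is purely hypothetical), and none of it reflects how renv or pip is implemented.

```r
# Hypothetical helper: pick the newest available version strictly below `upper`.
resolve_below <- function(available, upper) {
  candidates <- available[available < package_version(upper)]
  if (length(candidates) == 0L) stop("no version satisfies the constraint")
  as.character(max(candidates))
}

# Assumed set of versions on offer at install time (illustrative only).
available_now <- package_version(c("1.26.3", "1.26.4", "2.0.0"))
resolved_now <- resolve_below(available_now, "2")

# Same constraint, but a newer release (hypothetical) has appeared since.
available_later <- c(available_now, package_version("1.27.0"))
resolved_later <- resolve_below(available_later, "2")

print(resolved_now)
print(resolved_later)
```

The constraint is identical in both calls, yet the resolved version differs because the set of available versions changed. That is the trade-off between managing only direct dependencies and recording every resolved version in a lockfile.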

I'd find it easier to just manage these direct dependencies, but having less familiarity with R than Python's ecosystem, I might be missing the mark here.