rstudio / rsconnect

Publish Shiny Applications, RMarkdown Documents, Jupyter Notebooks, Plumber APIs, and more

Home Page: http://rstudio.github.io/rsconnect/

Problems when CRAN returns multiple matches for a package

slodge-work opened this issue · comments

Background: https://community.rstudio.com/t/posit-packagemanager-too-many-rodbcs/173831/9

This has been quite an elusive problem to track down and reproduce, not least because the first time I noticed it I made incomplete notes on what it was (sorry, and thanks to Greg for assisting!)

The problem is that, in some rare cases, a CRAN repository will list old versions of a package as well as the new one.

In this situation, rsconnect's isDevVersion (and other functions) can pick out the wrong entry for the package - https://github.com/rstudio/rsconnect/blob/41907afefe219a5e00496c0ba99746eca38fbf01/R/bundlePackagePackrat.R#L107C1-L118C2
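As a minimal sketch of how a duplicate entry can mislead a version check: `available.packages()` returns a character matrix whose row names are package names, and indexing a matrix by row name returns only the first matching row. The matrix contents and the helper `is_dev_first_match()` below are hypothetical illustrations, not rsconnect's actual code.

```r
# Hypothetical repository index with a stale entry listed before the current one.
available <- rbind(
  c(Package = "RODBC", Version = "1.3-16"),
  c(Package = "RODBC", Version = "1.3-21")
)
rownames(available) <- available[, "Package"]

# Indexing by row name silently picks the first matching row only.
first_match <- available["RODBC", "Version"]

# Hypothetical helper: treats a package as a dev version when the local
# version is newer than the first-matched repository version.
is_dev_first_match <- function(pkg, available) {
  repo_version <- available[pkg$Package, "Version"]
  package_version(pkg$Version) > package_version(repo_version)
}

# The released 1.3-21 is compared against the stale 1.3-16 entry,
# so it is misclassified as a development version.
released <- list(Package = "RODBC", Version = "1.3-21")
wrongly_dev <- is_dev_first_match(released, available)
```

Whether the check goes wrong therefore depends on the order in which the repository happens to list the duplicate rows.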

To repro this, you can look at the differences between these two CRANs:

available <- rsconnect:::availablePackages(list(CRAN = "https://packagemanager.posit.co/cran/latest"))
pkg <- list(Package = "RODBC", Version = "1.3-16")
rsconnect:::isDevVersion(pkg, available)
pkg <- list(Package = "RODBC", Version = "1.3-21")
rsconnect:::isDevVersion(pkg, available)

available <- rsconnect:::availablePackages(list(CRAN = "https://cran.rstudio.com"))
pkg <- list(Package = "RODBC", Version = "1.3-16")
rsconnect:::isDevVersion(pkg, available)
pkg <- list(Package = "RODBC", Version = "1.3-21")
rsconnect:::isDevVersion(pkg, available)

Current:

> available <- rsconnect:::availablePackages(list(CRAN = "https://packagemanager.posit.co/cran/latest"))
> pkg <- list(Package = "RODBC", Version = "1.3-16")
> rsconnect:::isDevVersion(pkg, available)
[1] FALSE
> pkg <- list(Package = "RODBC", Version = "1.3-21")
> rsconnect:::isDevVersion(pkg, available)
[1] TRUE

> available <- rsconnect:::availablePackages(list(CRAN = "https://cran.rstudio.com"))
> pkg <- list(Package = "RODBC", Version = "1.3-16")
> rsconnect:::isDevVersion(pkg, available)
[1] FALSE
> pkg <- list(Package = "RODBC", Version = "1.3-21")
> rsconnect:::isDevVersion(pkg, available)
[1] FALSE

... although I can't guarantee that the repro will keep working, because it depends on the contents of both repositories staying the same. Currently it's just "luck" that cran.rstudio.com is returning the new RODBC before the old RODBC:

library(dplyr)  # provides %>%, as_tibble(), and filter()

available <- rsconnect:::availablePackages(list(CRAN = "https://packagemanager.posit.co/cran/latest"))
available %>% as_tibble() %>% filter(Package == "RODBC")

available <- rsconnect:::availablePackages(list(CRAN = "https://cran.rstudio.com"))
available %>% as_tibble() %>% filter(Package == "RODBC")

@slodge-work Are you regularly switching between these two repository URLs?

You can occasionally see similar problems if you switch between two CRAN mirrors (unrelated to Posit Package Manager); mirrors are often a little behind the main CRAN repository. The sync delay is longer with Package Manager, which makes this issue easier to see.

Could you share your getOption("repos") and the repositories listed in your renv.lock (if applicable)?

@hadley - we may need isDevVersion to look across all repository entries for a named package.
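One way to read "look across all repository entries", as a rough sketch: compare the local version against the highest version found among every row for that package. The helper name `is_dev_all_entries()`, the sample matrix, and the choice to return FALSE for packages missing from the index are all assumptions made for illustration; this is not the change that landed in rsconnect.

```r
# Hypothetical index with a stale RODBC entry listed first.
available <- rbind(
  c(Package = "RODBC", Version = "1.3-16"),
  c(Package = "RODBC", Version = "1.3-21"),
  c(Package = "foreign", Version = "0.8-85")
)
rownames(available) <- available[, "Package"]

# Hypothetical helper: a package counts as a dev version only when it is
# newer than every entry the repositories list for it.
is_dev_all_entries <- function(pkg, available) {
  versions <- available[available[, "Package"] == pkg$Package, "Version"]
  if (length(versions) == 0) {
    return(FALSE)  # assumption: unknown packages are not flagged as dev
  }
  package_version(pkg$Version) > max(package_version(versions))
}

is_dev_all_entries(list(Package = "RODBC", Version = "1.3-21"), available)
is_dev_all_entries(list(Package = "RODBC", Version = "1.3-22"), available)
```

With this approach the row order in the index no longer affects the result.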

No, we are not flip-flopping:

  • We used to use cran.rstudio.com as our CRAN mirror...
  • For the last year, we have been changing that to https://packagemanager.posit.co/cran/latest
  • ...but it's amazing how persistent the old one is (it keeps turning up in people's .Rprofile files!)
  • We do sometimes see error messages in builds if someone checks in a renv.lock with a version that is not on packagemanager.posit.co yet (so I understand and agree with your point!)

Our machines are generally set up like this:

> options("repos")
$repos
                                           CRAN                                       OurCo.IMS 
"https://packagemanager.rstudio.com/all/latest"                    "https://cran.OurCo.app/ims" 

(although some do have cran-uat.OurCo.App entries for using our packages-in-test CRAN)

The top of our normal renv.lock is:

  "R": {
    "Version": "4.3.0",
    "Repositories": [
      {
        "Name": "CRAN",
        "URL": "https://packagemanager.rstudio.com/all/latest"
      },
      {
        "Name": "OurCo.IMS",
        "URL": "https://cran.OurCo.app/ims"
      }
    ]
  }

I also note that RODBC isn't the only package like this, and that some thought has been given to this before for foreign:

# the package "foreign" requires "R (>= 4.0.0)" but older versions of R

In other, more unfortunate news, I am seeing only one version returned by rsconnect:::availablePackages() (a thin wrapper around available.packages() with a custom filter).

Do you happen to set getOption("available_packages_filters")?

It feels like rsconnect should not use available_packages_filters when it wants the "duplicates" filter to be used. If we want to allow a hook for overriding those filter choices, it should be different from the default available.packages filter option... I'll push a PR for you to try.
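To illustrate the distinction, here is a small sketch using base R only. It builds a throwaway local repository index (the temporary directory and its two RODBC entries are made up for the example) and compares `available.packages()` when it defers to the `available_packages_filters` option against a call that passes `filters` explicitly. This shows the general mechanism, not the code in the PR.

```r
# Build a throwaway local repository index containing two RODBC entries.
repo_dir <- file.path(tempfile("repo"), "src", "contrib")
dir.create(repo_dir, recursive = TRUE)
write.dcf(
  data.frame(Package = c("RODBC", "RODBC"), Version = c("1.3-16", "1.3-21")),
  file = file.path(repo_dir, "PACKAGES")
)
repo_path <- normalizePath(repo_dir, winslash = "/")
if (!startsWith(repo_path, "/")) repo_path <- paste0("/", repo_path)
contrib <- paste0("file://", repo_path)

# Simulate a user profile that disables all filters.
old <- options(available_packages_filters = character())

# filters = NULL defers to the option, so no filter runs at all.
from_option <- available.packages(contriburl = contrib, filters = NULL)

# An explicit filters argument takes precedence over the option.
explicit <- available.packages(contriburl = contrib, filters = "duplicates")

options(old)  # restore the previous setting

nrow(from_option)
explicit[, "Version"]
```

Passing `filters` explicitly means a user-level option cannot switch off the "duplicates" filter behind the caller's back.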

The adjustment to the available.packages() filters originally landed in #467.

@slodge-work - could you try #1005 to see if that resolves your RODBC problem?

Do you happen to set getOption("available_packages_filters")?

Turns out this is in my .Rprofile...

# see https://github.com/rstudio/rsconnect/issues/431
options(available_packages_filters = character())

So it turns out a different version of me found a different fix and contributed a PR for a very similar issue before (but did not update my .Rprofile after the PR was completed!)
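For anyone wanting to check whether their own session carries a leftover override like this, a quick sketch using base R's `getOption()` and `options()`:

```r
# Inspect the current setting; NULL means R's default filters apply.
current <- getOption("available_packages_filters")
if (!is.null(current)) {
  message("available_packages_filters is overridden in this session")
}

# Clear the override for the current session only; the line in the
# profile file still has to be removed by hand.
options(available_packages_filters = NULL)
```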

Linked to: #431 #432

So glad that's explained! I'll merge #1005 and we'll consider this one sorted, then.