Update on_cran function?
mpadge opened this issue · comments
Hi @briandconnelly. I need a generic function to test whether code is on_cran(), but unfortunately not the one you've coded here which only applies in devtools/testthat/r-lib environments. I was wondering whether you'd be open to updating/improving your on_cran() function here to instead check envvars typically set on CRAN machines? That kind of function already exists in the fda::CRAN() function, but your ami package seems like a more appropriate home for a general-purpose function.
The idea would be to compare envvars in an environment with those set during R CMD check, or possibly other environments like Kurt Hornik's checking environment. Direct comparison of full variable names would be impractical, as they may change regularly and so require too much maintenance. But something along the lines of fda::CRAN() might be feasible, where the generic _R_CHECK_ pattern is matched, and CRAN is set to TRUE if there are >= some minimal number of matches.
There is also the issue that R CMD check is not the only environment in which CRAN runs code (for example, reverse-dependency checks), and so those check variables wouldn't cover everything. But they'd be a start to a more general function than your current implementation here. What do you think?
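The envvar-counting idea described above could be sketched roughly as follows. This is only an illustration: the function name looks_like_cran, its arguments, and the default threshold are hypothetical and are not taken from ami or fda.

```r
# Hypothetical helper (not the ami or fda implementation): count the
# environment variable names matching a pattern and return TRUE once at
# least `min_matches` of them are present.
looks_like_cran <- function(pattern = "_R_CHECK_", min_matches = 5L,
                            env_names = names(Sys.getenv())) {
  n_matches <- sum(grepl(pattern, env_names))
  n_matches >= min_matches
}

# Passing names explicitly keeps the result independent of the session
looks_like_cran(env_names = c("PATH", "HOME"))               # FALSE
looks_like_cran(env_names = paste0("_R_CHECK_", 1:5, "_"))   # TRUE
```

Matching on a generic pattern rather than on full variable names means the check does not need updating whenever individual variables are added or renamed.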
@mpadge While I like consistency with the devtools/testthat/r-lib environments, I think what you're proposing here is the better way to go.
It seems like you have a lot more familiarity with this than I do, so I'd absolutely welcome a PR if you've got the time.
Sure, luckily I need to do it for work-related stuff anyway, so should be able to find time next week (from 13th May) to give it a go. Thanks for your quick response!
A bit more background research on PR#15, some of which should be summarized in documentation of proposed function changes.
Environment Variables in R source code
The following lists exclude:
- Environmental variable manipulation in test suites of R itself.
- Calls to Sys.setenv() within on.exit().
Beyond those, the current R source code sets the following environment variables:
- Generic usage and modification of R_LIBS_SITE, R_LIBS_USER, DISPLAY, PATH, LANGUAGE (for example here.)
- _R_RD_MACROS_PACKAGE_DIR_, set during package build process, and also during check here
- R_BUILD_TEMPLIB during build (along with other generic R_LIBS, R_PACKAGE_NAME/DIR, R_LIBRARY_DIR instances).
- During installation, several instances of R_OS_TYPE, R_HOME, R_ARCH, R_PACKAGE_NAME, R_INSTALL_PKG, R_LIBRARY_DIR, R_LIBS, CLINK_CPPFLAGS, R_ENABLE_JIT, and the following sub-scripted variables: _R_INSTALL_NO_DONE_, _R_INSTALL_SUPPRESS_NO_STAGED_MESSAGE.
- During texi2dvi production, TEXINPUTS, BSTINPUTS, and TEXINDY.
- _R_NS_LOAD_ (and here), in namespace attachment, both of which are then on.exit(Sys.unsetenv()), so present only during namespace loading.
- TZDIR on MacOS only in datetime.R.
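For readers unfamiliar with the on.exit(Sys.unsetenv()) pattern mentioned in the list above, here is a minimal sketch of why such variables are only visible while the enclosing function runs. The helper with_temp_envvar and the variable AMI_DEMO_VAR are made up for illustration and do not appear in the R source.

```r
# Hypothetical helper: set an environment variable only for the duration
# of `code`. Assumes the variable was not set beforehand, because it is
# unset (not restored) when the function exits.
with_temp_envvar <- function(name, value, code) {
  do.call(Sys.setenv, stats::setNames(list(value), name))
  on.exit(Sys.unsetenv(name), add = TRUE)
  force(code)  # lazily evaluated here, after the variable has been set
}

# Visible inside the call ...
with_temp_envvar("AMI_DEMO_VAR", "1", Sys.getenv("AMI_DEMO_VAR"))
# ... but gone again afterwards
Sys.getenv("AMI_DEMO_VAR")
```

Variables handled this way cannot serve as markers for a CRAN environment, since they have already been removed by the time any other code inspects the environment.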
Other than those, the main set of variables is set during R CMD check, which currently includes the following:
_R_CHECK_BASHISMS_
_R_CHECK_BROWSER_NONINTERACTIVE_
_R_CHECK_CODE_USAGE_VIA_NAMESPACES_
_R_CHECK_CODE_USAGE_WITH_ONLY_BASE_ATTACHED_
_R_CHECK_CODOC_VARIABLES_IN_USAGES_
_R_CHECK_COMPILATION_FLAGS_
_R_CHECK_CONNECTIONS_LEFT_OPEN_
_R_CHECK_DATALIST_
_R_CHECK_DEPENDS_ONLY_DATA_
__R_CHECK_DOC_FILES_NOTE_IF_ALL_SPECIAL__
_R_CHECK_DOT_FIRSTLIB_
_R_CHECK_EXCESSIVE_IMPORTS_
_R_CHECK_FF_AS_CRAN_
_R_CHECK_FUTURE_FILE_TIMESTAMPS_
_R_CHECK_INSTALL_DEPENDS_
_R_CHECK_LIMIT_CORES_
_R_CHECK_MATRIX_DATA_
_R_CHECK_MBCS_CONVERSION_FAILURE_
_R_CHECK_NATIVE_ROUTINE_REGISTRATION_
_R_CHECK_NEWS_IN_PLAIN_TEXT_
_R_CHECK_NO_RECOMMENDED_
_R_CHECK_NO_STOP_ON_TEST_ERROR_
_R_CHECK_ORPHANED_
_R_CHECK_PACKAGE_DATASETS_SUPPRESS_NOTES_
_R_CHECK_PACKAGES_USED_CRAN_INCOMING_NOTES_
_R_CHECK_PACKAGES_USED_IGNORE_UNUSED_IMPORTS_
_R_CHECK_PACKAGES_USED_IN_TESTS_USE_SUBDIRS_
_R_CHECK_PRAGMAS_
_R_CHECK_RD_CONTENTS_KEYWORDS_
_R_CHECK_R_DEPENDS_
_R_CHECK_RD_NOTE_LOST_BRACES_
_R_CHECK_R_ON_PATH_
_R_CHECK_SCREEN_DEVICE_
_R_CHECK_SHLIB_OPENMP_FLAGS_
_R_CHECK_TIMINGS_
_R_CHECK_VALIDATE_UTF8_
_R_CHECK_XREFS_MIND_SUSPECT_ANCHORS_
_R_CHECK_XREFS_PKGS_ARE_DECLARED_
_R_CXX_USE_NO_REMAP_
R_DEFAULT_PACKAGES
R_ENABLE_JIT
_R_NO_S_TYPEDEFS_
R_RD4PDF
TMPDI
This includes 40 _R_ variables, 38 of which are _R_CHECK_ (including one __R_CHECK_), three R_ variables, plus one other. Almost all of the _R_-type variables are set within if (as_cran) {...}, suggesting that the function only reliably identifies function calls made within R CMD check --as-cran, and nothing else. This notably excludes reverse dependency checks, which use only R CMD check --timings and not --as-cran. The --timings flag currently doesn't set any envvars, and uses (that is, reads via Sys.getenv()) only two: _R_CHECK_EXAMPLE_TIMING_THRESHOLD_ and _R_CHECK_EXAMPLE_TIMING_CPU_TO_ELAPSED_THRESHOLD_. Neither of these is explicitly set anywhere in the R source itself, so both must be presumed to be set elsewhere, in scripts used on the CRAN machines themselves. Which brings us to ...
Variables set on CRAN machines
The definitive public source for environmental variables set by CRAN seems to be the suite of tests in https://github.com/r-devel/r-dev-web/tree/main/CRAN/QA (I guess "Quality Assurance"?). This is a suite of tests by the 5 main CRAN members, of which 4 contain code for actual test and check suites, as detailed in the following sub-sections.
BDR
This contains sub-directories for many operating systems, with the following summarising Linux-auk only, presuming other systems to be similar:
- 14 _R_CHECK_ variables in tests
- 16 _R_CHECK_... variables in tests-devel
- For incoming checks, numbers of _R_ vars, all of which are _R_CHECK_, are:
  - 20 in check, checkPat, checkPre, check-donttest
  - 17 in checkClang, checkNoFB, checkVG
  - 14 in checkDeps
  - 16 in check-fake, checkLD, check-lo
Kurt
- 71 in .R/check.Renviron, along with more set during individual checks.
Simon
- Only 2 _R_CHECK_ variables set in the "nightly build" checks
Uwe
- 45 set in incoming checks
- 28 set in presumably general set_ENV.bat
- 22 set in compilation checks
- Only 1 set in making CRANbinaries.R
Conclusions
Using environment variables prefixed with _R_ seems to be a reliable way to identify code being executed on CRAN machines, or on R CMD check --as-cran. The minimal number of variables seems to be 14 in some of the BDR tests (although that number may be higher in reality through additional variables being set elsewhere). The only environments within the CRAN/QA sub-directory which contain fewer than that number seem to be the nightly builds, which use only two, and the binary compilation, which uses only one.
It would thus be safe to set n_CRAN_envvars to a default of 10, although this is unlikely to make any difference from the currently proposed value of 5. The CRAN_pattern could also be made more explicit by extending it from the current _R_ to _R_CHECK_, but again this would not make any difference in practice, as the two are effectively identical.
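The claim that a threshold of 10 behaves the same as 5 can be checked against the counts reported in the sub-sections above. The following is just a quick sketch of that comparison: the labels in envvar_counts and the helper detected_at are made up for illustration, while the numbers are those listed for each CRAN/QA environment.

```r
# Variable counts as reported above for each CRAN/QA environment
# (labels are illustrative; BDR incoming checks are given as min and max)
envvar_counts <- c(
  BDR_tests = 14, BDR_tests_devel = 16,
  BDR_incoming_min = 14, BDR_incoming_max = 20,
  Kurt_check_Renviron = 71, Simon_nightly = 2,
  Uwe_incoming = 45, Uwe_set_ENV = 28,
  Uwe_compilation = 22, Uwe_CRANbinaries = 1
)

# Hypothetical helper: which environments reach a given threshold?
detected_at <- function(threshold) {
  names(envvar_counts)[envvar_counts >= threshold]
}

identical(detected_at(5), detected_at(10))       # TRUE
setdiff(names(envvar_counts), detected_at(10))   # "Simon_nightly" "Uwe_CRANbinaries"
```

Either threshold misses only the nightly builds and the binary compilation, which matches the two exceptions noted in the conclusions.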
closed by #15