dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

Home Page: https://xgboost.readthedocs.io/en/stable/

[R] CRAN issues with `noRemap`

david-cortes opened this issue

In the CRAN checks for the last released version of XGBoost:
https://www.stats.ox.ac.uk/pub/bdr/noRemap/xgboost.out

There is now a check noRemap with this description:

Checks with -DR_NO_REMAP used for C++ code

Tests using R_CXX_NO_REMAP= true, which compiles C++ code with
R_NO_REMAP defined. Otherwise as for the fedora-gcc checks:
see https://www.stats.ox.ac.uk/pub/bdr/Rconfig/r-devel-linux-x86_64-fedora-gcc

It is planned that this will become the default in due course, first
for CRAN incoming checks from 2024-04-14.

This causes compilation errors in XGBoost at every call to an R C-level function that is not prefixed with Rf_, such as this:

xgboost_R.cc:684:28: error: 'allocVector' was not declared in this scope; did you mean 'Rf_allocVector'?
  684 |   out_shape_sexp = PROTECT(allocVector(INTSXP, out_dim));
      |                            ^~~~~~~~~~~

I am guessing that these shorthand function names without the Rf_ prefix could be re-enabled by manually adding some define. Perhaps @jameslamb could comment here, since LightGBM doesn't appear to suffer from these issues in the checks.

Otherwise, it might be a good idea to change all usages of R C-level functions to their equivalents that have Rf_ prefix (e.g. mkChar -> Rf_mkChar).

Thanks for the @.

In LightGBM, we have for a long time used this pattern:

#define R_NO_REMAP
#define R_USE_C99_IN_CXX
#include <Rinternals.h>

ref: https://github.com/microsoft/LightGBM/blob/5cd95a5b161d7630731d50e9ac529c6bf3dc809f/R-package/src/lightgbm_R.h#L10-L12

Combined with, as @david-cortes mentioned above, using the Rf_ prefix for things included from the R headers (e.g. Rf_allocVector()).

I suspect that's why {lightgbm} is not seeing this issue on CRAN. It should be safe for {xgboost} to do the same.

I think the best solution is just to do that (define R_NO_REMAP + use the Rf_ prefix). I'd be happy to put up a PR with that change.
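To illustrate why the prefixed names are the safe choice, here is a minimal sketch of the remapping mechanism. It does not include the real R headers: `MockVector`, the stand-in `Rf_allocVector`, and the helper `MakeShape` are hypothetical names made up for this example so that it compiles without R installed. The only assumption about R itself is that the short names are preprocessor aliases for the `Rf_` names, which are skipped when `R_NO_REMAP` is defined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins for the real R API, for illustration only. In an actual package
// these come from <Rinternals.h>; MockVector and this Rf_allocVector are
// hypothetical and only mimic the shape of the real call.
struct MockVector {
  int type;
  std::vector<int> data;
};

constexpr int INTSXP = 13;  // the type code R uses for integer vectors

inline MockVector Rf_allocVector(int type, std::size_t n) {
  return MockVector{type, std::vector<int>(n, 0)};
}

// Mimics what the headers do when R_NO_REMAP is NOT defined: the short name
// is just a macro alias for the prefixed one. With -DR_NO_REMAP the alias
// disappears, so any bare allocVector(...) call stops compiling.
#ifndef R_NO_REMAP
#define allocVector Rf_allocVector
#endif

// Written with the Rf_ prefix, so it compiles with or without R_NO_REMAP.
inline MockVector MakeShape(std::size_t out_dim) {
  MockVector out_shape = Rf_allocVector(INTSXP, out_dim);
  assert(out_shape.data.size() == out_dim);
  return out_shape;
}
```

Compiling this sketch with `-DR_NO_REMAP` changes nothing for `MakeShape`, whereas a function that called the bare `allocVector(INTSXP, out_dim)` would fail with the same kind of "was not declared in this scope" error shown above.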

Thank you for opening an issue. If we are to fix the CRAN errors, the fix needs to target the 1.7 branch. I will look into it and keep XGBoost on CRAN while work on the new interface is still ongoing.