eddelbuettel / rcppsimdjson

Rcpp Bindings for the 'simdjson' Header Library

integers

knapply opened this issue

simdjson itself uses int64_t and uint64_t for integers.

"Big-ints" are something I have to deal with on a regular basis, so it would be nice for RcppSimdJson to give the user options for handling them.

It's easy to check whether a number exceeds the range of R's integer type and cast it safely, but I'd like the option to return such values as character strings or as bit64::integer64.

@eddelbuettel Is it acceptable to add bit64 to the Suggests field in DESCRIPTION? I'm not suggesting that RcppSimdJson code itself touch bit64 (except for testing), but this flexibility would be nice to have. At the R level, it looks something like this:

int64_opts <- list(double = 0, string = 1, integer64 = 2)

two_billion <- "2000000000"
three_billion <- "3000000000"

typeof(RcppSimdJson:::.parse_json(two_billion))
#> [1] "integer"
typeof(RcppSimdJson:::.parse_json(three_billion))
#> [1] "double"

RcppSimdJson:::.parse_json(three_billion, int64_T = int64_opts$double)
#> [1] 3e+09
RcppSimdJson:::.parse_json(three_billion, int64_T = int64_opts$string)
#> [1] "3000000000"

suppressPackageStartupMessages(library(bit64))
RcppSimdJson:::.parse_json(three_billion, int64_T = int64_opts$integer64)
#> integer64
#> [1] 3000000000
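
For readers who want to see the range check mentioned earlier, here is a minimal C++ sketch. It is not taken from the RcppSimdJson sources, and the function name `fits_r_integer` is hypothetical. The one subtlety is that R reserves the smallest 32-bit value for NA_integer_, so the usable range is one value narrower than a C++ int32_t.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of RcppSimdJson: returns true when a parsed
// 64-bit integer can be stored in an R integer without loss.
// R uses INT_MIN as NA_integer_, so the lower bound is exclusive.
constexpr bool fits_r_integer(std::int64_t x) noexcept {
    return x > std::numeric_limits<std::int32_t>::min() &&
           x <= std::numeric_limits<std::int32_t>::max();
}
```

With this check, 2000000000 would stay an R integer while 3000000000 would need one of the wider representations, which matches the typeof() results shown above.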

Under the hood, it's just an enum-class template argument.
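
As a rough illustration of what an enum-class template argument can look like, here is a self-contained C++17 sketch. The names `Int64Policy` and `convert_int64` are assumptions made for this example and are not the identifiers used in utils.hpp. The enumerator values simply mirror the int64_opts list from the R snippet, and plain C++ types stand in for the R vectors a real binding would return.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical policy enum; values mirror the R-level int64_opts list.
enum class Int64Policy { Double = 0, String = 1, Integer64 = 2 };

// Convert a parsed 64-bit integer according to a compile-time policy.
// Only the branch selected by P is instantiated, so each instantiation
// has a single, well-defined return type.
template <Int64Policy P>
auto convert_int64(std::int64_t x) {
    if constexpr (P == Int64Policy::Double) {
        // Always representable in range, but loses precision beyond 2^53.
        return static_cast<double>(x);
    } else if constexpr (P == Int64Policy::String) {
        // Exact, at the cost of no longer being numeric.
        return std::to_string(x);
    } else {
        // Exact; an R binding would place these 64 bits in the payload of
        // a double vector carrying the "integer64" class attribute.
        return x;
    }
}
```

Because the policy is a template parameter, the choice costs nothing at run time inside the conversion loop; a thin wrapper would still be needed to map the integer passed from R onto one of the three instantiations.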

The current implementation can be found in @cran-dev/inst/include/RcppSimdJson/utils.hpp and an example R interface can be found in @cran-dev/src/rcppsimdjson_utils_check.cpp.

Without having looked into the details, and as much as I hate adding dependencies: yes!

I do the same in the nanotime package, which relies on bit64 and its integer64 class. We use that at work too. R's limited set of types bites us with large ints, and this is the best we have. And I have a good working relationship with Jens (upstream), so thumbs up from me.