eddelbuettel / rcppsimdjson

Rcpp Bindings for the 'simdjson' Header Library

Wishlist (draft)

knapply opened this issue

@eddelbuettel, enough things have accumulated (and, frankly, lessons learned on my end) that I'm rethinking the design from the ground up as time permits.

Looking toward the future, I'd like to consolidate the outstanding PR (#70) and previous issues that have potential solutions (e.g., #71) into a design/capability wishlist that helps future-proof the package. A list of everything that needs to be considered will assist the redesign.

These are things I'm tracking now.

  • Sync w/ upstream simdjson
    • On-demand parser
  • NDJSON/JSONL support
  • con()nections
  • drop-in jsonlite::fromJSON() replacement
    • nested data frame columns (#71)
  • drop-in jsonify::from_json() replacement?

@NicolasJiaxin is working on an On Demand prototype at lemire#1. The purpose is to prove that it can be done.

By the end of the summer, we should have simdjson 1.0, though it would not affect #70 much since the DOM API did not change between 0.9 and 1.0 (it is quite stable at this point). However, it can make On Demand more appealing.

All good, actually. I am not too concerned about the state of things. The combination of two orthogonal sets of wickedness, in the simdjson library and in the clever (and quickly written) package by @knapply, means that we have something rather useful and performant. There will always be users asking for a shot of cream and two sugars to go along with the strong and freshly brewed coffee we serve over here, but we cannot always be all things to all people all the time -- and for free.

A later redesign update during or after the 1.0 release sounds good to me too.

If memory serves, the obstacle for On Demand was the inability to obtain the size of arrays, but it looks like that was solved by array::count_elements() while I've been distracted elsewhere...
https://github.com/simdjson/simdjson/blob/b79261eebcd7b9a784f1e2d17de904841713f80c/include/simdjson/generic/ondemand/array-inl.h#L92-L102

Awesome!

@knapply Indeed. There might be other obstacles, but @NicolasJiaxin should stumble on them. If he manages to create the prototype, then we know it is probably all good.