dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

Home Page: https://xgboost.readthedocs.io/en/stable/


RoadMap

tqchen opened this issue

Follow-up on #574. This will serve as the roadmap issue for 2016.

This issue will be the centralized place for links to ongoing proposals and plans for improving xgboost. Please reply to this issue for discussion. The major goals and specifications will be marked with the Roadmap label.

Distributed Version

  • Distributed Python version of xgboost, enabling features such as custom objective functions (see the sketch below)
  • XGBoost JVM package
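
For reference, a minimal sketch of what a custom objective looks like in the single-machine Python API (the distributed version would need to accept the same kind of callback); the data here is synthetic:

```python
import numpy as np
import xgboost as xgb

def squared_error_obj(preds, dtrain):
    """Custom objective: return per-row gradient and hessian of the loss."""
    labels = dtrain.get_label()
    grad = preds - labels       # derivative of 0.5 * (pred - label)^2
    hess = np.ones_like(preds)  # second derivative is constant
    return grad, hess

# Synthetic data just to make the sketch runnable.
X = np.random.rand(100, 5)
y = np.random.rand(100)
dtrain = xgb.DMatrix(X, label=y)

booster = xgb.train({"max_depth": 3}, dtrain, num_boost_round=10,
                    obj=squared_error_obj)
```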

Data Frame Integration
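
No sub-items were listed here; for context, a sketch of what data-frame integration looks like in the Python package, assuming pandas (a DMatrix built directly from a DataFrame, with column names carried over as feature names):

```python
import pandas as pd
import xgboost as xgb

df = pd.DataFrame({"f0": [1.0, 2.0, 3.0], "f1": [0.1, 0.2, 0.3]})
labels = pd.Series([0, 1, 0])

# A DMatrix can be built straight from a DataFrame; the column names
# become the feature names of the trained model.
dtrain = xgb.DMatrix(df, label=labels)
print(dtrain.feature_names)  # ['f0', 'f1']
```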

External Memory

  • Enable the external-memory version in the language packages (R/Python/Julia) from native data structures (see the sketch below)
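
For illustration, the external-memory mode as exposed in the Python package: appending a cache-file prefix after `#` to a libsvm file path tells xgboost to stream the data from disk rather than load it all into RAM (`train.libsvm` here is a placeholder path):

```python
import xgboost as xgb

# 'train.libsvm' is a hypothetical file; the '#dtrain.cache' suffix asks
# xgboost to page the data through on-disk cache files with that prefix
# instead of holding the whole dataset in memory.
dtrain = xgb.DMatrix("train.libsvm#dtrain.cache")
booster = xgb.train({"max_depth": 3}, dtrain, num_boost_round=10)
```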

The distributed Python version is included in PR #897.

I have not tried this yet, but iterated bagging seems very similar to gradient boosting, and the paper reports better performance:

http://www.cs.utexas.edu/~ml/papers/bv-ecml-05.pdf

That said, I understand it is actually a slightly different algorithm, something between boosting and random forests.
I have no insight into the internals of xgboost (beyond my skill), so I am not sure how difficult this would be.
Another consideration: bagging can be parallelized at a higher level by building several trees at the same time, so I would guess it could be even faster.

Then again, it may be a lot of work for perhaps a small improvement.
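
For what it's worth, xgboost does expose a middle ground between boosting and random forests through its `num_parallel_tree` parameter, which grows several bagged trees per boosting round; a rough sketch (synthetic data, parameter values chosen arbitrarily, parameter names from the current Python package rather than from this proposal):

```python
import numpy as np
import xgboost as xgb

X = np.random.rand(200, 5)
y = np.random.rand(200)
dtrain = xgb.DMatrix(X, label=y)

params = {
    "num_parallel_tree": 100,  # bagged trees grown per boosting round
    "subsample": 0.8,          # row subsampling per tree
    "colsample_bynode": 0.8,   # column subsampling per split
    "learning_rate": 1.0,      # no shrinkage: one round == a random forest
}
# A single round here behaves like a random forest; several rounds give
# a "boosted forest", i.e. something between the two algorithms.
booster = xgb.train(params, dtrain, num_boost_round=1)
```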

Proposal 1

I believe this was the roadmap for last year; is there any update on it?