RoadMap
tqchen opened this issue · comments
Follow-up on #574. This will serve as the roadmap issue for 2016.
This issue will be the central place for links to ongoing proposals and plans for improving xgboost. Please reply to this issue for discussion. The major goals and specifications will be marked with the Roadmap label.
Distributed Version
- Distributed Python version of xgboost, enabling features such as custom objective functions
- XGBoost JVM Package.
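For readers unfamiliar with the custom objective feature mentioned above, here is a minimal sketch of what such a function looks like. It assumes the convention used by the single-machine Python package, where the objective is a callable taking `(preds, dtrain)` and returning the per-row gradient and hessian of the loss. `FakeDMatrix` is a hypothetical stand-in so the snippet runs without xgboost installed; it is not part of the library.

```python
import math


def logistic_obj(preds, dtrain):
    """Gradient and hessian of the logistic loss for each row.

    preds  : raw margin scores (before the sigmoid)
    dtrain : any object exposing get_label(), as a DMatrix does
    """
    labels = dtrain.get_label()
    grad, hess = [], []
    for margin, y in zip(preds, labels):
        p = 1.0 / (1.0 + math.exp(-margin))  # predicted probability
        grad.append(p - y)                   # first derivative w.r.t. margin
        hess.append(p * (1.0 - p))           # second derivative w.r.t. margin
    return grad, hess


class FakeDMatrix:
    """Hypothetical stand-in for a DMatrix, used only for this illustration."""

    def __init__(self, labels):
        self._labels = labels

    def get_label(self):
        return self._labels


grad, hess = logistic_obj([0.0, 0.0], FakeDMatrix([1.0, 0.0]))
print(grad, hess)
```

With xgboost installed, a function of this shape (returning NumPy arrays rather than lists) can be passed as the `obj` argument of `xgb.train`. How this would be exposed in the distributed version is exactly what the roadmap item leaves open.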
Data Frame Integration
External Memory
- Being able to use the external memory version from the language packages (R/Python/Julia), starting from native data structures
Distributed Python is included in PR #897.
I have not tried this yet, but iterated bagging seems to me very similar to gradient boosting, and the paper shows better performance:
http://www.cs.utexas.edu/~ml/papers/bv-ecml-05.pdf
But I understand this is actually a slightly different algorithm, something between boosting and random forest.
I have no insight into the deeper code of xgboost (it is beyond my skill), so I am not sure how difficult this would be.
Another view: bagging can be parallelized at a higher level, i.e. by building several trees at the same time, so I guess this could be even faster.
But sure, it could be a lot of work for maybe a small difference?
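To make the parallelization point concrete, here is a rough sketch of plain bagging (not the iterated bagging from the linked paper, and not anything xgboost implements). Each base learner is fit on its own bootstrap sample, so the fits are independent and can be submitted to a worker pool at once. The one-split regression stump and all function names are assumptions chosen to keep the example short.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def fit_stump(xs, ys):
    """Fit a one-split regression stump on 1-D data by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, threshold, lm, rm)
    if best is None:  # all x values identical: fall back to a constant
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def fit_on_bootstrap(job):
    """Draw a bootstrap sample with a fixed seed and fit one stump on it."""
    xs, ys, seed = job
    rng = random.Random(seed)
    idx = [rng.randrange(len(xs)) for _ in xs]
    return fit_stump([xs[i] for i in idx], [ys[i] for i in idx])


def bagged_fit(xs, ys, n_trees=8, workers=4):
    """Fit n_trees stumps; no fit depends on another, so all run concurrently."""
    jobs = [(xs, ys, seed) for seed in range(n_trees)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fit_on_bootstrap, jobs))


def bagged_predict(stumps, x):
    """Average the predictions of all stumps."""
    preds = [lm if x <= t else rm for t, lm, rm in stumps]
    return sum(preds) / len(preds)


xs = list(range(10))
ys = [0.0 if x < 5 else 10.0 for x in xs]
stumps = bagged_fit(xs, ys)
print(bagged_predict(stumps, 0), bagged_predict(stumps, 9))
```

Threads only illustrate the structure here; for CPU-bound pure Python code a process pool would be needed to get an actual speedup. In boosting, by contrast, each tree is fit to the errors of the trees before it, so the rounds themselves are sequential and the parallelism has to come from inside the construction of each tree.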
Proposal 1
I believe that this was the roadmap for last year. Is there any update on it?