XGBoost 0.90 Roadmap

Created on 21 Apr 2019  ·  56 Comments  ·  Source: dmlc/xgboost

This thread is to keep track of all the good things that will be included in 0.90 release. It will be updated as the planned release date (~May 1, 2019~ as soon as Spark 2.4.3 is out) approaches.

  • [x] XGBoost will no longer support Python 2.7, since it is reaching its end-of-life soon. This decision was reached in #4379.
  • [x] XGBoost4J-Spark will now require Spark 2.4+, as Spark 2.3 is reaching its end-of-life in a few months (#4377, #4409)
  • [x] XGBoost4J now supports up to JDK 12 (#4351)
  • [x] Additional optimizations for gpu_hist (#4248, #4283)
  • [x] XGBoost as CMake target; C API example (#4323, #4333)
  • [x] GPU multi-class metrics (#4368)
  • [x] Scikit-learn-like random forest API (#4148)
  • [x] Bugfix: Fix GPU histogram allocation (#4347)
  • [x] [BLOCKING][jvm-packages] Fix non-deterministic order within a partition (in the case of an upstream shuffle) on prediction (#4388)
  • [x] Roadmap: additional optimizations for hist on multi-core Intel CPUs (#4310)
  • [x] Roadmap: hardened Rabit; see RFC #4250
  • [x] Robust handling of missing values in XGBoost4J-Spark (#4349)
  • [x] External memory with GPU predictor (#4284, #4438)
  • [x] Use feature interaction constraints to narrow split search space (#4341)
  • [x] Re-vamp Continuous Integration pipeline; see RFC #4234
  • [x] Bugfix: AUC, AUCPR metrics should handle weights correctly for learning-to-rank task (#4216)
  • [x] Ignore comments in LIBSVM files (#4430)
  • [x] Bugfix: Fix AUCPR metric for ranking (#4436)
roadmap

All 56 comments

As we are going to have breaking changes like https://github.com/dmlc/xgboost/pull/4349 and https://github.com/dmlc/xgboost/pull/4377, shall we bump the version to 0.9?

@CodingCat Sure, we can bump to 0.90, if the breaking change is significant. Can you do me a favor and write one-paragraph description of why #4349 was needed?

sure,

* Spark 2.3 is reaching its end-of-life in a few months

Is there an official statement on that? They released 2.2.3 in January and 2.3.3 in February. Our vendor (MapR) still ships 2.3.1.

@alexvorobiev https://github.com/dmlc/xgboost/issues/4350, you can check with @srowen from databricks

This is not a question for Databricks but for the Spark project. The default policy is maintenance releases for branches for 18 months: https://spark.apache.org/versioning-policy.html That would put 2.3.x at EOL in about July, so wouldn't expect more 2.3.x releases after that from the OSS project.

@srowen Thanks!

@srowen @CodingCat @alexvorobiev Let's also discuss the possibility of supporting Scala 2.12 / 2.13. Right now, XGBoost4J is compiled for Scala 2.11:
https://github.com/dmlc/xgboost/blob/2c61f02add72cce8f6dc1ba87e016e3c5f0b7ea6/jvm-packages/pom.xml#L38-L39

A user reported that XGBoost4J JARs compiled for Scala 2.11 are not binary compatible with Scala 2.12.

Yeah, 2.11 / 2.12 are still binary-incompatible, and Spark has two distributions. Both are supported in 2.4.x though 2.12 is the default from here on in 2.4.x. 3.0 will drop Scala 2.11 support.

It may just be a matter of compiling two versions rather than much or any code change. If you run into any funny errors in 2.12 let me know because I stared at lots of these issues when updating Spark.

2.13 is still not GA, and I think it will be a smaller change from 2.12->2.13 than 2.11->2.12 (the big difference here is a totally different representation of lambdas).

@hcho3 I assume you wanted to tag @alexvorobiev?

@alexeygrigorev Oops, sorry about that.

The only issue is that we need to introduce a breaking change to the artifact name of xgboost in Maven, xgboost4j-spark => xgboost4j-spark_2.11 / xgboost4j-spark_2.12, like Spark does (https://mvnrepository.com/artifact/org.apache.spark/spark-core). We also need to double-check whether we have any transitive dependency on Scala 2.11 (I think not).
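To make the naming concrete, here is a sketch of what Spark-style suffixed Maven coordinates could look like. The coordinates below are illustrative only, not taken from the actual pom.xml:

```xml
<!-- Hypothetical coordinates if XGBoost4J-Spark adopted Spark-style
     Scala version suffixes; one artifact per Scala binary version. -->
<dependency>
  <groupId>ml.dmlc</groupId>
  <artifactId>xgboost4j-spark_2.12</artifactId>
  <version>0.90</version>
</dependency>
```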

Hi @srowen, though 2.12 is the default from here on in 2.4.x, I checked the branch-2.4 pom.xml; if you don't specify the scala-2.12 profile, you still get a 2.11 build, no?

You could choose to only support 2.12 in 0.9x, and then you don't have to suffix the artifact name. If you support both, yeah, you'd really want to change the artifact name unfortunately and have _2.11 and _2.12 versions.

Yes the default Spark 2.4.x build will be for 2.11; -Pscala-2.12 gets the 2.12 build.

thanks, I'd stay conservative in supporting 2.12 at least for the coming version

as far as I know, most Spark users are still on 2.11, since they tend to follow previous versions of Spark

I may not have the bandwidth to go through every test I have for introducing 2.12 support

I would choose to support 2.12 + 2.11, or 2.12 only, in the 1.0 release...

@hcho3 FYI, I just removed the dense matrix support from the roadmap given the limited bandwidth

@hcho3 Could you take a look at https://github.com/dmlc/dmlc-core/pull/514 when time allows? It might be worth merging before the next release hits.

@trivialfis Will look at it

@CodingCat I think we should push back the release date, as Spark 2.4.1 and 2.4.2 have issues. What do you think?

@srowen Do you know when Spark 2.4.3 would be out?

I think it’s fine to have some slight delay

Okay, let’s wait until Spark 2.4.3 is out

Would there be the last 0.83 release for Spark 2.3.x?

@CodingCat What if we make two parallel releases 0.83 and 0.90, where 0.83 includes all commits just before #4377? The 0.83 version would be only released as JVM packages, and Python and R packages would get 0.90. It won't be any more work for me, since I have to write a release note for 0.90 anyway.

One issue, though, is the user experience with missing value handling. Maybe forcing everyone to use Spark 2.4.x will prevent them from messing up missing values (the issue that motivated #4349).

@hcho3 I am a bit concerned about the inconsistent availability of packages across versions.

I can imagine questions like: "Hey, I found 0.83 in Maven, so I upgraded our Spark package, but why can't I use 0.83 in a notebook when exploring my new model setup on a small amount of data with the Python package?"

I would suggest we either do a full maintenance release on the 0.8x branch or nothing

@CodingCat Got it. We'll do consistent releases for all packages. What's your take on 0.83 release then? Should we do it?

@CodingCat Actually, this will create work for other maintainers; we'll need to ask them first

Short answer, from a personal view: yes in theory, but it might be more than cutting right before a commit (as you said, it will create work for others as well). I am also hesitant to do this because of the limited resources in the community...

Here are my 2 cents on how we should think about maintenance releases like 0.8x:

  1. the reason to have a maintenance release is to bring in critical bug fixes, like https://github.com/dmlc/xgboost/commit/2d875ec0197d5a83e7d585daf472b8201aa97c51 and https://github.com/dmlc/xgboost/commit/995698b0cb1da75f066d7e0531302a3bfa5a49a4

  2. on the other hand, to keep the community sustainable without burning out all the committers, we should drop support for previous versions periodically

  3. innovations and improvements should be brought to users through feature releases (e.g. the jump from 0.8 to 0.9)

if we decide to go with 0.83, we need to collect opinions from @RAMitchell and @trivialfis as well, and use their judgment to see whether there are important bug fixes (mostly about correctness) that they have noticed

and then create a 0.83 branch based on 0.82 and cherry-pick commits... a lot of work, actually

If I understand correctly, 0.9 will not support older versions of Spark; hence the proposal to ship a 0.83 version alongside 0.9, to continue supporting older Spark versions while including bug fixes?

Generally I am against anything that uses developer time. Aren't we busy enough already? I do see some value in having a stable version however.

@CodingCat Is there any way to incorporate bug fixes (2d875ec and 995698b) without upgrading to Spark 2.4.x?

If making maintenance releases is more than just cutting branches (e.g. need to cherry-pick), I would rather not make such commitment.

Generally I am against anything that uses developer time. Aren't we busy enough already?

I agree.

@CodingCat Is there any way to incorporate bug fixes (2d875ec and 995698b) without upgrading to Spark 2.4.x?

@hcho3 unfortunately no; due to breaking changes in the libraries Spark depends on, we can only compile and run XGBoost against a consistent version of Spark

If we are interested in maintenance releases in the future, the workflow (after releasing 0.9) would be:

  1. backport necessary fix to 0.9-branch

  2. release 0.9x every, say, 2 months, or when triggered by an important bug fix

  3. major features and all fixes backported to 0.9x should also be available in master

  4. when releasing 1.0, cut a branch from master......

but again, once we have a big refactor in master and want to backport fixes to 0.9 after that... tons of work

@CodingCat Given the current size of dev community, let's punt on maintenance releases.

@tovbinm Sorry, I don't think we'll be able to do 0.83 release, due to lack of bandwidth. Is upgrading to Spark 2.4.3 feasible to you?

That’s unfortunate. No, not in the short term. We are still on 2.3.x.

What’s the commit that upgraded Spark from 2.3 to 2.4? Perhaps we can cut there (if it’s after 0.82, of course).

@tovbinm You can build XGBoost with commit 711397d6452d596d7acbb68f1052ffebdee3e3af to use Spark 2.3.x.

Great. So why not make a public release from that commit?

As @CodingCat said, maintenance releases are not simply a matter of cutting before a commit. Also, making a public release is an implicit promise of support. I do not think the maintainers are up for supporting two new releases at this point in time.

I'll defer to @CodingCat as to whether we should make a release from 711397d6452d596d7acbb68f1052ffebdee3e3af

External memory with GPU predictor - would this mean the code would no longer crash with `what(): std::bad_alloc: out of memory`? (i.e. temporarily swap into RAM?)

Related issue, I guess: https://github.com/dmlc/xgboost/issues/4184. That one was mainly about temporary bursts of memory; the fitting process itself never requires that much memory.

@hlbkin You'll need to explicitly enable external memory, according to https://xgboost.readthedocs.io/en/latest/tutorials/external_memory.html
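For reference, a minimal sketch of the opt-in syntax described in that tutorial. The file name and cache prefix here are made up, and the actual xgboost call is left commented out so the sketch stands alone:

```python
# Sketch of opting in to external memory: appending '#<cache prefix>' to a
# LIBSVM path asks XGBoost to stream the data through an on-disk cache
# instead of loading everything into RAM.
import os
import tempfile

libsvm_path = os.path.join(tempfile.mkdtemp(), "train.libsvm")
with open(libsvm_path, "w") as f:
    f.write("1 0:1.0 2:0.5\n0 1:0.3\n")  # a tiny two-row LIBSVM file

# The '#' suffix is the explicit opt-in switch for external memory.
cache_uri = libsvm_path + "#dtrain.cache"

# import xgboost as xgb
# dtrain = xgb.DMatrix(cache_uri)  # writes dtrain.cache* files on disk
```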

I assume it's not possible to switch without a major version bump (i.e. 1.0), but when you do, could you consider supporting conformant PEP 440 version numbers (i.e. x.y.z), and preferably semantic versioning? The standard interpretation of 0.90 (rather than 0.9.0) is that it is the 90th minor release of the major version 0.x (i.e. pre-stable-release) series, and is no more significant than 0.83. Furthermore, this restricts you to a maximum of 9 point releases per minor version, and creates difficulties for some tools (and people) to interpret. Thanks!
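The sorting concern above can be illustrated with a plain numeric-tuple comparison (a minimal sketch; real packaging tools use full PEP 440 parsing):

```python
# Sketch of the versioning ambiguity: under numeric comparison,
# "0.90" reads as the 90th minor release of 0.x, not as "0.9.0".
def parse(version):
    """Parse a dotted version string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))

assert parse("0.90") > parse("0.9.0")   # (0, 90) sorts far after (0, 9, 0)
assert parse("0.9.0") < parse("0.83")   # written as 0.9.0, it would sort BEFORE 0.83
```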

+1

@CAM-Gerlach We'll consider it when we release 1.0. On the other hand, we don't want to rush to 1.0. We want 1.0 to be a milestone of some sort, in terms of features, stability, and performance.

Thanks for the explanation, @hcho3 .

You probably want to make sure you set the python_requires argument to '>=3.5' in setup() to ensure users with Python 2 don't get upgraded to an incompatible version accidentally.

@hcho3 External memory is not available with GPU algorithms

@hlbkin You are right. External memory will be available only for GPU predictor, not training.

@rongou @sriramch Am I correct that GPU training isn't available with external memory?

@hcho3 yes, you are correct. We are working on it; the changes are here if you are interested. I'll have to sync this change with master and write some tests.

@sriramch Awesome! Should we aim to include external memory training in the 0.90 release, or should we come back to it after 0.90?

just my two cents: let's hold back on packing many new features into 0.x in a rush, and consider what should go into 1.0 as a milestone version

@CodingCat I agree. FYI, I deleted distributed customized objective from 0.90 roadmap, since there was substantial disagreement in #4280. We'll consider it again after 0.90.

@sriramch Let's consider external memory training after 0.90 release. Thanks a lot for your hard work.

This might be a good time to release the CUDA 9.0 binaries instead of 8.0. I think 9.0 will now be sufficiently supported by users' driver versions. Additionally, the 9.0 binaries will not need to be JIT-compiled for the newer Volta architectures.

@hcho3 are we ready to go?

Almost. I think #4438 should be merged.

All good now. I will go ahead and start working on the next release. ETA: May 16, 2019

  • [x] Require Python 3 in setup.py
  • [x] Change CI to build CUDA 9.0 wheels (#4459)
  • [x] Fix Windows compilation (#4463)
  • [x] Set up a minimal viable CI for Windows with GPU (#4463)

@RAMitchell Should we use CUDA 9.0 or 9.2 for wheel releases?

Let's use 9.2, as that is already set up on CI. The danger is that we require NVIDIA drivers that are too new. For reference, here is the table showing the correspondence between CUDA versions and drivers: https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility__table-toolkit-driver

As far as I know, this should not impact CPU algorithms in any way. If users begin to report issues, we can address this in the future with better error messages around driver compatibility.

Hmm, in that case I can try downgrading one of the CI workers to CUDA 9.0. Since we are using Docker containers extensively, it should not be too difficult.

I'm going to prepare 0.90 release now. My goal is to have the release note complete by end of this week.

Closed by #4475
