Xgboost: Model produced in 1.0.0 cannot be loaded into 0.90

Created on 22 Feb 2020  ·  74 Comments  ·  Source: dmlc/xgboost

Following the instructions here: https://xgboost.readthedocs.io/en/latest/R-package/xgboostPresentation.html

> install.packages("drat", repos="https://cran.rstudio.com")
trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.6/drat_0.1.5.zip'
Content type 'application/zip' length 87572 bytes (85 KB)
downloaded 85 KB

package ‘drat’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
        C:\Users\lee\AppData\Local\Temp\RtmpiE0N3D\downloaded_packages
> drat:::addRepo("dmlc")
> install.packages("xgboost", repos="http://dmlc.ml/drat/", type = "source")
Warning: unable to access index for repository http://dmlc.ml/drat/src/contrib:
  Line starting '<!DOCTYPE html> ...' is malformed!
Warning message:
package ‘xgboost’ is not available (for R version 3.6.0) 

It also fails on R 3.6.2 with the same error.

Note: I would much prefer to use the CRAN version. But models I train on Linux and Mac and save using the saveRDS function don't predict on another system (Windows); they just produce numeric(0). If anyone has any guidelines on how to save an XGBoost model for use on other computers, please let me know. I've tried xgb.save.raw and xgb.load; both produce numeric(0) as well. But on the computer I trained the model on (a month ago), readRDS in R works just fine. Absolutely baffling to me.

All 74 comments

Did you try install.packages("xgboost")?

That works just fine, but then all predictions from saved models (using saveRDS) produce numeric(0). It seems that models trained using openMP don't work on computers without that framework. My hope was that compiling from source would resolve this.

This is how you compile XGBoost from the source:

mkdir build
cd build
cmake .. -G"Visual Studio 14 2015 Win64" -DR_LIB=ON
cmake --build . --target install --config Release

Can you share the model file so that we can try to find why you are getting a 0 for prediction?

Thanks - do you want the R object or would you prefer it converted via xgb.save.raw?

Both

Thank you for your help and support (and with making such an extraordinary learning program).

Attached is a zipped .rdata file. On 3 systems (Mac, MacBook, Ubuntu), I can run the following code:

modelList <- readRDS("<path_to_folder>/modelList.rdata")
attach(modelList)
predictions_caret <- predict(object=caretModel, newdata=caretModel$trainingData[,!colnames(caretModel$trainingData) %in% ".outcome"])
predictions_xgbbase <- predict(object=xgbModel, newdata=as.matrix(caretModel$trainingData[,!colnames(caretModel$trainingData) %in% ".outcome"]))

And get the following output:

> predictions_caret <- predict(object=caretModel, newdata=caretModel$trainingData[,!colnames(caretModel$trainingData) %in% ".outcome"])
[23:12:50] WARNING: ../..//amalgamation/../src/objective/regression_obj.cu:167: reg:linear is now deprecated in favor of reg:squarederror.
> head(predictions_caret)
[1] 14.0180358887 13.0352602005 12.9208145142 13.2430124283 13.6698570251
[6] 12.9033651352

> predictions_xgbbase <- predict(object=xgbModel, newdata=as.matrix(caretModel$trainingData[,!colnames(caretModel$trainingData) %in% ".outcome"]))
[23:16:17] WARNING: ../..//amalgamation/../src/objective/regression_obj.cu:167: reg:linear is now deprecated in favor of reg:squarederror.
> head(predictions_xgbbase)
[1] 14.0180358887 13.0352602005 12.9208145142 13.2430124283 13.6698570251
[6] 12.9033651352

On system 4 (windows base R with xgboost installed from CRAN) I get very different outcomes:

> predictions_caret <- predict(object=caretModel, newdata=caretModel$trainingData[,!colnames(caretModel$trainingData) %in% ".outcome"])
[23:18:04] WARNING: amalgamation/../src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
> head(predictions_caret)
numeric(0)

> predictions_xgbbase <- predict(object=xgbModel, newdata=as.matrix(caretModel$trainingData[,!colnames(caretModel$trainingData) %in% ".outcome"]))
[23:18:41] WARNING: amalgamation/../src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
> head(predictions_xgbbase)
numeric(0)

I suspect this may need a new issue opened, but I appreciate any guidance you can give for why the same code would behave differently on different systems.

Lastly, xgbRaw is included (should be callable if you use attach()), it was generated with the following code:

xgbRaw=xgb.save.raw(xgb.Booster.complete(caretModel$finalModel))

Thank you once again.

modelList.rdata.zip

Probably due to our dmlc guard against omp exception?

@leedrake5 Does your machine not support OpenMP?

@hcho3 Did you find the cause? I just compiled XGBoost without OpenMP (with some patches to CMakeLists), external memory tests are failing as expected, also hist parallel group test is failing (which I fixed in local branch). But so far I can get perfect prediction values.

@hcho3 My machine supports openmp. But we are deploying these models on an industrial scale, and lots of the computers at specific sites don't have openmp. We just discovered this after 6 months of prep and one week till implementation, so we are scrambling to figure out why the models work on some computers and not others.

If there's some philosophy about why models should only work on their construction environment, would there be some way to get an informative error about why?

@leedrake5 It's a bug. I have been trying to reproduce it on Ubuntu with openmp disabled but so far no luck.

@trivialfis If it helps, most of my colleagues and I are experiencing the bug on Windows. Though a colleague of mine was able to reproduce it on Ubuntu without openmp.

I suspect a minimal installation of R with no fancy stuff (openmp, intelmkl, openblas, etc.) may reproduce it, but I'm still wrapping my head around what part of the system is causing problems.

@leedrake5 If there's a way to reproduce it on Linux distributions it would be of great help!

I disabled all options including google test, omp ... But still have the correct result loading your models. The installation of R is the default distribution from apt, I don't think it has anything to do with XGBoost prediction.

It might not be caused by OpenMP. Even when the prediction goes really wrong, there should still be the global bias in the output, not 0.

The only reason I can think of is XGBoost is not actually being installed. I think we run tests on CRAN right? @hcho3

I tried both CMake build and autotools build. This is my compilation flags reported by install.packages(...) (autotools build):

g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I../..//include -I../..//dmlc-core/include -I../..//rabit/include -I../../ -DXGBOOST_STRICT_R_MODE=1 -DDMLC_LOG_BEFORE_THROW=0 -DDMLC_ENABLE_STD_THREAD=1 -DDMLC_DISABLE_STDIN=1 -DDMLC_LOG_CUSTOMIZE=1 -DXGBOOST_CUSTOMIZE_LOGGER=1 -DRABIT_CUSTOMIZE_MSG_ -DRABIT_STRICT_CXX98_    -DDMLC_CMAKE_LITTLE_ENDIAN=1 -pthread -fpic  -g -O2 -fdebug-prefix-map=/build/r-base-el7SHG/r-base-3.5.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c ../..//amalgamation/xgboost-all0.cc -o ../..//amalgamation/xgboost-all0.o

@trivialfis Even more curious, as XGBoost installs successfully from CRAN, and library(xgboost) works without any problem.

Though a colleague of mine was able to reproduce it on ubuntu without openmp.

Is it possible that he/she can kindly share how to reproduce?

Not sure they know enough about how linux systems work - but if you know of any diagnostics they can provide let me know.

For my part, I am trying to create a virtual box ubuntu installation that can replicate the problem - if successful I will send a download link for that.

I've successfully reproduced the issue on linux - see attached screenshot.

I am uploading the virtual box to dropbox. But in case you'd prefer to install from scratch, instructions to replicate are as follows:

  1. Download and install virtualbox
  2. Download a build of ubuntu bionic beaver (I used ubuntu-18.04.3-desktop-amd64).
  3. Follow instructions here: https://medium.com/@mannycodes/installing-ubuntu-18-04-on-mac-os-with-virtualbox-ac3b39678602. I used a fixed 15 gigabyte virtual drive
  4. Configure for R 3.6.2 by the following:
sudo apt-get update
sudo apt-get install vim
sudo vim /etc/apt/sources.list
i
deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/
<esc> :wq
gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
gpg -a --export E298A3A825C0D65DFD57CBB651716619E084DAB9 | sudo apt-key add -
sudo apt-get update
sudo apt-get install r-base-dev
  5. Install packages in R

install.packages(c("caret", "xgboost"))

  6. Load packages in R
library(caret)
library(xgboost)
  7. Load the data
modelList <- readRDS("~/Desktop/modelList.rdata")
attach(modelList)
  8. Attempt to predict
predict(object=xgbModel, newdata=as.matrix(caretModel$trainingData[,!colnames(caretModel$trainingData) %in% ".outcome"]))

And this returned the now familiar:

[13:00:55] WARNING: amalgamation/../src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
numeric(0)

Screen Shot 2020-02-23 at 12 33 13 PM

Note: I tried loading the libraries explicitly, and also just running the prediction directly. Both resulted in the same numeric(0) response.

Link to 7z compressed virtual instance is here: https://www.dropbox.com/s/ld06zz798m0segf/UbuntuDebug.7z?dl=0

Let me get virtual box first. I usually use KVM.

Looking at this now.

@leedrake5 I managed to reproduce the issue on the virtual machine you provided. It appears that there is an issue with saveRDS and xgb.save.raw. Did you try using xgb.save and xgb.load instead? This set of functions uses the "native" XGBoost model format and does not save attributes that are specific to the R environment. It is useful if you want to transfer your model from the R package to Python.

The save raw should not be a problem. The problem might be in how R serializes the R code. We added a new optional parameter to predict, which might affect loading these kinds of serialization. I tried looking into it, but it seems really quirky, just like the problem with Python pickle.

@trivialfis any ideas on options in the predict function that could potentially circumvent this? My thought is that it has something to do with CPU instructions, possibly deeper in R than can be addressed by the package.

@hcho3 I will try adding xgb.save.raw to the current data pipe and see if that helps. Though in the example I sent, I did try this by using xgb.save.raw on the computer where predictions worked, and it unfortunately still led to numeric(0) on xgb.load. Admittedly, this was after the model had already been saved via saveRDS; perhaps something about that process corrupts it, even though it still works in the environment where it was created.

As far as I can tell, there is still something about the environment where the model was created that needs to be present in every environment it is used. I don't know what it is though.

No, I meant xgb.save, not xgb.save.raw.

@trivialfis Note that the XGBoost package is still 0.90 on CRAN.

I tried xgb.save, loaded it in R via xgb.load(), and same thing, numeric(0). See attached from xgb.save

xgbModelSaved.xgb.zip

Also- tested the same method using the data in modelList earlier in this thread on the same computer that generated the model - it worked there.

No matter how you slice it, xgb.save, xgb.save.raw, saveRDS, etc., there is something about the computer environment the model was created on that precludes its use elsewhere.

I am able to reproduce the issue using the model file you provided. However, I have no idea as to how the model file was generated. Can you provide more context, e.g.

  • OS
  • XGBoost version
  • Command to save the model
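
For anyone hitting the same symptom from Python (as reported later in this thread), the details requested above can be gathered in one step. This is only a minimal sketch using the standard library; the function name environment_report is made up for illustration and is not part of XGBoost.

```python
import json
import platform
import sys
from importlib import metadata


def environment_report(package="xgboost"):
    """Collect the OS, the Python version and the installed package version."""
    try:
        pkg_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        pkg_version = None  # the package is not installed in this environment
    return {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        package: pkg_version,
    }


if __name__ == "__main__":
    # Paste this output into the bug report for each machine involved.
    print(json.dumps(environment_report(), indent=2))
```

Running it on both the machine that trained the model and the machine that fails to predict makes a version mismatch visible right away.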

OS X (Catalina). XGBoost build from source, 1.0 (openMP supported on the system). Model originally saved using saveRDS. I will try rerunning it and saving via different ways and report back.

I regenerated a model from scratch, saved it using xgb.save (i.e. no use of saveRDS or readRDS except for prediction data) and loaded it using xgb.load. Same results: prediction works on the computer system used to generate the model (OS X, XGB 1.0) and fails on another system (Windows 7, XGB 0.9). Attaching it and the modelList to this message. If there is anything else I can do to help diagnose this issue, please let me know.

xgb_model <- xgb.load("Z:/Desktop/newxgbModelSaved.xgb")
modelList <- readRDS("Z:/Desktop/modelList2.rdata")
attach(modelList)
predict(object=xgb_model, newdata=as.matrix(caretModel$trainingData[,!colnames(caretModel$trainingData) %in% ".outcome"]))
numeric(0)

newxgbModelSaved.xgb.zip

modelList2.rdata.zip

@leedrake5 Is it possible to use XGBoost 1.0 on Windows?

@trivialfis This might be a problem with backward compatibility of model files.

It's definitely versioning. I just dropped my 1.0 version of XGBoost, installed 0.9 from CRAN, and the method outlined above worked.

I tried to install 0.90 from source using the latest release, but the dmlc-core folder is empty, so CMake throws the expected error. Would very much like to keep openMP functionality however this goes.

I see. Did you try to install 1.0.0 version on Windows?

That was the initial reason for this thread, oddly enough. Will try, but there will be a delay as the project can't wait any longer. If this is a problem, I suspect that installing 1.0 from source on the ubuntu virtual image I sent would address the same thing.

@leedrake5 Yes, you should match the version of XGBoost. Either use 0.90 everywhere or 1.0.0 everywhere.

@trivialfis I do recall that we made a breaking change in the binary model format. Is that right?

No, we are backward compatible, 1.0 should definitely load model from 0.9

@trivialfis Thanks for clarification. In that case, let’s treat this ticket as a bug, since we expected backward compatibility and found a counterexample. I will look at it this week.

I think the proposed title change has it backwards - model produced in 1.0 can't be used to predict in 0.90. I don't know what degree of compatibility is intended between versions, but at minimum an informative error should help.
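
Until the library itself reports such an error, one workaround is to record the producing version next to the model file and check it before loading. The sketch below is an assumption-laden illustration, not an XGBoost feature: the ".meta.json" sidecar naming and all function names are hypothetical, and the parser assumes purely numeric versions such as 0.90 or 1.0.0.

```python
import json
from pathlib import Path


def parse_version(text):
    """Turn '1.0.0' or '0.90' into a comparable tuple of at least three integers."""
    parts = [int(p) for p in text.split(".")]  # assumes numeric components only
    return tuple(parts + [0] * (3 - len(parts)))


def write_sidecar(model_path, library_version):
    """Record which library version produced the model file (hypothetical layout)."""
    sidecar = Path(str(model_path) + ".meta.json")
    sidecar.write_text(json.dumps({"saved_with": library_version}))
    return sidecar


def check_sidecar(model_path, running_version):
    """Raise a descriptive error if the model is newer than the running library."""
    sidecar = Path(str(model_path) + ".meta.json")
    saved_with = json.loads(sidecar.read_text())["saved_with"]
    if parse_version(saved_with) > parse_version(running_version):
        raise RuntimeError(
            f"Model was saved with version {saved_with} but this environment "
            f"runs {running_version}; loading newer models into older "
            "releases is not supported."
        )
    return saved_with
```

Calling check_sidecar before the real load step turns a silent empty prediction into an explicit failure that names both versions.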

Oops, my bad.

@trivialfis I believe what @leedrake5 is asking is forward compatibility, i.e. loading model from 1.0.0 into 0.90. Not sure if this is what we promise.

@leedrake5 I tried installing 1.0.0 in the Ubuntu virtual machine, and now the RDS file loads fine and I get a vector of real numbers. So most likely it is the version issue.
(Screenshot: 1 0 0)

EDIT. The warning message in the screenshot is odd: Loading model from < 1.0.0, consider saving it again. This shouldn't be here, given that the model was produced by 1.0.0.

I agree. Very glad to have converged on what the problem is.

Unfortunately, from a production standpoint, I've got to make models deployable in the mass version (ergo, CRAN installation). But, I know how to make XGBoost work from here on out, which is the absolute critical point.

I am still having troubles installing 0.9 from source downloaded from releases or the tree, but I think I found the problem. cub, dmlc-core, rabit, all are empty folders in these downloads. They also have @ after the folders - I think that is related to which folders are empty. If I follow those links, I can rebuild folders and continue installation.

See screenshot:
Screen Shot 2020-03-03 at 9 41 33 PM

Yes, we use git submodules, so the build instruction asks you to use git clone --recursive to install from the source. Are you in a situation where CRAN installation is inadequate? If using 0.90 everywhere fixes your problem, you could just use CRAN install everywhere (Linux, Mac, Windows)?

Yes - CRAN doesn't allow for openMP, which for model generation is a huge boon. So will spend some time figuring out how to compile xgboost 0.90 from source to get that working again. I appreciate all of your help on diagnosing this issue.

@leedrake5

Got it. For now, here is the command you can use to get the full source code, including the git submodules:

git clone --recursive https://github.com/dmlc/xgboost -b release_0.90

This should fetch the XGBoost source code and populate rabit, cub, dmlc-core directories. -b release_0.90 is used to get 0.90 version, not latest.
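
A quick way to confirm that a source tree is complete before running CMake is to look for empty submodule directories. The following is a small sketch, assuming the three directory names mentioned in this thread; other releases may use a different set, and the helper name empty_submodules is made up for illustration.

```python
from pathlib import Path

# Submodule directories named in this thread; other releases may differ.
SUBMODULES = ("cub", "dmlc-core", "rabit")


def empty_submodules(source_root, names=SUBMODULES):
    """Return the submodule directories that are missing or contain no entries."""
    root = Path(source_root)
    missing = []
    for name in names:
        path = root / name
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name)
    return missing
```

If the returned list is not empty for an existing git checkout, running git submodule update --init --recursive inside it should populate the directories. Release archives downloaded without git do not carry the submodule contents, which matches the empty folders described above.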

Aside:

CRAN doesn't allow for openMP

This is actually fixed in 1.0.0. I will try to get 1.0.0 version on CRAN soon.

Thank you, much appreciated on all counts!

@hetong007 Can we submit 1.0.0 to CRAN?

@hcho3 Sounds good, given that we have a valid endian detection. I'll make a submission in a week.

@hetong007 Any updates? Let me know if I can help in any way.

@hcho3 oh yes. I'd like to ask if we have the environment for solaris testing? There's an unresolved error on solaris from 0.9: https://www.r-project.org/nosvn/R.check/r-patched-solaris-x86/xgboost-00check.html I'd like to have it checked before the submission.

@hetong007 I just downloaded Solaris VM image from Oracle that can run with VirtualBox. Let me try to reproduce the issue.

@hetong007 I am having lots of trouble setting up the R environment in Solaris. In particular, I cannot install R packages like igraph and testthat because utilities like awk behave differently on Solaris. I have no idea how to proceed further.

@hcho3 Would you please be clearer? I imagine one could install packages in the R console with install.packages('testthat'), given that testthat has passed the Solaris check on CRAN.

@hetong007 The package igraph uses awk in configure step, and Solaris's awk is not feature compatible with GNU awk. It is possible to install GNU awk, but it is called gawk. Still trying to figure out how to get R to use gawk instead of awk.

@hcho3 OK this sounds awful. I have no knowledge in how to proceed either. I just took a look at igraph's result page on CRAN: https://cran.r-project.org/web/checks/check_results_igraph.html. Feels that we are not going to be worse. I'll make a submission regardless.

@hetong007 Thanks. I'll try to get back to Solaris testing at some point. I'm sure there is a way.

@hetong007 Ah yes, make sure to use the branch release_1.0.0 when submitting the code.

@hcho3 The pre-check warns:

* checking whether package 'xgboost' can be installed ... WARNING
Found the following significant warnings:
  amalgamation/../src/common/hist_util.cc:666:72: warning: 'void* memset(void*, int, size_t)' clearing an object of non-trivial type 'struct xgboost::tree::GradStats'; use assignment or value-initialization instead [-Wclass-memaccess]
See 'd:/RCompile/CRANguest/R-devel_gcc8/xgboost.Rcheck/00install.out' for details.

and the 00install.out file is here (available in ~72 hours): https://win-builder.r-project.org/v6XJQBjzgMaW/00install.out

Would it be possible to have a quick fix?

@hetong007 Is this a blocker? Memset was used to speed up zeroing out histogram (array of structure). If this is blocking CRAN, I will put a macro guard so that the R package will use ordinary constructor instead.

@hcho3 Yes It is a blocker. The pre-check is automated and the submission should pass without warning. The macro sounds good enough, thanks!

@hcho3 Could you please try using std::fill first? Even if the compiler is not able to optimize it into memset, I don't think it's critical enough to have any visible performance impact.

@hetong007 I made the necessary change (https://github.com/dmlc/xgboost/commit/3550b16a34055bc8ec33bf0b7006448e8c1a4eca). Can you try again?

@trivialfis I'll file a separate pull request to replace memset.

@hcho3 thanks for the fix! If nothing serious goes wrong, it is on the way to CRAN.

@hcho3 So it fails on Solaris, from Prof. Ripley:

On Solaris it was quicker to diagnose the problems with the GCC compilers:

amalgamation/../src/common/io.cc:120:27: error: ‘POSIX_FADV_SEQUENTIAL’
was not declared in this scope
    posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

amalgamation/../src/common/io.cc:120:48: error: ‘posix_fadvise’ was not
declared in this scope
    posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

As 'Writing R Extensions' told you, you need to use configure to test
for the presence of non-standard C/C++ functions.  In this case, it
seems that line is not actually necessary.

@hetong007 Nice. Please thank Prof. Ripley for taking the time to debug the problem. I will go ahead and guard that line so that it runs only in a proper Linux environment, since we only have proper facilities to test XGBoost on Windows and Linux. Ideally, non-standard C++ facilities should not be used unless they are first tested as part of CI. I will also have to scan the codebase for other non-standard C++ features and try to remove them as well. cc @trivialfis
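
The general idea of guarding an optional platform call can be illustrated outside C++ as well. The sketch below is a Python analogue only, not the actual XGBoost fix: it checks whether the interpreter exposes posix_fadvise before using it, and the helper name advise_sequential is made up for illustration.

```python
import os


def advise_sequential(fd):
    """Hint sequential access if the platform offers posix_fadvise.

    Returns True when the hint was applied and False when it was skipped.
    """
    if not hasattr(os, "posix_fadvise") or not hasattr(os, "POSIX_FADV_SEQUENTIAL"):
        return False  # platform lacks the call; reading still works without it
    try:
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
    except OSError:
        return False  # the hint is optional, so a failure is not fatal
    return True
```

Because the hint only affects read-ahead behaviour, skipping it changes performance at most, never the data that is read. That is consistent with Prof. Ripley's remark that the line is not actually necessary.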

@hcho3 sounds good to me. The deadline from Prof. Ripley for a new submission to CRAN is 2020-04-06. If you are interested in detailed installation log, please visit https://cran.r-project.org/web/checks/check_results_xgboost.html

@hetong007 The commit https://github.com/dmlc/xgboost/commit/d83db4844bae5969609e58cd6bb201e7831cfaa3 should address the problem. I finally managed to get the latest R working with my Solaris VM and run CRAN checks for XGBoost.

@hcho3 Brilliant! Can I take it as you have made it pass the check on Solaris?

Yes, we should submit it now.

Hi, I'm faced with similar issues with Python. I saved the model via xgboost 1.0.1, and loaded it via xgboost 0.9, and the predictions are empty lists. Since our production environment is using xgboost 0.9, I'm wondering if there's any way I can load a saved-via-xgboost-1.0 model into xgboost 0.9?
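
Since the failure mode here is a silent empty result rather than an error, a production pipeline can at least refuse to continue when the number of predictions does not match the number of input rows. This is a minimal sketch with a made-up helper name; it wraps the output of whatever predict call is in use and does not touch XGBoost itself.

```python
def checked_predictions(predictions, n_rows):
    """Return predictions as a list, or raise if the count does not match the input."""
    result = list(predictions)
    if len(result) != n_rows:
        raise ValueError(
            f"Expected {n_rows} predictions but got {len(result)}; "
            "check that the model was saved with a library version no newer "
            "than the one loading it."
        )
    return result
```

The check does not make a newer model loadable in an older release; it only makes the mismatch fail loudly at the point where it first becomes visible.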

@hcho3 Is there a way to share your snapshot? It is valuable for our further submission as well.

@sunhs I'm also facing a similar issue. Were you able to solve this?

@giladmaya Nope man. After trying several methods, I finally decided that retraining the model saved my time.

Closing this, since we are unable to promise that models from newer versions would be supported by older versions. It's like buying a PlayStation 4 and attempting to load a game developed for PlayStation 5.

We promise only backward compatibility, so you can save models using an old version and load them back using a new version. This is so that the model format can gradually evolve over time.
