XGBoost has been developed and used by a group of active community members. Everyone is more than welcome to contribute. It is a way to make the project better and more accessible to more users.
Guidelines
Before submitting, please rebase your code on the most recent version of master. You can do so with:
git remote add upstream https://github.com/dmlc/xgboost
git fetch upstream
git rebase upstream/master
If you have multiple small commits, consider merging them (use git rebase, then squash) into more meaningful groups.
Send the pull request!
First, rebase onto the most recent master
# The first two steps can be skipped after you do it once.
git remote add upstream https://github.com/dmlc/xgboost
git fetch upstream
git rebase upstream/master
Git may report conflicts that it cannot merge automatically, say in conflicted.py.
Manually modify the file to resolve the conflict.
After you have resolved the conflict, mark it as resolved by
git add conflicted.py
Then you can continue the rebase by
git rebase --continue
Finally, push to your fork. You may need to force push here.
git push --force
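The conflict-resolution steps above can be rehearsed safely in a throwaway local repository before trying them on a real branch. The following is a minimal sketch under stated assumptions: the temporary repository, the branch name feature, the commit messages, and the file contents are all made up for illustration, and no remote or push is involved.

```shell
set -eu
# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example Contributor"

echo "base" > conflicted.py
git add conflicted.py
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # default branch name (master or main)

# Two branches change the same line, which guarantees a conflict.
git checkout -q -b feature
echo "feature change" > conflicted.py
git commit -q -am "feature change"
git checkout -q "$main"
echo "upstream change" > conflicted.py
git commit -q -am "upstream change"

git checkout -q feature
git rebase "$main" || true              # stops and reports the conflict

# Resolve by hand (here: simply overwrite), mark as resolved, and continue.
echo "merged change" > conflicted.py
git add conflicted.py
GIT_EDITOR=true git rebase --continue   # keep the original commit message
git log --oneline
```

Setting GIT_EDITOR=true only keeps the script non-interactive; in normal use git opens your configured editor at that step.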
Sometimes we want to combine multiple commits, especially when later commits are only fixes to previous ones, to create a PR with a set of meaningful commits. You can do so with the following steps.
Before doing so, configure git's default editor if you haven't done so before.
git config core.editor the-editor-you-like
Assume we want to merge the last 3 commits. Type the following command:
git rebase -i HEAD~3
It will pop up a text editor. Set the first commit as pick, and change later ones to squash.
After you save the file, another text editor will pop up asking you to modify the combined commit message.
Push the changes to your fork. You need to force push.
git push --force
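To see what the squash does to history, the same steps can be run non-interactively in a scratch repository. This is a sketch, not part of the XGBoost workflow: it assumes GNU sed (for sed -i) and uses git's GIT_SEQUENCE_EDITOR environment variable to edit the rebase todo list in place of a human; the repository, file name, and commit messages are made up.

```shell
set -eu
# Scratch repository with four commits; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example Contributor"
for i in 0 1 2 3; do
  echo "$i" > file.txt
  git add file.txt
  git commit -q -m "commit $i"
done

# Leave the first todo line as "pick" and turn the later ones into "squash",
# which is the edit you would otherwise make by hand in the editor.
# GIT_EDITOR=true accepts the default combined commit message.
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -i HEAD~3

git log --oneline   # the last three commits are now a single commit
```

In day-to-day use you would edit the todo list and the commit message by hand; the environment variables are only there so that the example runs unattended.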
The previous two tips require a force push because we altered the commit history. It is fine to force push to your own fork, as long as the commits changed are only yours.
By default, sanitizers are bundled in GCC and Clang/LLVM. One can enable sanitizers with GCC >= 4.8 or LLVM >= 3.1, but some distributions might package sanitizers separately. The supported sanitizers and their corresponding library names are address sanitizer (libasan), leak sanitizer (liblsan), and thread sanitizer (libtsan).
Memory sanitizer is exclusive to LLVM, hence not supported in XGBoost.
One can build XGBoost with sanitizer support by specifying -DUSE_SANITIZER=ON. By default, address sanitizer and leak sanitizer are used when you turn the USE_SANITIZER flag on. You can always change the default by providing a semicolon separated list of sanitizers to ENABLED_SANITIZERS. Note that thread sanitizer is not compatible with the other two sanitizers.
cmake -DUSE_SANITIZER=ON -DENABLED_SANITIZERS="address;leak" /path/to/xgboost
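Because the thread sanitizer cannot be combined with the other two, it can help to check a sanitizer list before handing it to CMake. The function below is a hypothetical helper, not something XGBoost ships: its name and error message are assumptions, and it only prints the two flags described above for copying into a cmake command line.

```shell
# Hypothetical helper (not part of XGBoost): print the CMake flags for a
# semicolon-separated sanitizer list, rejecting "thread" mixed with
# "address" or "leak".
sanitizer_flags() {
  list=$1
  case ";$list;" in
    *";thread;"*)
      case ";$list;" in
        *";address;"* | *";leak;"*)
          echo "thread sanitizer cannot be combined with address or leak" >&2
          return 1
          ;;
      esac
      ;;
  esac
  printf '%s\n' "-DUSE_SANITIZER=ON -DENABLED_SANITIZERS=\"$list\""
}

sanitizer_flags "address;leak"
sanitizer_flags "thread"
sanitizer_flags "thread;address" || echo "rejected"
```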
By default, CMake searches the regular system paths for sanitizers. You can also supply a specific SANITIZER_PATH.
cmake -DUSE_SANITIZER=ON -DENABLED_SANITIZERS="address;leak" \
  -DSANITIZER_PATH=/path/to/sanitizers /path/to/xgboost
Running XGBoost on CUDA with the address sanitizer (asan) will raise a memory error. To use asan with CUDA correctly, you need to configure asan via the ASAN_OPTIONS environment variable:
ASAN_OPTIONS=protect_shadow_gap=0 ../testxgboost
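If ASAN_OPTIONS is already set in your environment, the one-line form above replaces it for that command. A possible alternative, sketched here on the assumption that you want to keep your existing options, is to append the flag using the colon separator that asan accepts:

```shell
# Append protect_shadow_gap=0 to whatever ASAN_OPTIONS already holds.
# The ${VAR:+...} expansion adds the ":" separator only when the variable
# is set and non-empty.
export ASAN_OPTIONS="${ASAN_OPTIONS:+$ASAN_OPTIONS:}protect_shadow_gap=0"
echo "ASAN_OPTIONS=$ASAN_OPTIONS"
# Then run the test binary as in the command above:
# ../testxgboost
```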
For details, please consult the official documentation for sanitizers.
make lint
We follow Google’s C++ Style guide for C++ code.
You can check the style of the code by typing the following command in the root folder.
make rcpplint
When needed, you can disable the linter warning for a certain line with `// NOLINT(*)` comments.
We use roxygen for documenting the R package.
Rmarkdown vignettes are placed in R-package/vignettes. These Rmarkdown files are not compiled. We host the compiled version on doc/R-package.
The following steps are followed to add a new Rmarkdown vignette:
Add the original rmarkdown to R-package/vignettes.
Modify doc/R-package/Makefile to add the markdown files to be built.
Clone the dmlc/web-data repo to folder doc.
Now type the following command on doc/R-package:
make the-markdown-to-make.md
This will generate the markdown, as well as the figures in doc/web-data/xgboost/knitr.
Modify the doc/R-package/index.md to point to the generated markdown.
Add the generated figure to the dmlc/web-data repo.
git add
Create PR for both the markdown and dmlc/web-data.
You can also build the document locally by typing the following command at the doc directory:
make html
The reason we do this is to keep generated images from bloating the repository.
Since version 0.6.4.3, we have adopted a versioning system that uses an x.y.z (or core_major.core_minor.cran_release) format for CRAN releases and an x.y.z.p (or core_major.core_minor.cran_release.patch) format for development patch versions.
This approach is similar to the one described in Yihui Xie’s blog post on R Package Versioning, except we need an additional field to accommodate the x.y core library version.
Each new CRAN release bumps up the 3rd field, while developments in-between CRAN releases would be marked by an additional 4th field on the top of an existing CRAN release version. Some additional consideration is needed when the core library version changes. E.g., after the core changes from 0.6 to 0.7, the R package development version would become 0.7.0.1, working towards a 0.7.1 CRAN release. The 0.7.0 would not be released to CRAN, unless it would require almost no additional development.
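As a worked illustration of the scheme, a version string can be classified by the number of fields it has: three for a CRAN release, four for a development patch version. The function below is a hypothetical helper written for this page, not a tool in the XGBoost repository.

```shell
# Hypothetical helper: classify an R package version string by field count.
classify_version() {
  case $1 in
    *[!0-9.]* | '' | .* | *. | *..*) echo "invalid" ;;  # stray chars or empty fields
    *.*.*.*.*) echo "invalid" ;;          # more than four fields
    *.*.*.*)   echo "development" ;;      # x.y.z.p
    *.*.*)     echo "cran-release" ;;     # x.y.z
    *)         echo "invalid" ;;          # fewer than three fields
  esac
}

classify_version 0.7.0.1   # development version working towards 0.7.1
classify_version 0.7.1     # CRAN release
```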
According to the R extension manual, it is good practice to register native routines and to disable symbol search. When any changes or additions are made to the C++ interface of the R package, please make corresponding changes in src/init.c as well.