Parameters Tuning¶
This page contains parameter tuning guides for different scenarios.
Tune Parameters for the Leaf-wise (Best-first) Tree¶
LightGBM uses the leaf-wise tree growth algorithm, while many other popular tools use depth-wise tree growth. Compared with depth-wise growth, the leaf-wise algorithm can converge much faster. However, leaf-wise growth may over-fit if not used with the appropriate parameters.
To get good results using a leaf-wise tree, these are some important parameters:
- num_leaves. This is the main parameter to control the complexity of the tree model. Theoretically, we can set num_leaves = 2^(max_depth) to obtain the same number of leaves as a depth-wise tree. However, this simple conversion is not good in practice: for a fixed number of leaves, a leaf-wise tree is typically much deeper than a depth-wise tree, and unconstrained depth can induce over-fitting. Thus, when tuning num_leaves, we should keep it smaller than 2^(max_depth). For example, when max_depth=7 a depth-wise tree can get good accuracy, but setting num_leaves to 127 may cause over-fitting, while setting it to 70 or 80 may give better accuracy than depth-wise growth.
- min_data_in_leaf. This is a very important parameter to prevent over-fitting in a leaf-wise tree. Its optimal value depends on the number of training samples and num_leaves. Setting it to a large value can avoid growing too deep a tree, but may cause under-fitting. In practice, setting it to hundreds or thousands is enough for a large dataset.
- max_depth. You can also use max_depth to limit the tree depth explicitly. A sketch combining these three parameters follows this list.
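As a rough illustration of how these three parameters interact, here is a minimal sketch using the LightGBM Python API. The synthetic data and the specific values (max_depth=7, num_leaves=70, min_data_in_leaf=100) are illustrative assumptions, not recommendations for any particular dataset.

```python
import lightgbm as lgb
import numpy as np

# Synthetic placeholder data; substitute your own features and labels.
rng = np.random.default_rng(0)
X = rng.random((10_000, 20))
y = rng.random(10_000)
train_set = lgb.Dataset(X, label=y)

params = {
    "objective": "regression",   # assumed task for this sketch
    "max_depth": 7,              # explicit depth cap
    "num_leaves": 70,            # well below 2^7 = 128, per the guidance above
    "min_data_in_leaf": 100,     # hundreds or more is typical for large data
}
booster = lgb.train(params, train_set, num_boost_round=100)
```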
For Faster Speed¶
- Use bagging by setting bagging_fraction and bagging_freq
- Use feature sub-sampling by setting feature_fraction
- Use small max_bin
- Use save_binary to speed up data loading in future learning
- Use parallel learning, refer to the Parallel Learning Guide
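Putting the first four settings together, a minimal sketch with the Python API might look like the following; the subsampling fractions and bin count are illustrative assumptions, and the right values depend on your data.

```python
import lightgbm as lgb
import numpy as np

# Synthetic placeholder data; substitute your own features and labels.
rng = np.random.default_rng(0)
X = rng.random((10_000, 20))
y = rng.random(10_000)
train_set = lgb.Dataset(X, label=y)

params = {
    "objective": "regression",
    "bagging_fraction": 0.8,  # train each iteration on 80% of the rows...
    "bagging_freq": 5,        # ...re-sampled every 5 iterations
    "feature_fraction": 0.8,  # consider 80% of the features per tree
    "max_bin": 63,            # coarser feature histograms, faster training
}
booster = lgb.train(params, train_set, num_boost_round=100)

# Save the constructed Dataset once; later runs can load "train.bin"
# directly and skip raw-data parsing and binning.
train_set.save_binary("train.bin")
```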
For Better Accuracy¶
- Use large max_bin (may be slower)
- Use small learning_rate with large num_iterations
- Use large num_leaves (may cause over-fitting)
- Use bigger training data
- Try dart
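For example, here is a hedged sketch of accuracy-oriented settings; the concrete numbers are assumptions for illustration, and dart in particular trades training speed for potential accuracy.

```python
import lightgbm as lgb
import numpy as np

# Synthetic placeholder data; substitute your own features and labels.
rng = np.random.default_rng(0)
X = rng.random((10_000, 20))
y = rng.random(10_000)
train_set = lgb.Dataset(X, label=y)

params = {
    "objective": "regression",
    "boosting": "dart",      # dropout-style boosting; often helps accuracy
    "max_bin": 511,          # finer histograms (may be slower)
    "learning_rate": 0.01,   # small steps...
    "num_leaves": 127,       # more expressive trees; watch for over-fitting
}
# ...compensated by many iterations (num_boost_round sets num_iterations).
booster = lgb.train(params, train_set, num_boost_round=500)
```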
Deal with Over-fitting¶
- Use small max_bin
- Use small num_leaves
- Use min_data_in_leaf and min_sum_hessian_in_leaf
- Use bagging by setting bagging_fraction and bagging_freq
- Use feature sub-sampling by setting feature_fraction
- Use bigger training data
- Try lambda_l1, lambda_l2 and min_gain_to_split for regularization
- Try max_depth to avoid growing a deep tree
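A minimal sketch combining these regularization levers with the Python API follows; every value below is an illustrative assumption to be tuned for your dataset, e.g. by cross-validation.

```python
import lightgbm as lgb
import numpy as np

# Synthetic placeholder data; substitute your own features and labels.
rng = np.random.default_rng(0)
X = rng.random((10_000, 20))
y = rng.random(10_000)
train_set = lgb.Dataset(X, label=y)

params = {
    "objective": "regression",
    "max_bin": 63,                   # coarse histograms
    "num_leaves": 31,                # small trees
    "max_depth": 6,                  # cap depth explicitly
    "min_data_in_leaf": 100,         # each leaf must cover enough samples
    "min_sum_hessian_in_leaf": 10.0, # minimum total hessian per leaf
    "bagging_fraction": 0.8,         # row subsampling...
    "bagging_freq": 5,               # ...re-sampled every 5 iterations
    "feature_fraction": 0.8,         # column subsampling per tree
    "lambda_l1": 0.1,                # L1 regularization
    "lambda_l2": 0.1,                # L2 regularization
    "min_gain_to_split": 0.01,       # require a minimum loss reduction to split
}
booster = lgb.train(params, train_set, num_boost_round=100)
```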