Hacker News
Gradient Boosted Decision Trees (simonwardjones.co.uk)
150 points by simonwardjones on Oct 6, 2020 | 7 comments


Very well written article! For further reading, I would also recommend diving into the conceptual overview of LightGBM, a gradient boosting framework. It features some interesting optimization techniques for better overall performance.

https://github.com/microsoft/LightGBM/blob/master/docs/Featu...


see also: Friedman's 1999 paper - "Greedy Function Approximation: A Gradient Boosting Machine" https://projecteuclid.org/download/pdf_1/euclid.aos/10132034...


Nice write-up! Any info on the circumstances under which gradient boosted trees perform better than traditional random forests?


(This answer is from limited practical experience about 10 years ago, but at least the theory doesn't go out of date.)

A random forest is less prone to overfitting because each tree in the ensemble is independent: if the base tree doesn't overfit, then a random forest of such trees also won't overfit. Trees in a boosted model, by contrast, are not independent; boosting trains a sequence of models in which model n depends on the previous models.

This is a double-edged sword: you can probably get better predictive accuracy with boosting if you have enough data and controls in place to prevent overfitting. A random forest is much more idiot-proof with respect to overfitting, but it will not perform as well as a boosted model that is trained on a large dataset without overfitting.
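
To make the "model n depends on the previous models" point concrete, here is a minimal sketch (not taken from the article; the dataset, hyperparameters, and helper function are illustrative assumptions) that fits a scikit-learn random forest of independent trees next to a hand-rolled boosting loop where each tree is fit to the residuals of the ensemble built so far:

    # Sketch only: contrasts independent trees (random forest) with a
    # sequential, residual-fitting ensemble (gradient boosting).
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Random forest: every tree is fit independently on a bootstrap sample.
    forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

    # Hand-rolled boosting: tree n is fit to the residuals left by trees 1..n-1,
    # so every stage depends on the stages before it.
    learning_rate, trees = 0.1, []
    prediction = np.full(y_train.shape, y_train.mean())
    for _ in range(100):
        residuals = y_train - prediction
        tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_train, residuals)
        trees.append(tree)
        prediction += learning_rate * tree.predict(X_train)

    def boosted_predict(X_new):
        pred = np.full(X_new.shape[0], y_train.mean())
        for tree in trees:
            pred += learning_rate * tree.predict(X_new)
        return pred

    print("forest MSE :", mean_squared_error(y_test, forest.predict(X_test)))
    print("boosted MSE:", mean_squared_error(y_test, boosted_predict(X_test)))

With enough boosting rounds the sequential model can keep driving training error down, which is exactly why it needs controls (learning rate, tree depth, early stopping) that the forest largely doesn't.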


Why not also predict the error of the error prediction, and correct for that too? /s


You mock, but overfitting is how most results are realized.


The whole series is pretty damn good so far, despite the use (bordering on abuse) of emojis, but that's a personal take.

What other ML topics do you plan to cover, Simon?



