CADET: Interpretable Parametric Conditional Density Estimation with Decision Trees and Forests
Cyrus Cousins and Matteo Riondato
Abstract
We introduce CADET, an algorithm for parametric Conditional Density Estimation (CDE) based on decision trees and random forests. CADET uses the empirical cross entropy impurity criterion for tree growth, which incentivizes splits that improve predictive accuracy more than the regression criteria or estimated mean integrated squared error used in previous works. By using sufficient statistics for all computations, CADET also admits more efficient training and query procedures than existing tree-based CDE approaches, and stores only a bounded amount of information at each tree leaf. Previous tree-based CDE techniques produce complicated, uninterpretable distribution objects, whereas CADET may be instantiated with easily interpretable distribution families, making every part of the model easy to understand. Our experimental evaluation on real datasets shows that CADET usually learns more accurate, smaller, and more interpretable models, and is less prone to overfitting than existing tree-based CDE approaches.
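To make the idea of a cross entropy impurity computed from sufficient statistics concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a univariate Gaussian leaf family, a single numeric feature, and a minimum leaf size; the names `GaussStats` and `best_split` are hypothetical. Each leaf keeps only three numbers (count, sum, sum of squares), and a candidate split is scored by the total negative log-likelihood of the maximum-likelihood Gaussians fitted on each side.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GaussStats:
    """Sufficient statistics of a univariate Gaussian (assumed leaf family)."""
    n: int = 0        # number of samples
    s1: float = 0.0   # sum of y
    s2: float = 0.0   # sum of y^2

    def add(self, y):
        return GaussStats(self.n + 1, self.s1 + y, self.s2 + y * y)

    def sub(self, other):
        return GaussStats(self.n - other.n, self.s1 - other.s1, self.s2 - other.s2)

    def total_cross_entropy(self, min_var=1e-9):
        """Sum of negative log-likelihoods (nats) under the fitted Gaussian.

        The variance is floored at min_var (an assumption of this sketch)
        so that degenerate leaves do not produce log(0).
        """
        if self.n == 0:
            return 0.0
        mean = self.s1 / self.n
        ssd = max(self.s2 - self.n * mean * mean, 0.0)  # sum of squared deviations
        var = max(ssd / self.n, min_var)
        return 0.5 * self.n * math.log(2 * math.pi * var) + ssd / (2 * var)


def best_split(xs, ys, min_leaf=2):
    """Return (impurity, threshold) minimizing the summed cross entropy.

    A threshold of None means no admissible split beats leaving the node whole.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    total = GaussStats()
    for y in ys:
        total = total.add(y)

    best = (total.total_cross_entropy(), None)
    left = GaussStats()
    for k in range(len(order) - 1):
        i, j = order[k], order[k + 1]
        left = left.add(ys[i])          # O(1) update per candidate split
        right = total.sub(left)         # right side obtained by subtraction
        if xs[i] == xs[j] or left.n < min_leaf or right.n < min_leaf:
            continue
        ce = left.total_cross_entropy() + right.total_cross_entropy()
        if ce < best[0]:
            best = (ce, 0.5 * (xs[i] + xs[j]))
    return best


xs = [0, 1, 2, 3, 10, 11, 12, 13]
ys = [0.0, 0.1, -0.1, 0.05, 5.0, 5.1, 4.9, 5.05]
impurity, threshold = best_split(xs, ys)
print(threshold)
```

Because the right-hand statistics are obtained by subtracting the left-hand ones from the node total, every candidate threshold is scored in constant time after one sort. This is the kind of efficiency and bounded per-leaf storage that working with sufficient statistics makes possible.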
Keywords
Interpretable Machine Learning ♦ Decision Trees ♣ Conditional Density Estimation ♥ Parametric Methods ♠ Cross Entropy Minimization