This directory contains runnable examples demonstrating OpenBoost's capabilities.

```bash
# Run any example
uv run python examples/basic_regression.py

# Or with standard Python
python examples/basic_regression.py
```

| Example | Description | Key Features |
|---|---|---|
| basic_regression.py | Standard gradient boosting for regression | GradientBoosting, callbacks, feature importance |
| binary_classification.py | Binary classification with probability outputs | OpenBoostClassifier, ROC AUC, calibration |
| multiclass_classification.py | Multi-class classification with softmax | MultiClassGradientBoosting, confusion matrix |
| uncertainty_quantification.py | Probabilistic predictions with uncertainty | NaturalBoostNormal, prediction intervals, CRPS |
| kaggle_insurance.py | Insurance claims with Tweedie distribution | NaturalBoostTweedie, zero-inflated data |
| kaggle_sales.py | Sales forecasting with Negative Binomial | NaturalBoostNegBin, overdispersed counts |
| custom_loss.py | Custom loss functions | Quantile, Huber, asymmetric losses |
| gpu_training.py | GPU acceleration guide | Backend selection, benchmarking |
| gam_explainability.py | Interpretable GAM models | OpenBoostGAM, shape functions |
| sklearn_pipeline.py | sklearn Pipeline integration | Pipeline, GridSearchCV, preprocessing |
| model_persistence.py | Saving and loading models | save(), load(), checkpointing |

`basic_regression.py`: Learn the fundamentals of OpenBoost with a standard regression task.

Topics covered:

- Training `GradientBoosting` with various hyperparameters
- Using callbacks (`EarlyStopping`, `Logger`)
- Computing feature importances
- sklearn-compatible API with `OpenBoostRegressor`
- Cross-validation utilities
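
Early stopping is usually driven by a patience counter on a validation metric. The sketch below is plain Python and only illustrates that general idea; `best_round_with_patience` and its parameters are hypothetical names, not OpenBoost's `EarlyStopping` API, whose exact arguments are shown in `basic_regression.py`.

```python
def best_round_with_patience(val_losses, patience=3, min_delta=0.0):
    """Return (best_round, stopped_round) for a sequence of validation losses.

    Training stops once the loss has failed to improve by more than
    `min_delta` for `patience` consecutive rounds.
    """
    best_loss = float("inf")
    best_round = -1
    rounds_without_improvement = 0
    for i, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_loss = loss
            best_round = i
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement >= patience:
                return best_round, i
    # Ran out of rounds before patience was exhausted.
    return best_round, len(val_losses) - 1


# Hypothetical validation losses, one per boosting round.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
best, stopped = best_round_with_patience(losses, patience=3)
```

Note the trade-off this exposes: with `patience=3` the run stops before reaching the late improvement in the final round, so a larger patience trades training time for a better chance of escaping plateaus.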

`binary_classification.py`: Train a binary classifier with probability calibration analysis.

Topics covered:

- Binary classification with the `logloss` objective
- `OpenBoostClassifier` sklearn wrapper
- ROC AUC, precision, recall, F1 metrics
- Calibration analysis (Brier score, expected calibration error)
- Out-of-fold probability predictions
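
The two calibration metrics named above can be computed directly from labels and predicted probabilities. This is a minimal standard-library sketch of the usual definitions (equal-width bins for ECE); the function names are illustrative and are not taken from OpenBoost.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Weighted average gap between mean predicted probability and observed
    positive rate, computed over equal-width probability bins."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(y for y, _ in members) / len(members)
        confidence = sum(p for _, p in members) / len(members)
        ece += len(members) / len(y_true) * abs(observed - confidence)
    return ece


# Tiny illustrative data set (not from the example script).
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.6, 0.9]
brier = brier_score(y_true, y_prob)
ece = expected_calibration_error(y_true, y_prob)
```

Lower is better for both metrics. ECE depends on the binning scheme, so compare values only when they were computed with the same `n_bins`.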

`uncertainty_quantification.py`: NaturalBoost predicts full probability distributions, not just point estimates.

Topics covered:

- Training `NaturalBoostNormal` for probabilistic predictions
- Prediction intervals (90%, 80%, 50%)
- Quantile predictions
- Sampling from predicted distributions
- Proper scoring rules (CRPS, NLL)
- Heteroscedastic uncertainty
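
Once a model outputs a mean and standard deviation per row, prediction intervals and the CRPS follow from standard Normal-distribution formulas. The sketch below uses only the standard library and assumes the predicted parameters are already available as plain numbers; how `NaturalBoostNormal` exposes them is not shown here, and the helper names are hypothetical.

```python
import math
from statistics import NormalDist


def normal_interval(mean, std, coverage=0.9):
    """Central prediction interval of Normal(mean, std) with given coverage."""
    dist = NormalDist(mean, std)
    tail = (1.0 - coverage) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def normal_crps(mean, std, y):
    """Closed-form CRPS of a Normal(mean, std) forecast for observation y."""
    z = (y - mean) / std
    standard = NormalDist()
    return std * (
        z * (2.0 * standard.cdf(z) - 1.0)
        + 2.0 * standard.pdf(z)
        - 1.0 / math.sqrt(math.pi)
    )


# Two hypothetical rows with the same mean but different predicted spread.
narrow = normal_interval(mean=10.0, std=1.0, coverage=0.9)
wide = normal_interval(mean=10.0, std=2.0, coverage=0.9)
score = normal_crps(mean=10.0, std=2.0, y=11.0)
```

The two intervals illustrate heteroscedastic uncertainty: rows with a larger predicted standard deviation get proportionally wider intervals at the same coverage level.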

`kaggle_insurance.py`: Tweedie distribution for insurance claim prediction (similar to the Porto Seguro and Allstate Kaggle competitions).

Topics covered:

- `NaturalBoostTweedie` for zero-inflated positive continuous data
- Risk segmentation analysis
- Probability of large claims
- Individual risk assessment
- Comparison with simple MSE model
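
A Tweedie distribution with power between 1 and 2 is a compound Poisson-gamma, which is why it can put real probability mass on exactly zero claims. The sketch below computes that zero probability from the mean `mu`, dispersion `phi`, and `power`; these parameter names are assumptions for illustration, and the way `NaturalBoostTweedie` exposes its predicted parameters may differ.

```python
import math


def tweedie_zero_probability(mu, phi, power):
    """P(Y = 0) for a Tweedie distribution with 1 < power < 2.

    In this range the number of claims is Poisson with rate
    mu**(2 - power) / (phi * (2 - power)), and the total is zero
    exactly when there are no claims.
    """
    if not 1.0 < power < 2.0:
        raise ValueError("power must be strictly between 1 and 2")
    rate = mu ** (2.0 - power) / (phi * (2.0 - power))
    return math.exp(-rate)


# Hypothetical policies: a low-risk and a high-risk expected claim amount.
p_zero_low_risk = tweedie_zero_probability(mu=1.0, phi=1.0, power=1.5)
p_zero_high_risk = tweedie_zero_probability(mu=4.0, phi=1.0, power=1.5)
```

A model trained with plain MSE has no equivalent of this quantity, which is one reason a distributional model is a better fit for claims data with many zeros.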

`kaggle_sales.py`: Negative Binomial for sales and demand forecasting (similar to the Rossmann and Bike Sharing Kaggle competitions).

Topics covered:

- `NaturalBoostNegBin` for overdispersed count data
- Inventory planning (service levels)
- Day-of-week and promotional effects
- Probability of high demand
- Comparison with Poisson model
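
Inventory planning with a count distribution amounts to picking the smallest stock level whose cumulative probability reaches the target service level. The sketch below assumes a Negative Binomial parameterized by mean `mu` and dispersion `r` (variance `mu + mu**2 / r`); that parameterization and the function names are assumptions for illustration, not OpenBoost's API.

```python
import math


def negbin_pmf(k, mu, r):
    """P(demand = k) for a Negative Binomial with mean mu and dispersion r."""
    log_p = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def stock_for_service_level(mu, r, service_level=0.95, max_units=100_000):
    """Smallest stock level k such that P(demand <= k) >= service_level."""
    cumulative = 0.0
    for k in range(max_units + 1):
        cumulative += negbin_pmf(k, mu, r)
        if cumulative >= service_level:
            return k
    raise ValueError("service level not reached within max_units")


# Same mean demand, different amounts of overdispersion.
stock_overdispersed = stock_for_service_level(mu=20.0, r=2.0)
stock_near_poisson = stock_for_service_level(mu=20.0, r=50.0)
```

As `r` grows the distribution approaches a Poisson with the same mean, so the gap between the two stock levels shows how much extra inventory overdispersion demands.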

`custom_loss.py`: Build any loss function you need.
Topics covered:
- Quantile regression for different percentiles
- Huber loss for outlier robustness
- Asymmetric loss for business costs
- Log-cosh smooth approximation
- How to write custom loss functions
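
Gradient boosting libraries typically consume a custom loss as first and second derivatives with respect to the prediction. The sketch below writes those derivatives for three of the losses listed above in plain Python. The `(y_true, y_pred) -> (grad, hess)` signature is an assumption based on common practice; `custom_loss.py` shows the interface OpenBoost actually expects.

```python
import math


def quantile_grad_hess(y_true, y_pred, alpha=0.9):
    """Pinball loss derivatives. The true Hessian is zero almost
    everywhere, so a constant 1.0 is used as a common workaround."""
    grad = [(1.0 - alpha) if p > y else -alpha for y, p in zip(y_true, y_pred)]
    hess = [1.0] * len(y_true)
    return grad, hess


def huber_grad_hess(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for small residuals, linear beyond delta."""
    grad, hess = [], []
    for y, p in zip(y_true, y_pred):
        r = p - y
        if abs(r) <= delta:
            grad.append(r)
            hess.append(1.0)
        else:
            grad.append(delta if r > 0 else -delta)
            hess.append(0.0)  # exact value; libraries often floor this
    return grad, hess


def logcosh_grad_hess(y_true, y_pred):
    """Log-cosh loss: a smooth approximation to absolute error."""
    grad = [math.tanh(p - y) for y, p in zip(y_true, y_pred)]
    hess = [1.0 - g * g for g in grad]
    return grad, hess


q_grad, q_hess = quantile_grad_hess([1.0, 1.0], [2.0, 0.0], alpha=0.9)
h_grad, h_hess = huber_grad_hess([0.0, 0.0, 0.0], [0.5, 3.0, -3.0], delta=1.0)
l_grad, l_hess = logcosh_grad_hess([0.0], [0.0])
```

The quantile loss with `alpha=0.9` penalizes under-prediction nine times more heavily than over-prediction, which is the same mechanism an asymmetric business-cost loss relies on.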

`gpu_training.py`: Get the most out of GPU acceleration.
Topics covered:
- Automatic GPU detection
- Manual backend selection
- Performance benchmarking
- Best practices for GPU training
- Multi-GPU training overview
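
When benchmarking, it helps to discard a warm-up run, because JIT compilation (as with numba) makes the first call much slower than later ones. The helper below is a generic standard-library sketch; `benchmark` is a hypothetical name, and the lambda stands in for whatever training call is being timed.

```python
import time


def benchmark(fn, warmup=1, repeats=5):
    """Time fn() after warm-up runs; return (best, mean) in seconds."""
    for _ in range(warmup):
        fn()  # untimed: absorbs one-off costs such as JIT compilation
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings), sum(timings) / len(timings)


# Stand-in workload; replace with the training call under test.
best_seconds, mean_seconds = benchmark(lambda: sum(i * i for i in range(10_000)))
```

GPU work is often launched asynchronously, so the timed function should only return once results are actually available (for example, after predictions have been copied back to the host). Otherwise the measurement covers the launch and not the computation.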

`gam_explainability.py`: Interpretable machine learning with `OpenBoostGAM`.
Topics covered:
- Training interpretable GAM models
- Visualizing shape functions
- Per-feature contribution analysis
- Explaining individual predictions
- Trade-offs vs black-box models
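
A GAM prediction is an intercept plus one shape-function value per feature, which is what makes per-feature contributions easy to read off. The sketch below models each shape function as piecewise constant, a common form for boosted GAMs. It is an illustration only: `ShapeFunction` and `explain` are hypothetical names, and `OpenBoostGAM` may represent its shape functions differently.

```python
from bisect import bisect_right


class ShapeFunction:
    """Piecewise-constant function of a single feature.

    `thresholds` split the feature axis; `values[i]` applies to the
    i-th interval, so there is one more value than thresholds.
    """

    def __init__(self, thresholds, values):
        if len(values) != len(thresholds) + 1:
            raise ValueError("need exactly len(thresholds) + 1 values")
        self.thresholds = list(thresholds)
        self.values = list(values)

    def __call__(self, x):
        return self.values[bisect_right(self.thresholds, x)]


def explain(intercept, shape_functions, row):
    """Return (prediction, per-feature contributions) for one row."""
    contributions = {name: f(row[name]) for name, f in shape_functions.items()}
    prediction = intercept + sum(contributions.values())
    return prediction, contributions


# Hypothetical two-feature model.
shapes = {
    "age": ShapeFunction([30, 60], [-1.0, 0.0, 2.0]),
    "income": ShapeFunction([50_000], [0.5, -0.5]),
}
prediction, contributions = explain(10.0, shapes, {"age": 45, "income": 80_000})
```

Because features never interact in this form, each contribution can be inspected or plotted independently. That restriction is also the main source of the accuracy trade-off against unrestricted boosted trees.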

All examples work with the base OpenBoost installation:

```bash
pip install openboost
```

Some examples benefit from optional dependencies:

```bash
# For sklearn integration examples
pip install scikit-learn

# For visualization
pip install matplotlib

# For GPU examples
pip install numba  # CUDA support included
```

From the command line:

```bash
# From the repository root
cd openboost

# Run with uv
uv run python examples/basic_regression.py

# Or standard Python
python examples/basic_regression.py
```

In Jupyter or Colab:

```python
# Copy-paste code from examples into Jupyter/Colab cells
import openboost as ob

model = ob.GradientBoosting(n_trees=100)
model.fit(X_train, y_train)
```

On Modal:

```python
# Examples work on cloud GPU instances
import modal

app = modal.App()

@app.function(gpu="A100")
def train_model():
    import openboost as ob
    # ... example code ...
```

- Start simple: Begin with `basic_regression.py` to understand the API
- Check GPU: Run `gpu_training.py` to verify GPU setup
- Explore uncertainty: `uncertainty_quantification.py` shows NaturalBoost's unique value
- For Kaggle: `kaggle_insurance.py` and `kaggle_sales.py` are ready-to-adapt templates
- Custom needs: `custom_loss.py` shows how to extend OpenBoost

Example won't run?

- Ensure OpenBoost is installed: `pip install openboost`
- For sklearn examples: `pip install scikit-learn`

GPU not detected?

- Check that the NVIDIA driver can see the GPU: `nvidia-smi`
- Ensure numba is installed: `pip install numba`
- See `gpu_training.py` for debugging tips

Plots not showing?

- Install matplotlib: `pip install matplotlib`
- In headless environments, plots are saved to files instead of displayed

Have a cool example to share? PRs welcome!
Guidelines:
- Self-contained (generates synthetic data or uses sklearn datasets)
- Well-commented
- Demonstrates a clear use case
- Follows existing style