Ted Lentsch1, Santiago Montiel-Marín2, Holger Caesar2, and Dariu Gavrila1
1 Technical University of Delft, 2 University of Alcalá
📝 arXiv · 🤗 Hugging Face · 📚 Cite us!
- [2026-06-05] Poster presented at CVPR 2026!
- [2026-05-29] Code released on GitHub!
- [2026-05-28] Model weights released on Hugging Face!
- [2026-03-28] Paper released on arXiv (v1)!
- [2026-02-21] TerraSeg has been accepted to CVPR 2026!
▶︎ 90-second video on YouTube
TerraSeg is the first self-supervised, domain-agnostic model for LiDAR ground segmentation. It is trained on the OmniLiDAR dataset (~22M raw scans aggregated from 12 public autonomous-driving benchmarks across 15 distinct LiDAR sensors) using pseudo-labels produced by our self-supervised PseudoLabeler, and uses an adapted Point Transformer v3 backbone with dataset-specific normalization disabled.
The released terraseg_s.pth and terraseg_b.pth checkpoints are re-trained with the cleaned, public code release on OmniLiDAR excluding View-of-Delft (license-restricted; cannot be redistributed as part of OmniLiDAR). Performance below is the mean mIoU across the three evaluation splits used in the paper (nuScenes val, SemanticKITTI val, and Waymo Perception val):
| Checkpoint | Params | Throughput (A100) | Mean mIoU (val) |
|---|---|---|---|
| terraseg_s.pth | ~12M | 17 - 50 Hz | 93.43 |
| terraseg_b.pth | ~46M | 10 - 28 Hz | 94.02 |
These results are obtained without any manual annotations during training (self-supervised). For full per-dataset results and ablations, see Tables 3 - 5 of the paper.
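As a reference for how a number like the mean mIoU above can be computed, here is a minimal sketch. It assumes the standard IoU definition TP / (TP + FP + FN), binary ground / non-ground labels, and an unweighted mean over splits. The function names and the toy labels are hypothetical and are not taken from the TerraSeg evaluation scripts.

```python
from statistics import mean

GROUND, NON_GROUND = 0, 1  # Label convention used by TerraSeg.


def class_iou(pred, target, cls):
    """IoU of one class: TP / (TP + FP + FN)."""
    tp = sum(p == cls and t == cls for p, t in zip(pred, target))
    fp = sum(p == cls and t != cls for p, t in zip(pred, target))
    fn = sum(p != cls and t == cls for p, t in zip(pred, target))
    denom = tp + fp + fn
    return tp / denom if denom else float("nan")


def miou(pred, target):
    """Mean IoU over the ground and non-ground classes."""
    return mean(class_iou(pred, target, c) for c in (GROUND, NON_GROUND))


def mean_miou(per_split):
    """Unweighted mean of per-split mIoU (assumption: all splits weigh equally)."""
    return mean(miou(pred, target) for pred, target in per_split)


# Toy labels for illustration only (not real evaluation data).
split_a = ([0, 1], [0, 1])                          # perfect prediction
split_b = ([0, 0, 1, 1, 1, 0], [0, 0, 0, 1, 1, 1])  # two mistakes
print(mean_miou([split_a, split_b]))
```

Whether per-split mIoU is accumulated over all points of a split or averaged per scan is an assumption here; the paper and the evaluation scripts are the authoritative reference.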
Add the library and its PTv3 backbone to your project:
uv add "ptv3 @ git+https://github.com/TedLentsch/TerraSeg.git#subdirectory=ptv3"
uv add "terraseg @ git+https://github.com/TedLentsch/TerraSeg.git#subdirectory=terraseg_lib"
Run inference on a point cloud. The trained weights are downloaded once and cached automatically from Hugging Face:
import torch
from terraseg import TerraSegPredictor
predictor = TerraSegPredictor(
    variant="S", # "S" for Small (~12M) or "B" for Base (~46M)
    checkpoint_path="hf://TedLentsch/TerraSeg/terraseg_s.pth",
)
coord = torch.randn(50_000, 3, device="cuda") # Your (N, 3) point cloud in meters.
labels = predictor.predict(coord=coord) # Shape (N,) with datatype uint8. Labels: 0 = ground, 1 = non-ground.
That is the entire integration. TerraSeg runs in FP32 (sparse-conv stability). The predictor enables TF32 tensor-core matmuls by default (tf32=True) for a small speedup on Ampere+ GPUs; pass tf32=False to reproduce the paper's exact FP32 numerics. It also accepts compile_model=True to wrap the model with torch.compile, though the gain is limited because PTv3's spconv path is opaque to the compiler.
Note: TerraSegPredictor expects the point cloud to be in the TerraSeg-standardized frame (z = 0 approximately ground-aligned, +x forward).
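If your points are still in the raw sensor frame, one way to bring them into such a frame is a rigid transform with known sensor extrinsics. The sketch below is only an illustration in plain Python: the helper names, the 1.8 m mounting height, and the zero pitch are assumptions, not values from TerraSeg.

```python
import math


def pitch_rotation(pitch_rad):
    """3x3 rotation about the y-axis (hypothetical helper for a tilted sensor)."""
    c, s = math.cos(pitch_rad), math.sin(pitch_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def to_ground_aligned(points, rotation, translation):
    """Apply p' = R @ p + t to every (x, y, z) point."""
    out = []
    for x, y, z in points:
        out.append(tuple(
            row[0] * x + row[1] * y + row[2] * z + t
            for row, t in zip(rotation, translation)
        ))
    return out


# Assumed extrinsics: sensor mounted 1.8 m above the ground, no tilt.
sensor_points = [(5.0, 0.0, -1.8), (10.0, 2.0, 0.2)]
aligned = to_ground_aligned(sensor_points, pitch_rotation(0.0), (0.0, 0.0, 1.8))
print(aligned)  # The first point lies on the ground plane (z = 0) after the shift.
```

With torch tensors, the same operation is a single matrix multiply plus an offset applied to the (N, 3) tensor before it is passed to predict.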
The ROS2 node ships with a pixi environment that provides ROS2 Humble and the PyTorch CUDA stack in a single Python interpreter, so rclpy and torch share one interpreter and no system ROS2 or separate virtualenv is needed. Install pixi (curl -fsSL https://pixi.sh/install.sh | bash), then:
git clone https://github.com/TedLentsch/TerraSeg.git && cd TerraSeg
pixi install # One-time: ROS2 Humble + torch stack (large download).
pixi shell # Enter the env (ros2 and colcon are on PATH).
pixi run build # Runs colcon build --packages-select terraseg_ros2 --symlink-install.
source install/setup.bash
ros2 launch terraseg_ros2 terraseg.launch.py
The node subscribes to a sensor_msgs/PointCloud2 topic (default /lidar/points, sensor-data QoS) and publishes a labeled sensor_msgs/PointCloud2 on /terraseg/segmented with an added uint8 label field (0 = ground, 1 = non-ground). By default it transforms each incoming cloud into base_link via TF, so the model receives a ground-aligned cloud regardless of sensor mounting, and it loads weights directly from Hugging Face out of the box. See TerraSeg_ros2/README.md for the topic API, the full configuration reference (target_frame, tf32, compile_model), frame handling, and measured performance.
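To illustrate what a per-point uint8 label field means for a consumer, the sketch below splits a raw point buffer into ground and non-ground points. The layout is an assumption made for this example (little-endian float32 x, y, z followed by a uint8 label, 13 bytes per point, no padding); the real field offsets are defined by the published message, so check its fields list before relying on this.

```python
import struct

# Assumed point layout (hypothetical): float32 x, y, z + uint8 label, little-endian.
POINT_FORMAT = "<fffB"
POINT_STEP = struct.calcsize(POINT_FORMAT)  # 13 bytes per point, no padding.


def split_by_label(data: bytes):
    """Return (ground, non_ground) lists of (x, y, z) tuples from a raw buffer."""
    ground, non_ground = [], []
    for x, y, z, label in struct.iter_unpack(POINT_FORMAT, data):
        (ground if label == 0 else non_ground).append((x, y, z))
    return ground, non_ground


# Two synthetic points: one labeled ground (0), one labeled non-ground (1).
buffer = struct.pack(POINT_FORMAT, 1.0, 2.0, 0.0, 0)
buffer += struct.pack(POINT_FORMAT, 3.0, 4.0, 1.5, 1)
ground, non_ground = split_by_label(buffer)
print(len(ground), len(non_ground))
```

In a real ROS2 node, a helper such as sensor_msgs_py.point_cloud2.read_points would typically take care of offsets and padding instead of a hand-written struct format.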
This repository is organized as a uv workspace plus a sibling ROS2 package:
- terraseg_lib/: The shared TerraSeg library: model definition, BatchNorm → GroupNorm swap, feature engineering, and the TerraSegPredictor class used for single-frame deployment. This is the package you uv add into your own project.
- ptv3/: Vendored Point Transformer v3 backbone, pinned to upstream commit 3229e9b7de1770c8ad17c316f8e349982de509f8. See ptv3/README.md for the one-time vendoring step.
- TerraSeg_scripts/: Training and offline-evaluation scripts for both the TerraSeg-B (~46M params) and TerraSeg-S (~12M params) variants. A single VARIANT constant switches between them. The terraseg_test.py evaluation script accepts checkpoints either by local path or by hf:// URI.
- TerraSeg_ros2/: The ament_python ROS2 package that wraps TerraSegPredictor and exposes TerraSeg as a sensor_msgs/PointCloud2 filter. Built with colcon inside the bundled pixi environment. See TerraSeg_ros2/README.md for the full topic API and configuration.
- OmniLiDAR_scripts/: Dataset converters that aggregate 12 public autonomous-driving datasets into the unified OmniLiDAR format.
- PseudoLabeler_scripts/: The self-supervised PseudoLabeler module that produces ground / non-ground pseudo-labels on every OmniLiDAR scan, plus the ablation studies from the paper.
- utils/: Shared dataset, evaluation, and split utilities.
The released TerraSeg-B and TerraSeg-S checkpoints are hosted on Hugging Face at TedLentsch/TerraSeg as terraseg_b.pth and terraseg_s.pth. Both checkpoints bundle the model weights, the tuned decision threshold, and training metadata.
You almost never need to download these manually: both the Python API (TerraSegPredictor) and the ROS2 node accept either a local filesystem path or a Hugging Face URI of the form hf://<user>/<repo>/<filename>. The file is fetched once and cached locally by huggingface_hub.
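For intuition, resolving such a URI could look roughly like the sketch below. This is not the library's actual implementation: parse_checkpoint_uri and resolve_checkpoint are hypothetical names, and only hf_hub_download is a real huggingface_hub function.

```python
def parse_checkpoint_uri(uri: str):
    """Split 'hf://<user>/<repo>/<filename>' into (repo_id, filename).

    Returns (None, uri) for anything that is not an hf:// URI (a local path).
    """
    prefix = "hf://"
    if not uri.startswith(prefix):
        return None, uri
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected hf://<user>/<repo>/<filename>, got {uri!r}")
    user, repo, filename = parts
    return f"{user}/{repo}", filename


def resolve_checkpoint(uri: str) -> str:
    """Return a local file path, downloading from Hugging Face when needed."""
    repo_id, filename = parse_checkpoint_uri(uri)
    if repo_id is None:
        return filename  # Already a local path.
    # Imported lazily so that local paths work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=repo_id, filename=filename)


print(parse_checkpoint_uri("hf://TedLentsch/TerraSeg/terraseg_s.pth"))
```

hf_hub_download stores the file in the local Hugging Face cache and returns its path, so repeated calls do not download the checkpoint again.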
The weights are released under CC BY-NC-SA 4.0 (non-commercial, share-alike). This reflects the most restrictive terms of the upstream datasets that contributed to OmniLiDAR (most notably the Waymo Open Dataset, nuScenes, SemanticKITTI, Argoverse 2, and MAN TruckScenes). See the Hugging Face model card for the full licensing notice, upstream-dataset attributions, and the Waymo-specific Derivative IP restriction. The source code in this repository is released separately under the Apache License 2.0 (see the License section below).
Researchers and developers who want to retrain TerraSeg, evaluate on the OmniLiDAR validation splits, or run the published ablation studies should refer to the dedicated sub-project READMEs. The high-level flow is:
- Build the unified OmniLiDAR dataset with the converter scripts under OmniLiDAR_scripts/.
- Generate ground / non-ground pseudo-labels on every OmniLiDAR scan with PseudoLabeler_scripts/.
- Train and evaluate TerraSeg-B and TerraSeg-S with the scripts under TerraSeg_scripts/.
Training runs for ~10 epochs on a single GPU and uses the balanced multi-dataset sampler described in Section A.1 of the paper. The released checkpoints in this repository were produced by exactly this flow on OmniLiDAR minus VoD (the only dataset we cannot redistribute). See the model card for the full reproduction notes.
| Hardware | Examples | Status |
|---|---|---|
| Volta CUDA (sm_70) | V100, V100S | ❌ flash-attn requires Ampere+ |
| Turing CUDA (sm_75) | RTX 20xx, T4 | ❌ flash-attn requires Ampere+ |
| Ampere CUDA (sm_80, sm_86) | RTX 30xx, A40, A6000, A100 | ✅ Verified |
| Hopper CUDA (sm_90) | H100, H200 | ✅ Expected to work |
| Blackwell CUDA (sm_100, sm_120) | RTX 50xx, RTX PRO, B100, B200 | ❌ Needs pin override (planned) |
| CPU only | any | ❌ spconv MaskImplicitGemm is CUDA-only |
The default stack (torch 2.4.1 + cu124, spconv-cu124, torch-scatter pt24cu124, flash-attn 2.8.3 cu12torch2.4) is verified on Ampere A100 hardware. A future release will add Blackwell support!
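To check where a given GPU falls in the table above, the compute capability can be mapped to a status as in this sketch. The function names and status strings are hypothetical, and capabilities that the table does not list are reported as such rather than guessed; torch.cuda.get_device_capability is the real PyTorch call that returns the (major, minor) pair.

```python
def terraseg_gpu_status(major: int, minor: int) -> str:
    """Map a CUDA compute capability to the status listed in the table."""
    capability = major * 10 + minor
    if capability < 80:
        return "unsupported: flash-attn requires Ampere+"
    if capability in (80, 86):
        return "verified"
    if capability == 90:
        return "expected to work"
    if capability >= 100:
        return "unsupported: Blackwell needs a pin override (planned)"
    return "not listed in the compatibility table"


def current_gpu_status() -> str:
    """Query the first visible GPU; needs torch, so it is imported lazily."""
    import torch
    if not torch.cuda.is_available():
        return "unsupported: spconv requires CUDA"
    return terraseg_gpu_status(*torch.cuda.get_device_capability())


print(terraseg_gpu_status(8, 0))  # A100 is sm_80.
```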
If TerraSeg is useful to your research, please consider citing our paper:
@inproceedings{lentsch2026terraseg,
title={TerraSeg: Self-Supervised Ground Segmentation for Any LiDAR},
author={Lentsch, Ted and Montiel-Marín, Santiago and Caesar, Holger and Gavrila, Dariu M},
booktitle={Computer Vision and Pattern Recognition (CVPR)},
year={2026}
}
This project is released under the Apache-2.0 License. See LICENSE for details.
This research has been conducted as part of the EVENTS project, which is funded by the European Union, under grant agreement No 101069614. Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or European Commission. Neither the European Union nor the granting authority can be held responsible for them. This work has also been supported by project PID2024-161576OB-I00, funded by Spanish MICIU/AEI/10.13039/501100011033 and co-funded by the European Regional Development Fund (ERDF, “A way of making Europe”).